Articles | Volume 7, issue 6
Research article
22 Nov 2022
Research article |  | 22 Nov 2022

Evaluating the mesoscale spatio-temporal variability in simulated wind speed time series over northern Europe

Graziela Luzia, Andrea N. Hahmann, and Matti Juhani Koivisto

As wind energy increases its share of total electricity generation and its integration into the power system becomes more challenging, accurately representing the spatio-temporal variability in wind data becomes crucial. Wind fluctuations impact power and energy systems, e.g. energy system planning, vulnerability to storm shutdowns, and available voltage stability support. To analyse such fluctuations and their spatio-temporal dependencies, time series of wind speeds at an hourly or higher frequency are needed. We provide a comprehensive evaluation of the global and mesoscale-model-derived wind time series against observations by using a set of metrics that we present as requirements for wind energy integration studies. We also perform a sensitivity analysis to find the best model setup of the Weather Research and Forecasting (WRF) model, focusing on evaluating the wind speed fluctuation metrics. The results show that using higher spatial resolution in the WRF model simulations improves the representation of temporal fluctuations; however, higher-spatial-resolution simulations often lower the correlations of wind time series with measurements. Thus, we recommend finer-spatial-resolution simulations for modelling power ramp or voltage stability studies but ERA5 rather than mesoscale simulations for studies where correlations with measurements are essential. We also show that the nesting strategy is an important consideration, and a smoother transition from the forcing data to the nested domains improves the correlations with measurements. All mesoscale model simulations overestimate the value of the spatial correlations in wind speed as estimated from observations. Still, the spatial correlations and the wind speed distributions are insensitive to the mesoscale model configuration tested in this study. Regarding these two metrics, mesoscale model simulations present more favourable results than ERA5.

1 Introduction

Many wind energy applications use meteorological data derived from atmospheric models; for example, in the production of wind resource atlases (Tammelin et al.2013; Dörenkämper et al.2020; Solbrekke et al.2021), extreme wind atlases (Larsén et al.2012), and wind turbine icing in cold climates (Hämäläinen and Niemelä2017). In general, most studies use time series of wind speed, but different applications require distinct qualities in meteorological data, and, therefore, different evaluations are necessary. For a wind resource atlas, for instance, accurate wind speed distributions are necessary (Dörenkämper et al.2020; Knoop et al.2020); for wind power forecasting, accurate timing is vital (Das et al.2017; Olson et al.2019). Energy and power system modelling, including optimal energy system planning (e.g. Gea-Bermúdez et al.2020; Brown et al.2018; Malvaldi et al.2017), the study of power system ramp rates and vulnerability to storm shutdowns (e.g. Murcia et al.2021), and available voltage stability support (e.g. Souxes et al.2019), requires an accurate representation of temporal dependencies and spatial correlations, as well as accurate wind speed distributions. Time synchronization with measurements (i.e. the high correlation between measured and simulated data) is important when the wind time series are used in conjunction with other data, e.g. electricity load time series (Gea-Bermúdez et al.2020).

Several studies have validated meteorological datasets for modelling weather-dependent wind power generation and its highly fluctuating behaviour. These works use data provided by global atmospheric reanalysis (e.g., Cannon et al.2015; González-Aparicio et al.2017; Gruber et al.2022), mesoscale numerical weather prediction (NWP) models (Murcia Leon et al.2021; Koivisto et al.2021), or both (Jourdier2020; Murcia et al.2022). Because mesoscale NWP models cannot represent the effects of the most detailed microscale processes, extra information, such as the effect of the terrain in the wind speed distribution, can be added by combining (i.e., adjusting) mesoscale with microscale data (e.g., Staffell and Pfenninger2016; Ruiz et al.2019; Murcia et al.2022). Due to their relatively low temporal resolution (usually available from 30 min to 1 h resolution) and intrinsic numerical smoothing, data from mesoscale models cannot include minute- to second-scale variability, and even hourly variability may be too low compared to measurements (Koivisto et al.2020). Therefore, it may be necessary to combine synthetic data through statistical methods (e.g. Hawker et al.2016; Larsén et al.2012; Murcia et al.2021) to represent wind fluctuations at shorter timescales.

Validation studies of time series from existent high-spatial-resolution datasets (of the order of a few kilometres) produced by mesoscale NWP models can be found in the literature for wind power integration studies. Jourdier (2020) compared datasets from several sources, including large-scale and regional downscaled reanalyses, with mesoscale data such as the NWP model AROME and the New European Wind Atlas (NEWA) (Hahmann et al.2020b; Dörenkämper et al.2020) in simulating mean wind speed, power production, and its temporal correlations over France. Murcia et al. (2022) performed a large-scale validation study, comparing the ERA5 reanalysis (Hersbach et al.2020) and two datasets based on the Weather Research and Forecasting (WRF), NEWA, and an ERA-Interim-based (Dee et al.2011) European-level atmospheric reanalysis (Nuño et al.2018) with and without the addition of microscale details provided by the Global Wind Atlas (Badger et al.2015). Regarding wind speed, validations were done using wind measurements over northern Europe on various time-series metrics, such as errors in autocorrelations, spatial correlations, and wind speed distribution. Both studies found that ERA5 is skilled but presents deficiencies in simulating wind speed in areas with more complex terrain if not corrected. Jourdier (2020) found that the higher-resolution regional models (i.e. AROME and COSMO-REA6) also show very good skills and reduced bias, especially in complex topography. Murcia et al. (2022) found that NEWA is significantly better than ERA5 in representing the temporal properties of the wind speed time series at individual locations or wind power plants.

Fewer articles present model development focused on time series for wind integration studies. Draxl and Clifton (2015) discussed that many efforts have been made to model the wind distributions for wind resource assessments with NWP models. Creating datasets for wind integration studies for modelling wind power variability, on the other hand, is a not straightforward step from the NWP outputs. Some studies utilize the mesoscale model to generate time series and validate it for an individual application. For example, Nuño et al. (2018) developed hourly time series of European transcontinental wind and solar photovoltaic generation using the WRF model to dynamically downscale a global reanalysis and analysed regionally aggregated power variability on different timescales, which is relevant for system planning, market studies, and others. Mehrens et al. (2016) assessed the WRF model's ability to simulate coherence, spatial correlation, and power spectra for a large range of distances over the North and Baltic seas. Focusing on local wind power generation variability, Koivisto et al. (2021) validated the WRF model time series aggregated for several European countries. The results of this article show that combining mesoscale and microscale data and the addition of missing power plant technical parameters through machine learning improve the representation of the annual capacity factors and hourly generation distributions for most countries. However, no significant differences are shown for the autocorrelations and spatial correlations. Draxl and Clifton (2015) generated a sub-hourly high-spatial-resolution dataset target for application in wind integration studies over the United States, but besides the high temporal resolution, the work focused on the validations only with daily and seasonal variability, in addition to wind speed distribution. By focusing on several essential aspects of time series, this paper adds to the development of time-series modelling of wind speeds for wind integration studies and evaluation techniques. The model comparison results allow users to select the most appropriate modelling and datasets for different applications.

This work focuses on modelling wind speed time series suitable for power and energy system applications. It adds to the literature by (1) investigating the impact of the interaction between the mesoscale model and its forcing data on the quality of the resultant time series and (2) providing a comprehensive evaluation of the different datasets, with a focus on how well they can represent temporal and spatial correlations over northern Europe. We perform a sensitivity study of the WRF model in multiple configurations, varying the influence of the forcing global reanalysis in the simulations to understand its role and distinguish the model configuration that outperforms various time series aspects. The results are also compared to ERA5 and NEWA mesoscale data. We hypothesize that these modelling aspects, defined by the nesting choice, size, and position of the domains, impact the accuracy of the time series more than the horizontal resolution of the model simulations.

This paper is structured as follows: Sect. 2 describes the simulations, the measured data, and the methodology used to generate and validate the time series; Sect. 3 presents the results of the time series comparisons; Sect. 4 presents the discussion and experiments ranking and Sect. 5, the conclusions and perspectives.

2 Data and methods

2.1 Simulated data

The simulations used in this work were produced by the WRF mesoscale model in two different versions, using the configuration for the model physics and model dynamics as in the New European Wind Atlas (NEWA) (Dörenkämper et al.2020). The analyses are presented for the WRF model version 4.2.1 (Skamarock et al.2019), which was the latest available version when running the experiments. However, the WRF model version 3.8 (Skamarock et al.2008), used for the NEWA production, was utilized as a control simulation for all experiments to check for possible impacts on the results due to the model modifications. The model updates could lead to changes in the simulated wind due to changes in parameterizations that can affect the results. The configuration of both versions was checked to ensure some degree of consistency. Still, tracking the exact combination of modifications in the model version and its possible effects can be difficult.

All simulations use nudging to the forcing reanalysis in the outer domain with time tendencies computed from reanalysis data at 6 h, as outlined in Hahmann et al. (2015). The WRF model has been broadly validated and used for wind resource assessment and sensitivity studies for northern Europe; such as Hahmann et al. (2015, 2020b) and Li et al. (2021) have shown that generally, the simulated wind profile matches the observations despite systematic biases and that the WRF model downscaling adds value to the reanalyses for wind energy applications.

The purpose of the nesting experiments (hereafter named “10 km”, “6 km”, “5 km”, and “3.3 km”) is to vary the influence of the forcing data in the innermost domain by changing the grid arrangement of the simulation or to verify the impact of the grid spacing for similar arrangements. The WRF model domains used in the simulations are presented in Fig. 1 using two nesting approaches: (1) using one nested domain (blue) and (2) using two nested domains (orange). The simulations were configured to have the innermost domains covering the same geographical area. The parent domain (d1) has the same dimensions and position for both arrangements. However, for nested domains, the number of grid points must be proportional to the parent grid ratio (Skamarock et al.2008), which means that even for similar arrangements (i.e. single nest with ratio 1/3 or 1/5), the innermost domains have slightly (a few km) different geographical extension. These differences were neglected for the comparisons, as shown in Fig. 1. The horizontal resolution of the innermost domain is a result of the nesting ratio and the resolution jump used, as can be seen in Table 1. We included one additional experiment to test the impact of different forcing data. The forcing data provide initial and boundary conditions for the simulations. All experiments were forced by the ERA5 reanalysis (Hersbach et al.2020) as in the NEWA configuration, except experiment “10 km_erai”, which was forced by the ERA-Interim reanalysis (Dee et al.2011). All experiments produced with WRF version 3.8 (as in the NEWA production) for checking the consistency among the results were analysed, but only that analogous to the 10 km experiment from version 4.2.1 (named “10 km_v3”) is presented here. This arrangement makes all experiments comparable to the 10 km experiment with only one aspect modified, in the nesting arrangement, resolution jump, forcing data, or model version. The multiple experiments were limited to 1 year of simulations due to computational limitations, and the year 2009 was selected as a function of the observations availability (see Sect. 2.2). We analysed the ERA5 wind speed anomaly at 100 m in 2009 related to the 30-year climatology (see Fig. A1) to ensure that the selected year does not affect the results by being an atypical year in terms of wind speed over the region studied. Compared to the long-term average, 2009 was slightly more windy (around 0–0.25 m s−1) in northern Europe, which is within 1 standard deviation (around 0.2–0.4 m s−1) for all mast locations.

Figure 1Location of the domains used in the WRF model simulations for two configurations: (1) single nest (domains 1 and 2, in blue) and (2) two nests (domains 1, 2, and 3, in orange). The red dots show the wind speed measurements locations. Base map created with Natural Earth.

Table 1Experiment names and the WRF model configuration. The nesting ratio refers to the grid of the relative parent domain, as shown in Fig. 1. Resolution jump is the grid spacing of the outer and inner grids.

Download Print Version | Download XLSX

Time series from NEWA and ERA5 reanalyses were included in the comparisons. All time series (from the simulations and the existent reanalyses) were extracted using a horizontal linear interpolation and a logarithmic vertical interpolation for each measurement location and its respective height. The WRF model simulations and ERA5 and NEWA reanalyses use the two closest levels for the vertical logarithmic interpolation. The NEWA dataset and the WRF model simulations have outputs in several fixed levels ranging from 25 to 250 m. From the ERA5 dataset, two fixed height levels (10 and 100 m) were used, and extrapolation is assumed for sites taller than 100 m. No assumption about the atmospheric stability condition is used, and the same process is applied at every time step throughout the entire year.

2.2 Measured data

Data from 14 met masts over northern Europe (Fig. 1) were processed and filtered using an adapted version of the quality control routine described in Ramon et al. (2020). The filter eliminates suspicious data or sequences of data, including implausible values or extreme variations, freezing, or stuck instrument readings. When necessary, a rough attempt was made to minimize the effect of flow distortion caused by the mast on the wind speed measurements. When wind speed measurements are available from only one boom at one height on the mast, winds originating at ±10 of the boom direction are filtered. At other sites, where wind speeds at one height are measured with more than one boom direction, the wind speed measurements are combined according to the wind direction to minimize flow distortion. Wind speed time series measured at heights ranging from 30 to 140 m and originally at 10 min averaged resolution were aggregated to hourly resolution. To ensure that all measured data have been reported at UTC times (information not always provided by the data source), an inspection of the cross-correlations between measured data and reanalysis (Fig. 2f) has been done for checking suspicious shifted lags. Missing or invalid data identified in the measured time series were marked NaN (not a number) also in the simulated data. The completeness of the series is shown in Table 2. The series covers 1 year of observations, and the year 2009 was chosen for being the period with time series from the maximum number of sites available. Anonymized stations were named according to the location: central North Sea (CNS), southern North Sea (SNS), and western Baltic Sea (WBS). Other information on the type of location (land, forest, coastal, or offshore) and measurement device (met mast or lidar) is presented in Table 2.

DTU Database (2020)CESAR Database (2020)Hasager et al. (2013)Marine Data Exchange (2020)FINO Offshore (2020)FINO Offshore (2020)DTU Database (2020)DTU Database (2020)Ramon et al. (2020)Hasager et al. (2013)Hasager et al. (2013)DTU Database (2020)DTU Database (2020)

Table 2Observational datasets. Type: meteorological masts (M), lidar (L); location: coastal (C), land (L), offshore (S), forest (F). Availability [%] refers to the valid data within time coverage during 2009 after the quality control and minimization of flow distortion.

* Includes a displacement height of 20.5 m based on Dellwik et al. (2006).

Download Print Version | Download XLSX

Figure 2Example of the analysis done for one location (a–f) and all locations (g) comparing only observed data and two reanalyses (for simplicity) during 1 year. (a) Wind speed time series; (b) probability density function (PDF) and Earth mover's distance (EMD); (c) cumulative density function (CDF); (d) autocorrelation function (ACF); (e) observed versus simulated wind speed; (f) cross-correlation function; (g) spatial correlations versus distance for all pairs of locations (dots) and their fitted curve (lines).


2.3 Evaluation metrics

Different qualities in wind speed time series are required for applications in power and energy system studies, as described in Sect. 1. We analyse five aspects of data quality using error metrics defined in Murcia et al. (2022). The comparisons against observations include the six WRF model experiments, NEWA, and ERA5 datasets. The time series evaluation metrics include the following:

  1. Correlation to measurements. Given an observed time series, X(t,i), and a simulated time series, Y(t,i), at time t and location i, the Pearson correlation coefficient ρ(X(t,i),Y(t,i)) is calculated for each simulation with respect to the measured data. This metric is important when the wind speed data (or the resulting generation data) has to be correctly correlated with other time-stamped datasets (e.g. electricity price).

  2. Error in the autocorrelation function. The autocorrelation functions (ACF), ρ(X(t,i),X(t-Δt,i)), and ρ(Y(t,i),Y(t-Δt,i)), at location i, are calculated for the observed and simulated time series, respectively. The error in the metric is computed as ACFXlag=Δt minus ACFYlag=Δt. The ACF metric represents how well the simulated data can represent temporal variability seen in the measured data.

  3. Error in wind speed distribution. This metric quantifies the difference between simulated and observed wind speed distribution. It is computed using the Earth mover's distance (EMD) technique, introduced in Hahmann et al. (2020b). The EMD is defined by the area between two cumulative density functions (CDFs) and, therefore, is always positive. An accurate wind speed distribution is essential for estimating a site's potential annual energy production, with a good fit also at the higher percentiles important for understanding storm shutdown risks. It is acknowledged that because of the shapes of power curves, certain parts of the wind speed distributions matter more than others when considering the wind generation output. However, in this study, the focus was kept on modelling wind speeds, so all wind speed ranges are considered equally in EMD.

  4. Error in the standard deviation of the first difference. The standard deviation of the first-difference time series σ(X(t,i)-X(t-Δt,i)) and σ(Y(t,i)-Y(t-Δt,i)) at location i is calculated for the observed and simulated time series, respectively. The error is computed as the difference between the simulated and observed standard deviations. This metric describes how well the simulated data can represent the 1 h ramps seen in the measured data.

  5. Error in the spatial correlation among measurements. The correlations ρ(X(t,i),X(t,j)) and ρ(Y(t,i),Y(t,j)) are calculated for the observed and simulated time series for all pairs of sites ij. The metric is computed by fitting equation ρij=exp(-dij/L), with dij and L in kilometres, to the correlations and distances (d) between the locations for both the measured and the simulated data and taking the ratio of the characteristic length scales, LY/LX. The smaller the length scale, the faster the correlations decay to zero as distance increases. Modelling spatial correlations well is relevant for system integration studies, as the probability of wind speeds being low or high or ramping up or down simultaneously in multiple locations impacts the aggregate wind generation variability in the system.

Figure 2 illustrates all the metrics used in this work but includes only three datasets for simplicity. Figure 2a shows an example of a time series for Børglum, a short mast at a coastal location in Denmark, with observed (OBS, in black), ERA5 (orange), and NEWA (blue) datasets. Figure 2b shows the wind speed distribution for the three datasets. The CDFs for ERA5 and NEWA, in which the EMD metric is computed with respect to OBS, are illustrated in Fig. 2c. Figure 2d shows the ACF for the first 24 h, although only the ACF at lag = 1 h is used in the comparisons. Figure 2e illustrates the correlations between simulated and OBS. Figure 2f shows the cross-correlation function of simulated data with OBS, which was used for checking the correctness of the timestamps in every observed time series. Lastly, Fig. 2g illustrates the comparison between parameter L computed from the spatial correlations over a distance of all 14 pairs of locations. Each metric described is computed for every location and experiment, as well as the median over different site types (onshore, coastal, and offshore) and the median of all locations for each metric. The final rank among all datasets for each metric is based on the medians of all sites.

3 Results

This section presents the results for each metric described in Sect. 2.3. The tables contain the results for each dataset (rows) and every site (columns). Columns 1 to 4 are inland, 5 to 9 are inland but close to the sea (hereafter named “coastal”), and 10 to 14 are offshore sites. The four last columns present the median over sites “onshore”, “coastal”, “offshore”, and “all” 14 sites, respectively. The colour palette represents the best results in dark purple and the worst results in brown. The rows are sorted by the column median “all”, with the most accurate results on the top and the least accurate results on the bottom of each table.

3.1 Correlation to measurements

In Fig. 3, time series from ERA5 reanalysis have higher correlation with measurements (median all = 0.91), followed by the 5 km experiment (0.89). The 3.3 km experiment is the least correlated among all datasets (0.80). The type of location impacts the correlations. Sites offshore have higher correlations than coastal sites, which have higher correlations than onshore sites. The worst correlations for all simulations compared to observations are Sorø and Lindenberg, both met masts (43 and 98 m tall, respectively) located in forested sites. This reveals the difficulties of mesoscale models in simulating the effects of the forest on the flow dynamics, e.g. due to oversimplification and an unappropriated representation of roughness length, as discussed in Dellwik et al. (2014).

Figure 3Correlation of simulated time series to measurements for the various experiments in Table 1 and reanalyses. The darkest purple colours are the most accurate metrics; the darkest brown is the least accurate. The rows are sorted by the column median all, with the most accurate results on the top and the least accurate results on the bottom of the table.


3.2 Autocorrelation function

The ACF results (Fig. 4) show a clear spatial-resolution impact. NEWA (3 km horizontal grid spacing) and the 3.3 km experiment present the smaller errors in ACF (∼0.031), while experiments with 10 km grid spacing (10 km_erai, 10 km, and 10 km_v3) and ERA5 (∼30 km) present the larger errors (0.047, 0.047, 0.049, and 0.051, respectively). As in the previous metrics, all simulations contain larger errors over forested sites (Sorø and Lindenberg). The ACF is simulated more accurately in offshore sites, followed by coastal and onshore sites.

Figure 4Error in autocorrelation function (ACF) at lag = 1 h of simulated time series to measurements for the various experiments in Table 1 and reanalyses. The darkest purple colours are the best results; the darkest brown is the worst. The rows are sorted by the column median all, with the most accurate results on the top and the least accurate results on the bottom of the table.


3.3 Standard deviation of first difference

As in the ACF analysis, the standard deviation (SD) of the first difference (Fig. 5) is impacted by the grid spacing of the simulations. The 3.3 km experiment and NEWA simulations show smaller errors in this metric (−0.30 for both simulations), while experiments with 10 km grid spacing (10 km_erai, 10 km, and 10 km_v3) and ERA5 time series present the larger errors (−0.54, −0.55, −0.58, and −0.66, respectively). All simulations underestimate the metric (negative values). Unlike all previous metrics, the SD of the first difference is more accurately represented over inland sites (especially for Sorø). The reason for larger errors in coastal and offshore sites can be due to the difficulties of mesoscale models in simulating turbulence over and close to the sea (Floors et al.2018). On the other hand, the displacement height applied in the simulated time series over Sorø can mask the 1 h step changing errors. Originally, the time series interpolated at Sorø height (43 m) overestimates the wind speed above the canopy of the trees at this forested site (Dellwik et al.2014) and, therefore, produces errors in other metrics, such as correlations with measurements and wind speed distribution. We lowered the level of the interpolated time series using the fixed displacement height of −20.5 m without taking into account that displacement height depends, among other things, on the wind speed (Dellwik et al.2006). It is possible that the simulated time series exaggerates the turbulence at the displaced height and alleviates the underestimation in the SD of first difference metrics only for Sorø. For comparison, Lindenberg is a forest site, but we have not applied displacement height, and the errors are consistent with other inland sites. Even when the Sorø results are disregarded from the median onshore, the onshore sites are more accurately simulated with respect to the standard deviation of the first difference than coastal and offshore sites.

Figure 5Error in the standard deviation of first difference (SD of first diff) between the simulated time series and the measurements for the various experiments in Table 1 and reanalyses. The darkest purple colours are the best results; the darkest brown is the worst. The rows are sorted by the column median all, with the most accurate results on the top and the least accurate results on the bottom of the table.


Figure 6Earth mover's distance (EMD; m s−1) between simulations and measurements for the various experiments in Table 1 and reanalyses. The darkest purple colours are the best results; the darkest brown is the worst. The rows are sorted by the column median all, with the most accurate results on the top and the least accurate results on the bottom of the table.


3.4 Wind speed distribution

The analysis of the wind speed distribution (Fig. 6) shows more homogeneous results over all experiments. The 3.3 km simulation has slightly smaller EMD (0.27 m s−1), while its more equivalent simulation, NEWA, presents an intermediate result (0.29 m s−1). Larger EMDs are found for inland sites (especially Sorø and Tystofte); however, the coastal sites DockingShoal and Lillgrund also present large EMD values. There is no clear sequence for the quality of wind speed distribution concerning the type of location, although onshore sites have larger EMD values in all datasets. All datasets underestimate (overestimate) the low (high) wind speed values for inland observations at lower heights (less than 50 m tall). Figure 7 highlights this issue.

Figure 7Probability density function (PDF) and Earth mover's distance (EMD) between simulations and measurements (OBS) for the various experiments in Table 1 and reanalyses and two inland sites measured at lower heights.


3.5 Spatial correlations

The metric to assess the spatial correlations (Fig. 8) is computed as described in Sect. 2.3. All simulations overestimate the correlations for most points, leading to larger parameters L, which agrees with previous studies (e.g. Murcia et al.2022; Mehrens et al.2016). The observed value of L is 410 km, while the ones derived from the simulations vary between 496 km in the NEWA simulations and 541 km in the ERA5 time series. Mehrens et al. (2016) discuss the WRF models inability to resolve wind variability sufficiently at higher frequencies due to the numerical smoothing, resulting in exaggerated correlations. Except for the NEWA time series, all simulations produce similar results despite horizontal grid spacing. The coefficient of determination r-squared (r2) for all fitted curves and the standard error (e) of the estimated parameters L are also shown in Fig. 8.

Figure 8Correlation versus distance for each pair of sites (Table 2) and the fitted curves for the measurements (black) and the various model simulations in Table 1 and reanalyses. The estimated de-correlation length L is also shown for each simulation.


Figure 9Boxplots of the metrics for all stations as a function of the model experiment: (a) correlations (CORR) to measurements, (b) error in the autocorrelation function (ACF), (c) error in the standard deviation (SD) of first difference, and (d) Earth mover's distance (EMD). The model experiments are sorted as a function of their median, from the best to the worst.


4 Impacts from model setup

The boxplots in Fig. 9 represent the ranking among all simulated datasets in the first four presented metrics. The plots show the median (50th percentile), the first quartile (25th percentile), the third quartile (75th percentile), and the maximum and minimum values, as well the outliers. The boxplots are ordered from best (lowest median, left) to worst (highest median, right) in all metrics.

The ranking for correlation to measurement (Fig. 9a) does not indicate a clear impact from spatial resolution. However, ERA5 (coarser resolution) presents a higher correlation with measurements, which can be in part due to spatial smoothness. Also, ERA5 has the advantage of data assimilation, which periodically adjusts and approximates the simulation to the observations. Regarding the nesting arrangement and the way correlations are passed into inner domains, the “5 km” experiment was more similar to the observed time series than its comparable “6 km” experiment. The 5 km experiment uses a nesting ratio of 1/3 (domain 1/domain 2), but setting domain 1 using 15 km and the 6 km experiment uses a ratio of 1/5 and the resolution of domain 1 equal to 30 km (close to the reanalysis resolution). The 10 km experiment (also ratio 1/3 but the resolution of domain 1 set to 30 km) correlates technically the same as the 6 km experiment. As the WRF model freely develops the flow in the inner domains, it loses the correlation to the reanalysis. Double nesting amplifies this effect, such as the 3.3 km experiment ratio 1/3/3). Using a smoother resolution jump, such as 15 to 5 km instead of 30 to 6 km, could be an advantage in keeping the correlations in the inner domain consistent with the driving reanalysis. Further tests are needed to confirm this behaviour. Nevertheless, the ratio 1/3 – 15 to 5 km is double as computationally expensive as the 1/5 – 30 to 6 km or the 1/3 – 30 to 10 km. The comparison to the NEWA time series (1/3/3, 27 to 9 to 3 km) and the 3.3 km experiment (1/3/3, 30 to 10 to 3.3 km) supports this hypothesis. However, the NEWA simulations are not directly comparable to the 3.3 km simulations because NEWA uses different choices of domains size and placement (e.g. domain 1 in NEWA is much larger than in 3.3 km, and domain 3 is longitudinally longer in NEWA, while in the 3.3 km experiment it is larger in latitude).

The boxplot of error in autocorrelation function (Fig. 9b) shows a clear impact from spatial grid spacing. The NEWA and “3.3 km” simulations present smaller errors than the measured time series. The 10 km grid spacing experiments and the ERA5 time series show the most significant errors. Coarser-resolution simulations exaggerate the correlations due to the inherent spatial smoothness of the atmospheric models, which can be seen in the results for all simulations (Fig. 4). The same interpretation can be made from the boxplot for the standard deviation of the first difference (Fig. 9c), although in this metric experiments with very similar results, such as NEWA and the 3.3 km experiment, and the 10 km experiment and 10 km_erai, have inverted its ranking positions.

The EMD boxplot (Fig. 9d) has the least conclusive ranking order among the metrics. There is no apparent influence from the spatial resolution in the wind speed distribution since the ranking alternates finer- and coarser-resolution experiments. Both the 5 km and 6 km present intermediate results and nearly identical values. From these results, there is also no significant indication of an impact on the simulated wind speed distribution, neither positive nor negative, from the complexity of nesting (e.g. single versus two nested domains), the choice of nesting ratio, or the resolution jump.

The fitted spatial correlations (Fig. 8) show a clear distinction between NEWA and the rest of the simulations. The NEWA time series presents the smaller value of parameter L and the closest to the parameter determined for the measured points. A ranking of the simulations can be seen in Fig. 10, showing the NEWA simulations with the smallest simulated Ly over observed Lx, followed by all the other simulations with close overestimated results. As for the EMD, the spatial correlations explain the ranking order, on neither the nesting choice nor the resultant spatial resolution. Part of the spread among the results from a single experiment comes from using various measurement heights. The results found for this metric agree with Murcia et al. (2022) that all simulations overestimate the spatial correlations and that the NEWA time series modelled this aspect of the time series more accurately than the ERA5 dataset.

To check the consistency of the results in different periods, we recalculated all the metrics for the winter (January–March) and summer (July–September) months. The results (not shown) keep a similar ranking order to the annual time series for all metrics except for the EMD and spatial correlations. In any seasonal period considered, the EMD values range from approximately 0.3–0.4 m s−1, but the ranking of the simulations is different (not shown). For the spatial correlations, Fig. 10 shows a different order for each considered period. Almost all simulations shown have higher correlations during winter months than summer months. For all simulations and the observed time series, the de-correlation length L is larger during winter than during summer (not shown). This could be explained by the larger spatial scale of winter versus summer atmospheric processes and their variability. Nevertheless, this result contradicts Solbrekke et al. (2020), although that study only includes correlations versus distances over the northern North Sea and the Norwegian Sea and a limited number of measurement sites.

Figure 10Ratio of simulated and observed characteristic length scales LY/LX, both parameters fitted from curves in Fig. 8, and the standard error of the estimated parameters (bars). Winter refers to months January–March; summer refers to July–September. L>1 means that simulations overestimate the parameter L.


All five WRF model experiments were repeated using the WRF model version 3.8, although only the 10 km_v3 one was included in the plots for comparison with the 10 km (WRF version 4.2.1). The rank is unchanged from that with WRF V4.2.1 (with minor differences) for the correlations to measurements, error in ACF, and the error in SD of the first difference (not shown). However, the ranking order is changed for the EMD and the spatial correlations. Nevertheless, the conclusions for these two metrics do not show a clear impact from grid spacing or model nesting. The direct comparison between the 10 km and 10 km_v3 shows no clear effect from the two WRF model versions, and both experiments present a similar position in the rank for most metrics (Fig. 9).

Lastly, an experiment testing different forcing data (10 km_erai) was included to compare simulations forced by ERA5 versus ERA-Interim reanalyses. For all metrics, the 10 km and 10 km_erai present a close position in the ranking (Fig. 9). The three experiments with 10 km grid spacing are clustered among the ranks due to the same resultant grid spacing and model nesting.

5 Conclusions

To investigate how to improve the mesoscale modelling of wind time series over northern Europe for power and energy system purposes, we performed a sensitivity study to various WRF model setups, including varying nesting configuration (one or two inner domains), nesting ratio (1/3 or 1/5), and resolution of the innermost domain (10, 6, 5, or 3.3 km). Simulations using different model versions and forced by different reanalyses are also explored. Five metrics relevant to wind power integration studies are presented for the time series derived from the WRF model simulations and compared to those from the New European Wind Atlas and the ERA5 reanalysis. We also ranked the time series simulation's metrics to identify significant factors controlling the simulation performance in their generated wind speed time series. Measured data from 14 sites over land, coastal, and offshore locations in northern Europe were used.

We found that the model configuration affects the value of the wind time series correlations with measurements metrics more than the grid spacing. Thus, we recommend ERA5 reanalysis over the mesoscale simulations for studies where the correlations with measurements are essential. However, when producing mesoscale simulations for power and energy system purposes, a smoother resolution jump from outer to inner domains benefits the simulations by keeping them more correlated to the forcing reanalysis. This is especially relevant when the wind speed time series are combined with other series data (e.g. electric load or price time series). Finer-spatial-resolution simulations such as NEWA and the 3.3 km experiment may be best for applications where temporal variability has to be well modelled, such as power ramp analyses or voltage stability studies. For more accurate simulations in terms of wind speed distribution and spatial correlations, NEWA presents more favourable results than ERA5.

The value of the metrics at the considered sites shows more accurate results for offshore and coastal than for inland locations in all metrics, except for the standard deviation of the first difference. Simulated sites in forest landscapes generally have more significant errors, especially when measurements are taken at lower heights (i.e. less than 50 m tall). This could be due to model deficiencies in simulating boundary layer processes near the ground in more complex terrain, in agreement with Hahmann et al. (2020b).

The evaluation of correlations to measurements indicates that strengthening the influence of the forcing from the reanalysis data on the mesoscale model simulation can be achieved by using a smooth transition between the computational domains. Thus, a nest transition from 15 to 5 km (domain 1 to domain 2) is more effective than using 30 to 6 km (considering the forcing data resolution close to 30 km) for maintaining the high correlations from the reanalysis in the inner domain. A comparison between simulations using three domains (30/10/3.3 and 27/9/3 km) confirms this result. However, the NEWA and 3.3 km simulations are not entirely comparable because they differ in the size of the outer domain. A large nudged outer domain appears to be important for improving the correlation with observations in the inner domains. Still, our results do not provide a systematic validation of this hypothesis. The ERA5-derived wind speed time series has the largest correlation to measurements for all sites, and this behaviour is in agreement with Jourdier (2020). From experiences in weather forecasting, it is known that higher resolution does not always produce improved statistics (Mass et al.2002) because the various metrics are sensitive to the smoothness of the time series.

The ranking order in the autocorrelation function and standard deviation of the first difference is a function of decreasing spatial grid spacing rather than the nesting arrangement. This is probably a consequence of the higher frequency of occurrence of convective processes in finer grid spacing domains, as it is discussed in Mass et al. (2002) and Vincent et al. (2013). For the wind speed distribution, the results are inconclusive regarding the impact of the model configuration or the spatial resolution on the quality of the time series. The analysis of the spatial correlations confirmed results from previous papers that all simulations exaggerate the spatial correlations (Murcia et al.2022; Mehrens et al.2016) and that NEWA time series can simulate this aspect more accurately than does the time series derived from the ERA5 reanalysis (Murcia et al.2022). Mass et al. (2002) show that finer horizontal resolution leads to lower correlations due to a higher spatial variability. However, our results for spatial correlations do not find an explanation in the model setup and are sensitive to the period of the year. Simulated time series longer than 1 year are needed to investigate these findings better. Also, this could be because our tested grid spacings are very similar (from 10 to 3.3 km), while in Mass et al. (2002), the simulation resolutions have a larger range from 36 to 4 km.

Due to computational cost, many other details related to the model setup have not been tested. For example, we used the same size and position of the innermost domain for all simulations. Therefore, we did not test the sensitivity of the simulated time series to these aspects. Hahmann et al. (2020a) found that smaller domains in the WRF simulation tend to show smaller wind speed biases but higher root mean square errors (RMSEs) compared to observations. They claim, however, that it was unclear if this resulted from the domain size or the location of the boundaries in relation to the large-scale flow. Further tests, including these two model setup aspects, could point to improvements in modelling time series correlated with measurements since RMSE and correlations are related metrics. Additional numerical experiments on grid spacing could be carried out to clarify the potential impacts of horizontal resolution on the simulated spatial correlations and wind speed distribution. Simulations using a much larger outer domain than the one in Fig. 1 and the same inner domain could explain the different performances between NEWA and 3.3 km (Fig. 10) in representing spatial correlations. Finally, because of the shape of the power curves, a further analysis focusing on the errors (i.e. EMD) in certain parts of the wind speed distribution that contribute the most to energy production could be carried out by assigning higher weights to values between the cut-in and cut-out wind speeds.

Appendix A

Figure A1(a) Long-term mean wind speed and (b) standard deviation of the annual mean wind speed (1990–2020) and (c) wind speed anomaly during 2009 with regards to the long-term mean. All statistics are computed using ERA5 at 100 m a.g.l. (above ground level).

Code availability

The WRF model is an open-source code and can be obtained from (WRF Model User's Page2022). We used WRF versions 3.8 and 4.2.1 (; Skamarock et al.2008, and; Skamarock et al.2019). The code modifications, namelists, and table files we used are available from the NEWA GitHub repository (; Hahmann et al.2020a). The WRF model namelists and geofiles used in the experiments described in this paper are available at the DTU Data webpage (; Luzia2022). The code used in the calculation of EMD metric is available from (Pele and Werman2009).

Data availability

The following projects and organizations have kindly provided the tall mast data used for the evaluations: Cabauw data were supplied by the Cabauw Experimental Site for Atmospheric Research (CESAR Database2020); FINO 1 and 2 were provided by German Federal Maritime and Hydrographic Agency (BSH) (FINO Offshore2020). Børglum, Høvsøre, Lillgrund, Sorø, and Tystofte data were obtained from the Technical University of Denmark (DTU Database2020). CNS1 offshore data were provided by the NorseWind project (Hasager et al.2013). Marine Data Exchange provided DockingShoal data, maintained by the Crown State UK (Marine Data Exchange2020). Lindberg data were provided by the Tall Tower Dataset (Ramon et al.2020).

The WRF model simulations were initialized using ERA5 (; Hersbach et al.2018) and ERA-Interim (Dee et al.2011) reanalyses downloaded from ECMWF and the Copernicus Climate Change Service. The experiments can be reproduced using the information described in the “Code availability” section.

Author contributions

GL, ANH, and MJK worked on the conceptualization, methodology, and writing (review and editing). GL processed the data, ran the model simulations, and wrote the original draft. ANH and MJK supervised the work.

Competing interests

The contact author has declared that none of the authors has any competing interests.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


The authors acknowledge support from the PSfuture project, Denmark (La Cour Fellowship, DTU Wind Energy). The authors would like to thank Oscar Manuel Garcia Santiago for supporting the coding of time series extraction from non-regular grids.

Financial support

This research has been supported by the Danmarks Tekniske Universitet (La Cour Fellowship).

Review statement

This paper was edited by Julie Lundquist and reviewed by two anonymous referees.


Badger, J., Hahmann, A., Larsén, X., Badger, M., Kelly, M., Olsen, B., and Mortensen, N.: The Global Wind Atlas: An EUDP project carried out by DTU Wind Energy, Tech. rep., DTU Wind Energy, (last access: 15 December 2021), 2015. a

Brown, T., Schlachtberger, D., Kies, A., Schramm, S., and Greiner, M.: Synergies of sector coupling and transmission reinforcement in a cost-optimised, highly renewable European energy system, Energy, 160, 720–739,, 2018. a

Cannon, D., Brayshaw, D., Methven, J., Coker, P., and Lenaghan, D.: Using reanalysis data to quantify extreme wind power generation statistics: A 33 year case study in Great Britain, Renew. Energy, 75, 767–778,, 2015. a

CESAR Database: Cabauw Experimental Site for Atmospheric Research,, last access: 15 December 2020. a, b

Das, K., Litong-Palima, M., Maule, P., Altin, M., Hansen, A. D., Sørensen, P. E., and Abildgaard, H.: Adequacy of frequency reserves for high wind power generation, IET Renew. Power Generat., 11, 1286–1294,, 2017. a

Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteorol. Soc., 137, 553–597,, 2011. a, b, c

Dellwik, E., Landberg, L., and Jensen, N. O.: WAsP in the Forest, Wind Energy, 9, 211–218,, 2006. a, b

Dellwik, E., Arnqvist, J., Bergström, H., Mohr, M., Söderberg, S., and Hahmann, A.: Meso-scale modeling of a forested landscape, J. Phys.: Conf. Ser., 524, 012121,, 2014. a, b

Dörenkämper, M., Olsen, B. T., Witha, B., Hahmann, A. N., Davis, N. N., Barcons, J., Ezber, Y., García-Bustamante, E., González-Rouco, J. F., Navarro, J., Sastre-Marugán, M., Sīle, T., Trei, W., Žagar, M., Badger, J., Gottschall, J., Sanz Rodrigo, J., and Mann, J.: The Making of the New European Wind Atlas – Part 2: Production and evaluation, Geosci. Model Dev., 13, 5079–5102,, 2020.  a, b, c, d

Draxl, C. and Clifton, A.: The Wind Integration National Dataset (WIND) Toolkit, Appl. Energy, 22, 355–366,, 2015. a, b

DTU Database: Technical University of Denmark – Online Meteorological Data,, last access: 15 December 2020. a, b, c, d, e, f

FINO Offshore: FINO123 – Research platforms in the North Sea and Baltic Sea,, last access: 15 December 2020. a, b, c

Floors, R. R., Hahmann, A. N., and Pena Diaz, A.: Evaluating Mesoscale Simulations of the Coastal Flow Using Lidar Measurements, J. Geophys. Res.-Atmos., 123, 2718–2736,, 2018. a

Gea-Bermúdez, J., Pade, L.-L., Koivisto, M. J., and Ravn, H.: Optimal generation and transmission development of the North Sea region: Impact of grid architecture and planning horizon, Energy, 191, 116512,, 2020. a, b

González-Aparicio, I., Monforti, F., Volker, P., Zucker, A., Careri, F., Huld, T., and Badger, J.: Simulating European wind power generation applying statistical downscaling to reanalysis data, Appl. Energy, 199, 155–168,, 2017. a

Gruber, K., Regner, P., Wehrle, S., Zeyringer, M., and Schmidt, J.: Towards global validation of wind power simulations: A multi-country assessment of wind power simulation from MERRA-2 and ERA-5 reanalyses bias-corrected with the global wind atlas, Energy, 238, 121520,, 2022. a

Hahmann, A. N., Vincent, C. L., Peña, A., Lange, J., and Hasager, C. B.: Wind climate estimation using WRF model output: method and model sensitivities over the sea, Int. J. Climatol., 35, 3422–3439,, 2015. a, b

Hahmann, A. N., Davis, N. N., Dörenkämper, M., Sile, T., Witha, B., and Trei, W.: WRF configuration files for NEWA mesoscale ensemble and production simulations, Zenodo [code],, 2020a. a, b

Hahmann, A. N., Sīle, T., Witha, B., Davis, N. N., Dörenkämper, M., Ezber, Y., García-Bustamante, E., González-Rouco, J. F., Navarro, J., Olsen, B. T., and Söderberg, S.: The making of the New European Wind Atlas – Part 1: Model sensitivity, Geosci. Model Dev., 13, 5053–5078,, 2020b. a, b, c, d

Hämäläinen, K. and Niemelä, S.: Production of a Numerical Icing Atlas for Finland, Wind Energy, 20, 171–189,, 2017. a

Hasager, C. B., Stein, D., Courtney, M., Peña, A., Mikkelsen, T., Stikland, M., and Oldroyd, A.: Hub height ocean winds over the North Sea observed by the NORSEWInD lidar array: Measuring techniques, quality control and data management, Remote Sens., 5, 4280–4303,, 2013. a, b, c, d

Hawker, G. S., Bukhsh, W. A., Gill, S., and Bell, K. R. W.: Synthesis of wind time series for network adequacy assessment, in: 2016 Power Systems Computation Conference (PSCC), 20–24 June 2016, Genoa, Italy, 1–7,, 2016. a

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., and Thépaut, J.-N.: ERA5 hourly data on single levels from 1959 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set],, 2018. a

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteorol. Soc., 146, 1999–2049,, 2020. a, b

Jourdier, B.: Evaluation of ERA5, MERRA-2, COSMO-REA6, NEWA and AROME to simulate wind power production over France, Adv. Sci. Res., 17, 63–77,, 2020. a, b, c, d

Knoop, S., Ramakrishnan, P., and Wijnant, I.: Dutch Offshore Wind Atlas Validation against Cabauw Meteomast Wind Measurements, Energies, 13, 6558,, 2020. a

Koivisto, M., Jónsdóttir, G. M., Sørensen, P., Plakas, K., and Cutululis, N.: Combination of meteorological reanalysis data and stochastic simulation for modelling wind generation variability, Renew. Energy, 159, 991–999,, 2020. a

Koivisto, M. J., Plakas, K., Hurtado Ellmann, E. R., Davis, N., and Sørensen, P.: Application of microscale wind and detailed wind power plant data in large-scale wind generation simulations, Elect. Power Syst. Res., 190, 106638,, 2021. a, b

Larsén, X. G., Ott, S., Badger, J., Hahmann, A. N., and Mann, J.: Recipes for Correcting the Impact of Effective Mesoscale Resolution on the Estimation of Extreme Winds, J. Appl. Meteorol. Clim., 51, 521–533,, 2012. a, b

Li, H., Claremar, B., Wu, L., Hallgren, C., Körnich, H., Ivanell, S., and Sahlée, E.: A sensitivity study of the WRF model in offshore wind modeling over the Baltic Sea, Geosci. Front., 12, 101229,, 2021. a

Luzia, G.: Supplementary files for “Evaluating the spatio-temporal variability in simulated wind speed time series over northern Europe”, DTU [code],, 2022. a

Malvaldi, A., Weiss, S., Infield, D., Browell, J., Leahy, P., and Foley, A. M.: A spatial and temporal correlation analysis of aggregate wind power in an ideally interconnected Europe, Wind Energy, 20, 1315–1329,, 2017. a

Marine Data Exchange: The Crown State – Marine Data Exchange,, last access: 15 December 2020. a, b

Mass, C. F., Ovens, D., Westrick, K., and Colle, B. A.: Does increasing horizontal resolution produce more skillful forecasts?: The Results of Two Years of Real-Time Numerical Weather Prediction over the Pacific Northwest, B. Am. Meteorol. Soc., 83, 407–430, 2002. a, b, c, d

Mehrens, A. R., Hahmann, A. N., Larsén, X. G., and von Bremen, L.: Correlation and coherence of mesoscale wind speeds over the sea, Q. J. Roy. Meteorol. Soc., 142, 3186–3194,, 2016. a, b, c, d

Murcia, J. P., Koivisto, M. J., Sørensen, P., and Magnant, P.: Power fluctuations in high installation density offshore wind fleets, Wind Energ. Sci., 6, 461–476,, 2021. a, b

Murcia, J. P., Koivisto, M. J., Luzia, G., Olsen, B. T., Hahmann, A. N., Sørensen, P. E., and Als, M.: Validation of European-scale simulated wind speed and wind generation time series, Appl. Energy, 305, 117794,, 2022. a, b, c, d, e, f, g, h, i

Murcia Leon, J. P., Koivisto, M. J., Sørensen, P., and Magnant, P.: Power fluctuations in high-installation-density offshore wind fleets, Wind Energ. Sci., 6, 461–476,, 2021. a

Nuño, E., Maule, P., Hahmann, A., Cutululis, N., Sørensen, P., and Karagali, I.: Simulation of transcontinental wind and solar PV generation time series, Renew. Energy, 118, 425–436,, 2018. a, b

Olson, J. B., Kenyon, J. S., Djalalova, I., Bianco, L., Turner, D. D., Pichugina, Y., Choukulkar, A., Toy, M. D., Brown, J. M., Angevine, W. M., Akish, E., Bao, J. W., Jimenez, P., Kosovic, B., Lundquist, K. A., Draxl, C., Lundquist, J. K., McCaa, J., McCaffrey, K., Lantz, K., Long, C., Wilczak, J., Banta, R., Marquis, M., Redfern, S., Berg, L. K., Shaw, W., and Cline, J.: Improving wind energy forecasting through numerical weather prediction model development, B. Am. Meteorol. Soc., 100, 2201–2220,, 2019. a

Pele, O. and Werman, M.: Fast and robust earth mover's distances, in: 2009 IEEE 12th International Conference on Computer Vision, 460–467, (last access: 25 March 2022), 2009. a

Ramon, J., Lledó, L., Pérez-Zanón, N., Soret, A., and Doblas-Reyes, F. J.: The Tall Tower Dataset: a unique initiative to boost wind energy research, Earth Syst. Sci. Data, 12, 429–439,, 2020. a, b, c

Ruiz, P., Nijs, W., Tarvydas, D., Sgobbi, A., Zucker, A., Pilli, R., Jonsson, R., Camia, A., Thiel, C., Hoyer-Klick, C., Dalla Longa, F., Kober, T., Badger, J., Volker, P., Elbersen, B., Brosowski, A., and Thrän, D.: ENSPRESO – an open, EU-28 wide, transparent and coherent database of wind, solar and biomass energy potentials, Energ. Strat. Rev., 26, 100379,, 2019.  a

Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, D. M., Duda, M. G., Huang, X. Y., Wang, W., and Powers, J. G.: A Description of the Advanced Research WRF Version 3, Tech. rep., National Center for Atmospheric Research [code],, 2008. a, b, c

Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Liu, Z., Berner, J., Wang, W., Powers, J. G., Duda, M. G., Barker, D. M., and Huang, X.-Y.: A Description of the Advanced Research WRF Version 4, Tech. rep., National Center for Atmospheric Research [code],, 2019. a, b

Solbrekke, I. M., Kvamstø, N. G., and Sorteberg, A.: Mitigation of offshore wind power intermittency by interconnection of production sites, Wind Energ. Sci., 5, 1663–1678,, 2020. a

Solbrekke, I. M., Sorteberg, A., and Haakenstad, H.: The 3 km Norwegian reanalysis (NORA3) – a validation of offshore wind resources in the North Sea and the Norwegian Sea, Wind Energ. Sci., 6, 1501–1519,, 2021. a

Souxes, T., Granitsas, I., and Vournas, C.: Effect of stochasticity on voltage stability support provided by wind farms: Application to the Hellenic interconnected system, Elect. Power Syst. Res., 170, 48–56,, 2019. a

Staffell, I. and Pfenninger, S.: Using bias-corrected reanalysis to simulate current and future wind power output, Energy, 114, 1224–1239,, 2016. a

Tammelin, B., Vihma, T., Atlaskin, E., Badger, J., Fortelius, C., Gregow, H., Horttanainen, M., Hyvönen, R., Kilpinen, J., Latikka, J., Ljungberg, K., Mortensen, N. G., Niemelä, S., Ruosteenoja, K., Salonen, K., Suomi, I., and Venäläinen, A.: Production of the Finnish Wind Atlas, Wind Energy, 16, 19–35,, 2013. a

Vincent, C. L., Larsén, X. G., Larsen, S. E., and Sørensen, P.: Cross-Spectra Over the Sea from Observations and Mesoscale Modelling, Bound.-Lay. Meteorol., 146, 297–318,, 2013. a

WRF Model User's Page: Users' page for the Weather Research and Forecasting Model,, last access: 25 March 2022. a

Short summary
This paper presents a comprehensive validation of time series produced by a mesoscale numerical weather model, a global reanalysis, and a wind atlas against observations by using a set of metrics that we present as requirements for wind energy integration studies. We perform a sensitivity analysis on the numerical weather model in multiple configurations, such as related to model grid spacing and nesting arrangements, to define the model setup that outperforms in various time series aspects.
Final-revised paper