Understanding uncertainties in wind resource assessment associated with the
use of the output from numerical weather prediction (NWP) models is important
for wind energy applications. A better understanding of the sources of error
reduces risk and lowers costs. Here, an intercomparison of the output from 25
NWP models is presented for three sites in northern Europe characterized by
simple terrain. The models are evaluated using a number of statistical
properties relevant to wind energy and verified with observations. On average
the models have small wind speed biases offshore and aloft (

Numerical weather prediction (NWP) models are increasingly being used in wind energy applications, e.g., wind power resource mapping and site assessment, for planning and developing wind farms, power forecasting, electricity scheduling, maintenance of wind farms, and energy trading on electricity markets. In site assessment, NWP models are commonly part of the model chain used to estimate the annual energy production (AEP) and are responsible for a large part of the uncertainty of this estimate.

The extensive use of NWP models, and the vast customization space of each model, means that a strong demand exists for quantification of (a) the overall model uncertainties and (b) the sensitivity of the uncertainties to the choice of subcomponents and parameters. Understanding the sensitivities and uncertainties of the NWP model output can reduce their associated risks and improve decision making. Model users aware of the sensitivity of individual model components will be able to optimize the model setup for specific applications.

In the following, the NWP models will be referred to as mesoscale models,
signifying that they partly resolve atmospheric phenomena in the mesoscale
range, defined as the range of horizontal length scales from about one to
several hundreds of kilometers

A common way to assess NWP model uncertainties is to use an ensemble
approach, where a number of parallel model runs, referred to as ensemble
members, are run with slightly perturbed initial conditions

Mesoscale model uncertainties in wind speed near the ground are particularly
sensitive to some model components, e.g., the choice of planetary boundary
layer (PBL) scheme, the spin up and simulation time, and the grid spacing. In
the last couple of decades these sensitivities have been studied in great
detail.

Several studies have investigated the WRF model sensitivities in regions of
complex terrain.

Sensitivities to the choice of modeling system have also been studied for
wind energy applications.

Community-driven model intercomparison projects provide an opportunity to
study both model uncertainties and sensitivities to model components. In the
last decade, several intercomparison projects have been successfully carried
out based on model output submitted by modelers from the wind energy
community. The Bolund experiment

In this paper, a blind intercomparison of the output from 25 different NWP simulations is presented for three locations in northern Europe. The study is based on model output submitted by the modeling community to an open call for model data for a benchmarking exercise co-organized by the European Wind Energy Association (EWEA, now WindEurope) and the European Energy Research Alliance, Joint Programme Wind Energy (EERA JP WIND). The three chosen sites represent some of the simplest terrains: offshore, inland near the coast, and inland in flat terrain, where the smoothing of the terrain representation is not an issue. The three sites have quality observations from tall meteorological masts with many heights. The main objectives of this study are (1) to highlight and quantify the uncertainties of the models and serve as motivation for future analysis of model uncertainties and (2) to identify model setup decisions that have an impact on the model performance. The models are evaluated using simple metrics relevant to wind energy applications.

The structure of the paper is as follows. In Sect. 2 we present a detailed description of the methodology used, including a description of the three study sites and the models used by the participants. Section 3 presents the intercomparison results, and finally Sect. 4 contains the summary and conclusions of the study.

Site description, including latitude and longitude coordinates,
classification of the site, the height of the mast

Three sites with quality measurements from tall meteorological masts with
different terrain characteristics were chosen for this study: (1) FINO3, an
offshore mast in the North Sea, (2) Høvsøre, a land mast near the
Danish west coast, and (3) Cabauw, a land mast in the Netherlands. The mast
locations are shown in Fig.

Map of northern Europe with the three site locations used in the model intercomparison: (1) FINO3, in the North Sea. (2) Høvsøre, Denmark. (3) Cabauw, the Netherlands.

FINO3

Availability of wind speed and direction observations for

Figure

At FINO3, the wind speed measurements at three of the heights, 50, 70, 90 m,
are a combination of the measurements from three anemometers at three
separate booms 120

EWEA issued an open call for data. The submission procedure consisted of a template spreadsheet and a questionnaire, both available for download from the EWEA website. The participants filled in the spreadsheet with the time series of the required variables at each location and height. The questionnaire asked for details about the setup of the modeling system used. The participants returned the spreadsheet to EWEA, who passed it on to the authors in an anonymized version.

The requested model variables were hourly wind speed and direction, air temperature, and atmospheric stability. The questionnaire asked about the modeling setup, i.e., the model code and version, the surface layer (SL) and PBL schemes, the land surface model (LSM), the grid nest size(s) and spacing(s), the vertical levels, the land use data, the length of the simulation, the spin-up time, and the source of the initial and boundary conditions. The participants were also asked to comment on any additional modifications made to the model, including assimilation, ensemble, or other methods used.

Table

For reference, wind time series from the ERA-Interim reanalysis

This study is based on direct comparison between the observations and model output at collocated positions, as well as intercomparison of the modeled output. The sampling frequency for the study was chosen to be 1 h. For the observations this means hourly mean values; for the mesoscale models the inter-hourly variation is small, so instantaneous values were used. To ensure temporal consistency between observations and modeled output, time stamps with missing observations were removed from the modeled output. Furthermore, to get consistent vertical profiles, only time stamps with data available at all heights of a particular mast were used. The submitted model output was assumed to be quality checked by the submitter, but the authors also checked it for obviously nonphysical or inconsistent behavior and excluded any output that showed it. Between two and four models were excluded at each site, but no model was excluded from all three sites.
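The two filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the data layout (a dictionary per height, keyed by time stamp), the function names, and the sample values are all assumptions made for the example.

```python
import math

def common_valid_times(obs_by_height):
    """Return the sorted time stamps at which every height has a valid observation.

    obs_by_height maps a measurement height to a {time stamp: wind speed} dict
    (hypothetical layout); missing values are given as None or NaN.
    """
    valid_sets = [
        {t for t, v in series.items() if v is not None and not math.isnan(v)}
        for series in obs_by_height.values()
    ]
    return sorted(set.intersection(*valid_sets)) if valid_sets else []

def filter_series(series, times):
    """Keep only the entries of a {time stamp: value} series found in `times`."""
    return {t: series[t] for t in times if t in series}

# Made-up example: two heights, three hourly time stamps.
obs = {
    40: {"2011-01-01T00": 8.1, "2011-01-01T01": None, "2011-01-01T02": 7.4},
    80: {"2011-01-01T00": 9.0, "2011-01-01T01": 9.3, "2011-01-01T02": float("nan")},
}
model_80 = {"2011-01-01T00": 9.4, "2011-01-01T01": 9.1, "2011-01-01T02": 8.8}

times = common_valid_times(obs)
model_80_filtered = filter_series(model_80, times)
```

Only the first time stamp survives here, because each of the other two is missing at one of the heights; the modeled series is then reduced to the same time stamp.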

The emphasis of this study is on the wind speed,

Calculate

Remove models whose mean

Recalculate

Variations in wind speed often scale with the mean wind speed. Thus, to allow
for intercomparison of wind speed variation intensity across vertical levels,
we define the coefficient of variation,
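As a minimal sketch, assuming the conventional definition of the coefficient of variation (standard deviation of the wind speed divided by its mean at a given height), the quantity could be computed as below. The function name and the use of the population standard deviation are assumptions of this example.

```python
import statistics

def coefficient_of_variation(speeds):
    """Standard deviation of the wind speed divided by its mean (dimensionless).

    Uses the population standard deviation; this is an assumption of the sketch.
    """
    mean = statistics.fmean(speeds)
    if mean == 0:
        raise ValueError("mean wind speed is zero; coefficient of variation undefined")
    return statistics.pstdev(speeds) / mean

# Made-up wind speeds in m/s at a single height.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv = coefficient_of_variation(sample)
```

Because the measure is a ratio, multiplying every wind speed by a constant leaves it unchanged, which is what makes it comparable across vertical levels with different mean wind speeds.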

To diagnose the wind shear in the boundary layer, we use the wind shear
exponent,
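For illustration, the shear exponent of a power-law wind profile can be estimated from wind speeds at two heights. The sketch below assumes the standard power law, u(z2)/u(z1) = (z2/z1)^alpha; which heights were used in this study, and whether a two-point estimate or a fit over several heights was applied, is not assumed here.

```python
import math

def shear_exponent(u_lower, u_upper, z_lower, z_upper):
    """Power-law shear exponent alpha from wind speeds at two heights.

    Solves u_upper / u_lower = (z_upper / z_lower) ** alpha for alpha.
    """
    if min(u_lower, u_upper, z_lower, z_upper) <= 0:
        raise ValueError("wind speeds and heights must be positive")
    if z_lower == z_upper:
        raise ValueError("the two heights must differ")
    return math.log(u_upper / u_lower) / math.log(z_upper / z_lower)

# Made-up profile that follows a power law with alpha = 0.14.
u10 = 8.0
u80 = u10 * (80.0 / 10.0) ** 0.14
alpha = shear_exponent(u10, u80, 10.0, 80.0)
```

A value of zero corresponds to a wind speed that does not change with height, and negative values indicate wind speed decreasing with height.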

The RMSE and the normalized RMSE (NRMSE) were used
as error metrics to obtain single-value measures of the error across heights
at a site. The RMSE and NRMSE are defined as
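A minimal sketch of such single-value metrics is given below. It assumes that the error is taken between modeled and observed values at each height and that the normalized version divides each error by the observed value at that height; the normalization actually used in the study may differ, and the function names are hypothetical.

```python
import math

def rmse(modeled, observed):
    """Root-mean-square error between modeled and observed values across heights."""
    if len(modeled) != len(observed) or not modeled:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(modeled, observed)) / len(modeled))

def nrmse(modeled, observed):
    """RMSE of the relative errors, each normalized by the observed value.

    Normalizing by the observation at each height is an assumption of this sketch.
    """
    if len(modeled) != len(observed) or not modeled:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum(((m - o) / o) ** 2 for m, o in zip(modeled, observed)) / len(modeled)
    )

# Made-up mean wind speeds (m/s) at two heights; the model is 10 % too high at both.
observed_profile = [8.0, 10.0]
modeled_profile = [8.8, 11.0]
profile_rmse = rmse(modeled_profile, observed_profile)
profile_nrmse = nrmse(modeled_profile, observed_profile)
```

With this normalization, a model that is 10 % too high at every height gets an NRMSE of 0.1 regardless of how the mean wind speed varies with height.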

Vertical profiles of mean wind speed (

To investigate the errors associated with the use of each model in wind energy applications, we performed a simple wind resource assessment exercise, using both measurements and modeled time series at FINO3.

A typical approach to resource assessment is to run a mesoscale model for a
number of years, followed by a downscaling process where the
wind climate
statistics obtained from the mesoscale model are used as input to a
microscale model

Given the wind climate and the turbine power curve, the expected power output can be calculated for any site. Since the participants in this intercomparison were not requested to submit the model-specific orography and roughness maps near each site, it is not possible to go through the generalization procedure and subsequent downscaling process at the inland sites. However, for the offshore site FINO3 there are no effects of orography, and the differences in roughness between the models can be assumed to be negligible. Therefore, we can use the raw model output at this site to estimate the wind resource predicted by each of the models, without the generalization procedure.
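The step from a wind speed time series to an expected power output can be sketched as follows. The power curve below is a made-up, piecewise-linear curve for illustration only; it is not the Vestas V80 curve, and the function names are hypothetical. Wake losses and the WAsP model are not represented.

```python
from bisect import bisect_right

# Hypothetical power curve as (wind speed in m/s, power in kW) points.
# NOT a real turbine curve: cut-in at 4 m/s, rated 2000 kW, cut-out at 25 m/s.
CURVE = [(4.0, 0.0), (8.0, 600.0), (12.0, 1800.0), (14.0, 2000.0), (25.0, 2000.0)]

def power_kw(u, curve=CURVE):
    """Linearly interpolate the power curve; zero below cut-in and above cut-out."""
    if u < curve[0][0] or u > curve[-1][0]:
        return 0.0
    speeds = [s for s, _ in curve]
    i = bisect_right(speeds, u) - 1
    if i == len(curve) - 1:
        return curve[-1][1]
    (u0, p0), (u1, p1) = curve[i], curve[i + 1]
    return p0 + (p1 - p0) * (u - u0) / (u1 - u0)

def mean_power_kw(speeds, curve=CURVE):
    """Average power over a wind speed time series (one value per time stamp)."""
    return sum(power_kw(u, curve) for u in speeds) / len(speeds)

def relative_error(modeled, observed):
    """Relative error of a modeled quantity with respect to the observed one."""
    return (modeled - observed) / observed

# Made-up hourly wind speeds (m/s) for observations and one model.
observed_speeds = [6.0, 10.0]
modeled_speeds = [6.0, 12.0]
error = relative_error(mean_power_kw(modeled_speeds), mean_power_kw(observed_speeds))
```

The same two functions can be applied to the observed series and to each modeled series, so that the models are compared on the power estimate rather than on wind speed alone.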

We performed the wind resource exercise at 90 m at FINO3, first assuming a single Vestas V80 turbine at the site, and then repeated the exercise for the Horns Rev wind farm, an 80-turbine wind farm located near FINO3. The resource estimations for the wind farm include the simple wake parametrization present in the WAsP model, which was used to estimate the power losses.

The following subsection is dedicated to the general performance of the models and their ability to capture the mean and the distributions of a number of wind-related quantities. As previously stated, the goal is to highlight the weaknesses of the models to encourage further analysis of model sensitivities.

Figure

Wind speed distributions at the three sites (FINO3 at 90 m,
Høvsøre at 80 m, and Cabauw at 80 m) for the observations (black),
the ERA-Interim data set (green), the mesoscale models MM

At Høvsøre (Fig.

Wind direction distributions at the three sites (FINO3 at 90 m,
Høvsøre at 80 m, and Cabauw at 80 m), based on 24 sectors, for the
observations (black), the ERA-Interim data set (green), the mesoscale models

At Cabauw (Fig.

Figure

Figure

Figure

Figure

Distribution of mean absolute error (MAE) for wind speed at the
three sites for five stability classes: unstable (U), near-unstable (NU),
neutral (N), near-stable (NS), and stable (S). See definitions in
Table

Ranges of inverse Obukhov length (

Vertical profiles of the coefficient of variation for wind speed

It is generally acknowledged that non-neutral atmospheric stability
conditions pose one of the greatest challenges for MMs

At all three sites, the smallest deviations between modeled and measured
wind speeds are found when the models perceive the surface layer stability
from unstable (U) to stable (S). The MAE in these cases typically ranges from
10 to 35 %, with just a few models outside of the 3

Figure

Participants in the study in alphabetical order.

Coefficient of variation for wind speed

At Høvsøre,

At Cabauw,

The coastal site Høvsøre and the offshore site FINO3 are used to
investigate whether there is a dependency of the coefficient of variation for
wind speed (shown in Fig.

At FINO3, the coefficient of variation is almost constant with height and
slightly lower for easterly winds than for westerly flow. This is true for
both models and observations. The sample size for easterly winds is smaller,
about half, than for westerly flow. However, both sample sizes are large (

At Høvsøre, the coefficient of variation is larger for westerly than
for easterly winds. Easterly winds show larger coefficients of variation at
10 m than higher up. The reduction of

The dependence on height of

Figure

Frequency of occurrence of the shear exponent (

At Høvsøre and Cabauw, the distributions of

To identify what model setup choices lead to better model performance, the
statistics of each model across all heights are reduced to just two values at
each site: NRMSE for wind speed (NRMSE

Figure

RMSE for wind speed shear exponent (RMSE

Setup description of the 25 model setups ranked by horizontal grid
spacing of the finest grid. The columns are the model name and version
(model), the PBL scheme (PBL), the land surface model (LSM), whether nesting
was used (Nest.), the horizontal grid spacing (

The models were then grouped according to specific model components. Given
the range of setup choices that influence the model performance, large groups
were needed to obtain useful statistics. With this in mind, three setup
options were chosen for analysis: PBL scheme, grid spacing, and simulation
lead time; statistics of NRMSE

Statistics of NRMSE for wind speed (NRMSE

The PBL scheme in an MM ensures an accurate representation of thermodynamic
and kinematic structures of the lower troposphere

To study the influence of the PBL schemes used, the MMs were split into three
groups: YSU, MYJ, and Other. The statistics of NRMSE

At FINO3, the group consisting of models not using either the YSU or MYJ PBL
schemes generally has smaller wind speed errors; even though the group also
contains the model with the largest NRMSE

At Høvsøre, the three groups have very similar mean wind speed
error statistics, with YSU showing only slightly smaller errors. However, for
wind shear exponent, the models in the YSU group have the smallest errors,
both on average and for the median model.

At Cabauw, the YSU group has smaller errors than the other groups for both wind speed and wind shear exponent, but the errors for the median model in the YSU and MYJ groups are quite similar. The single most accurate model is found in the Other group, but that group as a whole has larger errors.
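The grouping behind these comparisons can be sketched as follows: each model is assigned to the YSU, MYJ, or Other group, and the mean and median error are computed per group. The model entries, scheme labels "A" and "B", and error values below are invented for illustration and do not correspond to any participant in the study.

```python
from collections import defaultdict
from statistics import fmean, median

def group_stats(models):
    """Group models by PBL scheme (YSU, MYJ, Other) and summarize their NRMSE.

    `models` is a list of dicts with the hypothetical keys "pbl" and "nrmse".
    """
    groups = defaultdict(list)
    for m in models:
        label = m["pbl"] if m["pbl"] in ("YSU", "MYJ") else "Other"
        groups[label].append(m["nrmse"])
    return {
        label: {"n": len(vals), "mean": fmean(vals), "median": median(vals)}
        for label, vals in groups.items()
    }

# Made-up model list; "A" and "B" stand in for any scheme other than YSU or MYJ.
example_models = [
    {"pbl": "YSU", "nrmse": 0.04},
    {"pbl": "YSU", "nrmse": 0.06},
    {"pbl": "MYJ", "nrmse": 0.05},
    {"pbl": "A", "nrmse": 0.03},
    {"pbl": "B", "nrmse": 0.09},
]
stats = group_stats(example_models)
```

In this invented example the Other group contains the single most accurate model (0.03) while still having the largest mean error, which mirrors the kind of pattern described for Cabauw. Reporting both the mean and the median helps separate the typical model in a group from the influence of a single outlier.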

Statistics of NRMSE for wind speed (NRMSE

A mesoscale model should be able to explicitly resolve smaller and smaller
phenomena as the grid spacing is decreased.

Table

Statistics of NRMSE for wind speed (NRMSE

As the solution in mesoscale models is integrated forward in time, the
uncertainties associated with the errors in the initial conditions increase

Table

At Høvsøre, the short and long groups have similar error statistics for
wind speed, and both measures are lower than those for the medium group. For
RMSE

At Cabauw, the smallest errors for both wind speed and shear exponent are on
average found in the long group, while the median model with the smallest
errors is in the short group. It is worth noting that five of the seven
models in the long group use the YSU PBL scheme, and in Sect.

As described in Sect.

Distribution of errors from the model's output at 90 m at FINO3
for the following errors: (1) the mean wind speed

Figure

The mesoscale models in this study are able to reproduce the observed
mean wind speed profiles and the distributions of wind speed well. At FINO3 and
above 10 m at Høvsøre, the average of the models has a bias of 3 %
or less. The largest mean wind speed biases (7–9 %) are found at the
lowest levels at Høvsøre and Cabauw. Similarly, the MMs were able to
reproduce the relative variations of wind speed well in most cases
(Fig.

For future benchmarking exercises, our study shows that the focus should be on the model representation of surface characteristics, such as orography and land use, and their associated surface roughness. An attempt was made here to include these details, but because only a subset of the participants supplied this information, it was not feasible. Further studies could also benefit from including more land masts with low to moderate complexity, where capturing the surface characteristics is important, but still manageable by mesoscale models.

The impact of choosing specific model subcomponents was studied in some detail. To allow this, the output from the models was reduced to two metrics at each site, one related to the wind speed bias (NRMSE for wind speed) and one related to the shape of the wind speed profile (RMSE for wind speed shear exponent). The models were then separated into large groups according to their model setup for three setup choices: PBL scheme, grid spacing, and simulation lead time. At FINO3, the grouping revealed that the models using the MYJ PBL scheme had smaller wind speed and shear exponent errors than those that use the YSU scheme. At Høvsøre and Cabauw, the opposite was true. However, the differences between the two groups were not significant and the median model from the two groups had similar errors. Grouping the models according to grid spacing showed that the models with 3 km grid spacing or smaller had lower errors than the group with the largest grid spacings. For these sites, no conclusive evidence was found that reducing the grid spacing below 3 km results in smaller errors. For simulation lead time, the median model from the group with short lead times had the smallest errors at all sites, with the exception of the shear exponent error at Høvsøre. However, no significant difference between the mean of the groups was found, which suggests that the PBL scheme and grid spacing may be of greater importance for the performance at these sites. Future studies should include many more runs to provide more robust statistics, which can provide a basis for best-practice guidelines for wind energy applications using NWP models.

Last, we used the observed and modeled time series for a classical wind energy application, the estimation of power production at a hypothetical wind farm at FINO3. The power production, including wake losses, was estimated for both a single turbine and for a wind farm, using a standard power curve. The exercise showed that while there is a large spread in the modeled power density, the spread is reduced when the power is calculated using a power curve. It also showed the importance of accurately estimating the wind direction distribution, since a small deviation in the distribution can induce large changes in the power production because of its sensitivity to the wind farm layout.

The output data from the mesoscale models were submitted to the European Wind Energy Association (EWEA) for the mesoscale benchmarking study under an agreement that ensures that individual participants remain anonymous in the reported results and that the model output is not publicly shared. The measurements from the meteorological masts FINO3, Høvsøre, and Cabauw were provided by the data owners under an agreement not to share the data with any third party.

The authors declare that they have no conflict of interest.

We would like to thank the three anonymous reviewers for constructive
criticism. Their feedback elevated the level of the paper. Funding from the
EU and the Danish Energy Agency through the project EUDP 14-II and ERA-NET Plus
– New European Wind Atlas is greatly appreciated. The authors would also
like to thank the European Wind Energy Association (EWEA) for organizing this
mesoscale benchmarking study, the German Federal Ministry for the
Environment, Nature Conservation, and Nuclear Safety (BMU), and the Project
Management Jülich (PTJ) for sharing the FINO3 mast data. We would also like to thank the Cabauw
Experimental Site for Atmospheric Research (CESAR) for making the
measurements from the Cabauw mast freely available online
(