To accurately plan and manage wind power plants, not only does the time-varying wind resource at the site of interest need to be assessed but also the uncertainty connected to this estimate. Numerical weather prediction (NWP) models at the mesoscale represent a valuable way to characterize the wind resource offshore, given the challenges connected with measuring hub-height wind speed. The boundary condition and parametric uncertainty associated with modeled wind speed is often estimated by running a model ensemble. However, creating an NWP ensemble of long-term wind resource data over a large region represents a computational challenge. Here, we propose two approaches to temporally extrapolate wind speed boundary condition and parametric uncertainty using a more convenient setup in which a mesoscale ensemble is run over a short-term period (1 year), and only a single model covers the desired long-term period (20 years). We quantify hub-height wind speed boundary condition and parametric uncertainty from the short-term model ensemble as its normalized across-ensemble standard deviation. Then, we develop and apply a gradient-boosting model and an analog ensemble approach to temporally extrapolate such uncertainty to the full 20-year period, for which only a single model run is available. As a test case, we consider offshore wind resource characterization in the California Outer Continental Shelf. Both of the proposed approaches provide accurate estimates of the long-term wind speed boundary condition and parametric uncertainty across the region (

This work was authored in part by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided by the U.S. Department of Energy Office of Energy Efficiency and Renewable Energy Wind Energy Technologies Office, by the Bureau of Ocean Energy Management (BOEM) under agreement no. IAG-19-2123 and by the National Offshore Wind Research and Development Consortium under agreement no. CRD-19-16351. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.

Offshore wind energy keeps increasing its market penetration as an inexpensive and clean source of energy. In some areas of the world, such as the North Sea in Europe, offshore wind represents a well-established source of electricity, with a total installed capacity of about 15 GW and a planned increase of up to 74 GW by 2030

Such extensive growth requires an accurate long-term characterization of the offshore wind resource

Tens of billions of dollars will be invested in the US offshore wind energy industry in the coming years. To minimize the financial risk associated with such major investments, a characterization of the time-varying offshore wind resource is not sufficient on its own: the uncertainty connected to this numerical prediction must also be assessed. A 1 % uncertainty change in the mean wind resource translates to a 1.6 %–1.8 % uncertainty for the long-term wind plant annual energy production

When considering NWP models, the choices of the model setup and inputs have a direct impact on the model wind speed prediction and therefore on its uncertainty.

Here, we consider wind speed characterization in the California OCS, and we propose and compare two innovative techniques for modeled wind speed long-term boundary condition and parametric uncertainty quantification. To do so, we consider a setup that is computationally more affordable, wherein WRF ensembles are only run over a short period (1 year) and are accompanied by a single, long-term (20 years) WRF simulation. First, we use a machine-learning algorithm to temporally extrapolate the WRF-based boundary condition and parametric uncertainty from the ensemble year to the full 20-year period. While machine learning has been successfully applied to various atmospheric (e.g.,

In the remainder of this paper, Sect. 2 describes the experimental setup and our proposed methods to quantify and temporally extrapolate modeled wind speed boundary condition and parametric uncertainty. Section 3 validates the techniques used, compares the mean long-term predictions from the two approaches, and discusses physical insights into the main drivers for offshore wind speed boundary condition and parametric uncertainty. Finally, we conclude and suggest future work in Sect. 4.

We consider a 20-year numerical data set recently developed by the National Renewable Energy Laboratory to provide accurate cost estimates for floating wind in the California OCS (Fig.

Map of the inner domain of the WRF numerical simulations for the California OCS. The current three wind energy lease areas are shown in red.

As described in detail in

Common specification for all of the WRF runs considered in the analysis.

The 16 WRF ensemble members are constructed based on variations in boundary conditions and key WRF model parameters that previous research determined to have a primary impact on modeled wind speed:

Reanalysis forcing product is selected between ERA5, developed by the European Centre for Medium-Range Weather Forecasts (

PBL parameterization is chosen between the Mellor–Yamada–Nakanishi–Niino (MYNN;

Sea surface temperature product is selected between the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) data set produced by the UK Met Office

Land surface model is chosen between the Noah model and the updated Noah multiparameterization model

In our analysis, we use hourly average data (calculated from 5 min WRF raw output), and we quantify the WRF wind speed boundary condition and parametric sensitivity in terms of the across-ensemble standard deviation of the WRF-predicted 100 m wind speed at any hour,
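As a rough illustration of the quantity described above, the following sketch computes a normalized across-ensemble standard deviation for each hour. It assumes that the normalization divides by the across-ensemble mean wind speed at the same hour and that the population form of the standard deviation is used; both choices, as well as the function names and the toy data, are assumptions made for illustration and are not taken from the source.

```python
from statistics import mean, pstdev


def normalized_ensemble_std(member_speeds):
    """Across-ensemble standard deviation of wind speed at one hour,
    divided by the across-ensemble mean (assumed normalization)."""
    ens_mean = mean(member_speeds)
    if ens_mean == 0:
        raise ValueError("ensemble-mean wind speed is zero; ratio undefined")
    return pstdev(member_speeds) / ens_mean


def hourly_uncertainty(ensemble):
    """ensemble[m][t] is the 100 m wind speed of member m at hour t.
    Returns one normalized standard deviation per hour."""
    n_hours = len(ensemble[0])
    return [normalized_ensemble_std([member[t] for member in ensemble])
            for t in range(n_hours)]


# Hypothetical 4-member ensemble over 3 hours (m/s); the study uses 16 members.
toy_ensemble = [
    [8.0, 10.0, 5.0],
    [8.0, 11.0, 5.5],
    [8.0, 9.0, 4.5],
    [8.0, 10.0, 5.0],
]
sigma_norm = hourly_uncertainty(toy_ensemble)
```

In the toy data, the first hour has identical members and therefore zero spread, while the other two hours have the same relative spread despite different mean wind speeds, which is the motivation for normalizing.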

The first approach we use to temporally extrapolate the boundary condition and parametric uncertainty in 100 m modeled wind speed is a machine-learning gradient-boosting model (GBM)

Qualitative illustration of the concept used to temporally extrapolate the 100 m modeled wind speed boundary condition and parametric uncertainty through the proposed machine-learning and analog ensemble approaches. Wind speed uncertainty is directly quantified as its WRF across-ensemble normalized standard deviation (Eq.

The input features we use to feed the GBM are all taken (as hourly averages) from the single WRF setup that is run for the full 20-year period and are

wind speed at 100 m above ground level (a.g.l.)

sine and cosine of wind direction at 100 m a.g.l. (sine and cosine are used to preserve the cyclical nature of this feature; both are needed because each value of sine alone, or cosine alone, is linked to two different values of the cyclical feature)

air temperature at 40 m a.g.l.

wind shear coefficient calculated between 10 and 200 m a.g.l.

inverse of Obukhov length at 2 m a.g.l.

100 m wind speed standard deviation calculated from the preceding 2 h

100 m wind speed standard deviation calculated from the preceding 6 h

sine and cosine

sine and cosine
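Several of the features in the list above are derived quantities. The sketch below shows one plausible way to compute three of them: the sine and cosine encoding of a cyclical variable, a shear coefficient between two heights, and a trailing standard deviation of wind speed. It assumes that the shear coefficient is the power-law exponent, that the data are hourly samples, and that "preceding" excludes the current hour; none of these details is confirmed by the source, and all function names are hypothetical.

```python
import math
from statistics import pstdev


def cyclical(value, period):
    """Encode a cyclical quantity as (sin, cos) so that the two ends of the
    cycle (e.g., 359 and 1 degrees) end up close together."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def shear_exponent(u_low, u_high, z_low=10.0, z_high=200.0):
    """Power-law shear coefficient alpha, from
    u_high / u_low = (z_high / z_low) ** alpha (assumed definition)."""
    return math.log(u_high / u_low) / math.log(z_high / z_low)


def trailing_std(series, t, window):
    """Standard deviation of the `window` samples preceding index t
    (current sample excluded, which is an assumption)."""
    if t < window:
        return None  # not enough history yet
    return pstdev(series[t - window:t])


# Wind directions of 30 and 150 degrees share the same sine but differ in
# cosine, which is why both components are kept.
sin_30, cos_30 = cyclical(30.0, 360.0)
sin_150, cos_150 = cyclical(150.0, 360.0)
```

Keeping both components makes the encoding one-to-one over the full cycle, so a regression or distance-based method does not confuse two directions that share a sine value.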

The learning algorithm is trained using the root-mean-square error (RMSE) as a performance metric to tune the algorithm weights. To avoid overfitting, we implement regularization during the training of the learning algorithm using the hyperparameters and value ranges listed in Table

Hyperparameters considered for the gradient-boosting model.
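To make the gradient-boosting idea concrete, the following is a deliberately minimal sketch written from scratch with one-split regression trees (stumps) and a squared-error loss. It is not the implementation or library used in the study, and the hyperparameters and value ranges of the table are not reproduced here. The only two regularization controls shown, the number of trees and the learning rate (shrinkage), are generic examples, and the toy data are invented.

```python
def fit_stump(X, residuals):
    """Best single-split regression stump (least squares) over all features."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for threshold in values[:-1]:  # keep both sides non-empty
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lmean, rmean)
    if best is None:  # every feature is constant: predict the mean residual
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, lmean, rmean = stump
    return lmean if row[j] <= threshold else rmean


def fit_gbm(X, y, n_estimators, learning_rate):
    """Each stump is fit to the current residuals; shrinkage damps each step."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * predict_stump(stump, row)
                for pi, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def rmse(y_true, y_pred):
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5


# Invented toy data: one feature (wind speed, m/s) and a target uncertainty.
X = [[2.0], [4.0], [6.0], [8.0], [10.0], [12.0]]
y = [0.20, 0.15, 0.10, 0.08, 0.06, 0.05]
model = fit_gbm(X, y, n_estimators=100, learning_rate=0.1)
rmse_base = rmse(y, [sum(y) / len(y)] * len(y))
rmse_fit = rmse(y, [predict_gbm(model, row) for row in X])
```

In practice the error that matters is the one on held-out data; a small learning rate combined with a limited number of shallow trees is one common way to trade training accuracy for better generalization.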

The second approach we use to quantify and extrapolate modeled wind speed boundary condition and parametric uncertainty is an analog ensemble (AnEn). At each site and for each hour (hereafter referred to as the “target hour”), the AnEn considers a set of atmospheric variables, which are consistent between the AnEn and the machine-learning approach, in a 3 h window centered on the considered time stamp. Then, the AnEn looks for analog atmospheric conditions at the considered site using data from the single long-term WRF setup for the year 2017. In more detail, the multivariate atmospheric state within the considered time window is compared with the atmospheric conditions modeled by the long-term WRF setup in all of the 3 h time windows in 2017. For each hour in 2017, the AnEn calculates a similarity metric, formally defined as a multivariate Euclidean distance measure

Once the similarity metric is calculated for all of the hours (

Variability in the RMSE of two weight optimization schemes: site-specific optimization and general domain optimization.

The results from the AnEn approach are sensitive to the predictor weights,

Optimal weights associated with each physical variable in assessing the closeness of the match metric to identify the analogs.
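The analog search described in this section can be sketched as follows. The distance used here is a commonly used weighted form, in which each variable contributes its weight, divided by its standard deviation over the search year, times the Euclidean distance between the two 3 h windows; whether this matches the exact metric of the study is an assumption. The way analogs are combined (a plain mean of the ensemble-based uncertainty at the closest hours), the number of analogs, the weights, and all names and data are likewise hypothetical.

```python
import math
from statistics import pstdev


def window_distance(series, weights, sigmas, t_target, t_candidate, half_window=1):
    """Weighted multivariate distance between two time windows.
    series[name][t] is one predictor; windows span t +/- half_window."""
    total = 0.0
    for name, values in series.items():
        squared = sum((values[t_target + j] - values[t_candidate + j]) ** 2
                      for j in range(-half_window, half_window + 1))
        total += weights[name] / sigmas[name] * math.sqrt(squared)
    return total


def anen_uncertainty(series, weights, train_sigma, t_target, n_analogs, half_window=1):
    """Mean ensemble-based uncertainty over the n_analogs closest hours.
    train_sigma maps each hour of the ensemble year to its uncertainty."""
    train_hours = sorted(train_sigma)
    # Per-variable spread over the search period; fall back to 1.0 if constant.
    sigmas = {name: pstdev([values[t] for t in train_hours]) or 1.0
              for name, values in series.items()}
    ranked = sorted(
        train_hours,
        key=lambda t: window_distance(series, weights, sigmas,
                                      t_target, t, half_window))
    analogs = ranked[:n_analogs]
    return sum(train_sigma[t] for t in analogs) / len(analogs)


# Invented single-run predictors: hours 0-5 play the role of the ensemble year.
series = {
    "wind_speed": [5, 5, 5, 10, 10, 10, 5, 5, 5, 10, 10, 10],
    "temperature": [12, 12, 12, 15, 15, 15, 12, 12, 12, 15, 15, 15],
}
weights = {"wind_speed": 0.7, "temperature": 0.3}  # hypothetical weights
train_sigma = {1: 0.10, 4: 0.05}  # uncertainty known only in the ensemble year

low_wind = anen_uncertainty(series, weights, train_sigma, t_target=7, n_analogs=1)
high_wind = anen_uncertainty(series, weights, train_sigma, t_target=10, n_analogs=1)
```

Dividing by the per-variable standard deviation puts predictors with different units on a comparable scale, so that the weights alone express how much each variable matters when ranking analogs.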

We first assess the accuracy and validity of our proposed approaches for the wind speed boundary condition and parametric uncertainty extrapolation. As an initial validation step, we compare the distributions of the atmospheric variables used as inputs to the machine-learning and AnEn algorithms for 2017 with those found in the full 20-year period. For both approaches to be accurate, it is essential that the considered atmospheric variables in 2017 (i.e., the year on which the models are trained) experience a range of variability representative of the full 20-year period (i.e., the period to which the models are applied).

Distributions of the atmospheric variables considered as inputs to the machine learning and AnEn algorithms from 2017 only and from the full 20-year period for a single site within the Humboldt wind energy lease area. Data are expressed in terms of their probability densities.

By qualitatively comparing the distributions of the seven atmospheric variables at one of the three wind energy lease areas (Fig.

Having verified that the data support the basic assumptions of the proposed approaches, we next test the accuracy of their predictions.

Map of testing bias, cRMSE, and

To do so, at each grid cell we quantify the mean bias, centered or unbiased root-mean-square error (cRMSE), and coefficient of determination,
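The three validation metrics named above can be sketched as follows. The coefficient of determination is written here as one minus the ratio of residual to total sum of squares, which is an assumption about the definition used in the study; the function names and sample values are invented for illustration.

```python
def mean_bias(pred, obs):
    """Mean of (predicted - reference)."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def centered_rmse(pred, obs):
    """RMSE after removing the mean of each series (bias-free error)."""
    p_mean = sum(pred) / len(pred)
    o_mean = sum(obs) / len(obs)
    squared = sum(((p - p_mean) - (o - o_mean)) ** 2 for p, o in zip(pred, obs))
    return (squared / len(obs)) ** 0.5


def r_squared(pred, obs):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    o_mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - o_mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Invented values: extrapolated uncertainty vs. ensemble-based uncertainty.
predicted = [0.11, 0.13, 0.09, 0.12]
reference = [0.09, 0.13, 0.07, 0.12]

bias = mean_bias(predicted, reference)
crmse = centered_rmse(predicted, reference)
r2 = r_squared(predicted, reference)
rmse = (sum((p - o) ** 2 for p, o in zip(predicted, reference)) / len(reference)) ** 0.5
```

The bias and the centered error are complementary: the squared RMSE equals the squared bias plus the squared cRMSE, so reporting both separates systematic offsets from scatter.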

Now that the accuracy of both proposed approaches has been assessed, we can analyze their long-term results. Figure

Median hourly boundary condition and parametric uncertainty for the 100 m wind speed, as derived from the machine-learning approach

A strong agreement between the two approaches clearly emerges, with the AnEn approach predicting slightly lower values, as discussed in the analysis of the mean bias in Fig.

When focusing on offshore wind energy development, additional considerations are needed to understand how modeled wind speed uncertainty varies for the most relevant scenarios for energy production. When segregating data, a long-term record allows for robust assessments of the variability among the considered classifications, which would be much less clear when considering data from a short-term period only. Therefore, the 20-fold increase in the size of the uncertainty data set provided by our proposed approaches is an essential advantage for this type of analysis.

Seasonality has a primary importance for the energy market, especially in a region such as California, with a strong peak in annual demand in summer, which recently led to detrimental rolling blackouts in the region. In this fragile scenario, assessing the uncertainty in the naturally varying long-term wind speed predictions could help assess the value that offshore wind energy can deliver to the California energy market and achieve more accurate planning of the balance between supply and demand. Figure

Seasonal deviation in median hourly normalized uncertainty in 100 m wind speed for winter (December, January, and February) and summer (June, July, and August), as derived from the machine learning approach

For most of the considered domain, we find a larger sensitivity in WRF-predicted wind speed in the winter months, with the GBM showing a slightly larger seasonal deviation than the AnEn approach. At the Morro Bay and Diablo Canyon lease areas, the median winter uncertainty is between 2 % and 8 % larger than the annual median at the same locations. In contrast, the Humboldt lease area shows a near-zero winter deviation, with the machine-learning approach predicting slightly increased winter uncertainty values and AnEn predicting slightly negative ones. We find opposite results when considering the more energy-demanding summer months. Both Morro Bay and Diablo Canyon show a lower boundary condition and parametric uncertainty in summer, with a difference from their annual median values smaller than 4 %, whereas negligible variability is observed at the Humboldt lease area. We note that spring and fall months display intermediate results when compared to summer and winter (figures not shown).
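A seasonal deviation of this kind could be computed as in the sketch below, which assumes that the deviation is the relative difference, in percent, between the seasonal median and the annual median of the hourly normalized uncertainty at one location. The season definitions follow the months given in the figure caption; the function name and the sample values are hypothetical.

```python
from statistics import median

SEASONS = {
    "winter": (12, 1, 2),  # December, January, February
    "summer": (6, 7, 8),   # June, July, August
}


def seasonal_deviation(months, sigma, season_months):
    """Percent deviation of the seasonal median from the annual median.
    months[i] is the calendar month (1-12) of the hourly value sigma[i]."""
    annual = median(sigma)
    seasonal = median([s for m, s in zip(months, sigma) if m in season_months])
    return 100.0 * (seasonal - annual) / annual


# Invented hourly samples at one grid cell (month, normalized uncertainty).
months = [1, 1, 4, 4, 7, 7, 10, 10]
sigma = [0.11, 0.11, 0.10, 0.10, 0.09, 0.09, 0.10, 0.10]

winter_dev = seasonal_deviation(months, sigma, SEASONS["winter"])
summer_dev = seasonal_deviation(months, sigma, SEASONS["summer"])
```

Using medians rather than means keeps the comparison robust to the occasional hours with very large ensemble spread.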

Finally, we quantify the impact of different stability regimes on the long-term wind speed uncertainty. Various approaches to classify atmospheric stability offshore have been proposed and applied (e.g.,

Daily distribution of atmospheric stability at the Humboldt wind energy lease area, as determined from the bulk Richardson number calculated over the lowest 200 m from the 20-year WRF simulation.
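For reference, a bulk Richardson number over a layer can be sketched with the standard textbook form, the buoyancy term divided by the squared wind difference across the layer. The exact variables and the stability thresholds used in the study are not reproduced in this text, so the thresholds below are passed in explicitly and the values in the example call are placeholders chosen only for illustration; all names are hypothetical.

```python
G = 9.81  # gravitational acceleration, m s^-2


def bulk_richardson(theta_v_low, theta_v_high, u_low, v_low, u_high, v_high, dz):
    """Bulk Richardson number across a layer of depth dz (m).
    theta_v_* are virtual potential temperatures (K); u, v are wind
    components (m/s) at the bottom and top of the layer."""
    shear_sq = (u_high - u_low) ** 2 + (v_high - v_low) ** 2
    if shear_sq == 0:
        raise ValueError("no wind difference across the layer; Ri_B undefined")
    theta_mean = 0.5 * (theta_v_low + theta_v_high)
    return (G / theta_mean) * (theta_v_high - theta_v_low) * dz / shear_sq


def classify_stability(ri_b, unstable_below, stable_above):
    """Three-way classification; the thresholds must be supplied by the user."""
    if ri_b < unstable_below:
        return "unstable"
    if ri_b > stable_above:
        return "stable"
    return "near-neutral"


# Placeholder inputs: 2 K warmer aloft, 10 m/s wind difference over 200 m.
ri = bulk_richardson(288.0, 290.0, 0.0, 0.0, 10.0, 0.0, 200.0)
# Placeholder thresholds, NOT the values used in the study.
regime = classify_stability(ri, unstable_below=-0.025, stable_above=0.025)
```

A positive value indicates that potential temperature increases with height (stable stratification), which is the situation expected when the sea surface is colder than the overlying air.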

We see a predominance of near-neutral and stable conditions with a very weak diurnal variability. This is consistent with the sea surface temperature being generally colder than the near-surface air (because of ocean upwelling), which causes a predominantly stable stratification. Similar conditions are found at the other two wind energy lease areas. The maps in Fig.

Deviation in median hourly normalized uncertainty for the 100 m wind speed from the annual median for different atmospheric stability regimes, as derived from the machine-learning approach

The proposed approaches show a remarkable agreement. Neutral conditions show the lowest boundary condition and parametric uncertainty. At the three wind energy lease areas, we find uncertainty values about 2 %–4 % lower than the overall median in near-neutral conditions. On the other hand, the rare unstable cases show the largest uncertainty with deviations up to

As offshore wind energy becomes a widespread source of clean energy worldwide, an accurate, long-term characterization of the offshore wind resource is crucial, not only in terms of its mean value but also of the uncertainty associated with this estimate. In our analysis, we focused on the California Outer Continental Shelf (OCS), where a significant offshore wind energy development is expected in the near future, to propose innovative techniques to temporally extrapolate hub-height wind speed boundary condition and parametric uncertainty from a short-term mesoscale numerical ensemble to a long-term single model run. First, we propose a gradient-boosting model algorithm, in which a regression model is trained over the short-term numerical ensemble to predict its variability and then applied to the long-term single model run. We compare this technique with an analog ensemble (AnEn) approach, wherein the extrapolated uncertainty for each time stamp in the long-term run is calculated by looking for similar atmospheric conditions within the short-term mesoscale numerical model ensemble. Adopting our proposed approaches for uncertainty extrapolation saves significant computational resources because the desired long-term boundary condition and parametric uncertainty information can be derived from a much simpler setup, wherein the computationally expensive numerical ensembles are only run over a short-term period.

We find that both our proposed approaches agree well with the mesoscale model ensemble variability, thus providing a robust representation of the long-term wind speed boundary condition and parametric uncertainty. While AnEn has a slightly larger

Clearly, the magnitude of the boundary condition and parametric uncertainty component that we quantified in our analysis is strictly connected to the (limited) number of choices sampled within the considered model setups. Given this underdispersive behavior of the numerical weather prediction ensembles

Data from the WRF simulations over the California OCS are available at

The supplement related to this article is available online at:

NB and MO envisioned the analysis. MO ran the numerical simulation. NB performed the machine-learning analysis in close consultation with MO. WH performed the AnEn analysis with the guidance of GC and SA. NB wrote the majority of the manuscript with significant contributions and feedback from all coauthors.

The contact author has declared that neither they nor their co-authors have any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was performed using computational resources sponsored by the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy and located at the National Renewable Energy Laboratory. The authors thank Michael Rossol for his help in performing some of the computations using the National Renewable Energy Laboratory's High-Performance Computing Center.

This paper was edited by Joachim Peinke and reviewed by two anonymous referees.