Research article 05 Nov 2018
Assessing variability of wind speed: comparison and validation of 27 methodologies
 ^{1}National Renewable Energy Laboratory, Golden, CO 80401, USA
 ^{2}Department of Atmospheric and Oceanic Sciences, University of Colorado Boulder, Boulder, CO 80309, USA
Correspondence: Joseph C. Y. Lee (joseph.lee@nrel.gov)
Because wind resources vary from year to year, the intermonthly and interannual variability (IAV) of wind speed is a key component of the overall uncertainty in the wind resource assessment process, thereby creating challenges for wind farm operators and owners. We present a critical assessment of several common approaches for calculating variability by applying each of the methods to the same 37-year monthly wind-speed and energy-production time series to highlight the differences between these methods. We then assess the accuracy of the variability calculations by correlating the wind-speed variability estimates to the variabilities of actual wind farm energy production. We recommend the robust coefficient of variation (RCoV) for systematically estimating variability, and we underscore its advantages as well as the importance of using a statistically robust and resistant method. With normalized spread metrics, including RCoV, high variability of monthly mean wind speeds at a location effectively denotes strong fluctuations of monthly total energy generation, and vice versa. Meanwhile, the wind-speed IAVs computed with annual-mean data fail to adequately represent energy-production IAVs of wind farms. Finally, we find that estimates of energy-generation variability require 10±3 years of monthly mean wind-speed records to achieve a 90 % statistical confidence. This paper also provides guidance on the spatial distribution of wind-speed RCoV.
The P50, a widely used parameter in the wind-energy industry, is an estimate of the threshold of annual energy production of a wind farm that the facility is expected to exceed 50 % of the time (Clifton et al., 2016). The P50 is usually estimated to apply over the lifetime of a wind farm, typically 20 years. To estimate P50 in the wind resource assessment process, a single percentage value is usually assigned to represent the uncertainty for the desired time period at a wind site (Brower, 2012). The interannual variability (IAV) of wind resources, along with site measurements and wind-power-plant performance, is an important component of the overall uncertainty in power production (Clifton et al., 2016; Klink, 2002; Lackner et al., 2008; Pryor et al., 2006). The IAV is also incorporated in the measure–correlate–predict process (Lackner et al., 2008), which usually considers wind measurements spanning less than 2 years.
Analysts and researchers use numerous metrics to quantify wind-speed variability, and the most common one is the standard deviation (σ). For instance, the variability in historical or future wind resources is often represented as the σ of the annual-mean wind speed at a certain location (Brower, 2012). As wind turbine power generation is a function of wind speed, the variability of wind resources has important implications for the resultant long-term energy production. Financially, when the wind resource is projected to fluctuate more from year to year (Hdidouan and Staffell, 2017), the levelized cost of wind energy increases as well.
Because the profitability of wind farms depends on wind variability, past research has explored the implications of interannual and long-term variability in wind energy. Pryor et al. (2009) analyze trends of annual wind speed and IAV, without explicitly quantifying IAV values. Archer and Jacobson (2013) evaluate the seasonal variability of wind-energy capacity factor. Lee et al. (2018) assess the spatial discrepancies between wind-speed variabilities of different temporal scales, from hourly mean to annual-mean data. Bett et al. (2013) use σ and Weibull parameters to assess the wind variability in Europe. Extreme event analysis offers another perspective on variability. For example, Cannon et al. (2015) examine extreme wind-energy generation events via reanalysis data and discuss the associated seasonal variability and IAV qualitatively. Leahy and McKeogh (2013) also quantify the return periods of multiweek wind droughts.
To quantify variability, the normalized σ or the coefficient of variation (CoV), the σ divided by the mean of a time series, is a commonly used tool. Justus et al. (1979) calculate and compare the CoVs of monthly and annual wind speeds at different sites across the United States. Baker et al. (1990) quantify interannual and interseasonal variations of both wind speed and energy production at three locations in the Pacific Northwest. They find that the annual CoVs range from 4 % to 10 %, matching the conclusions from Justus et al. (1979). More recently, Li et al. (2010) calculate hub-height wind-speed variance and σ over 30 years to spatially evaluate seasonal variability and IAV in the Great Lakes region. Bodini et al. (2016) estimate the IAV of wind resources with a modified version of CoV, using observed meteorological data in Canada. As the sample period increases, the IAVs of most sites gradually increase, averaging 5 % to 6 % among the chosen sites (Bodini et al., 2016). Krakauer and Cohan (2017) correlate the CoVs of monthly mean wind speeds with different climate oscillation indices and find the global mean CoV at 8 %. In addition to characterizing wind speed, the metric is also used to evaluate the benefits of grid integration. For example, Rose and Apt (2015) conclude that the interannual CoV of aggregate wind-energy generation in the central United States is 3±0.1 %, much smaller than that of individual wind plants, which varies between 5.4 % and 12 %, ±4.2 %.
Aside from CoV, other metrics representing the spread of data have also been chosen to estimate variability in the literature. For example, the robust coefficient of variation (RCoV) normalizes the median absolute deviation (MAD) with the median. Gunturu and Schlosser (2012) quantify the spatial RCoV of wind-power density in the United States and demonstrate that the regions east of the Rockies, especially the Plains, generally have weaker variability and higher availability of wind resources. The seasonality index, originally used in Walsh and Lawler (1981) for precipitation purposes, is another measure to express variability. The seasonality index is defined as the sum of the absolute deviations of monthly averages from the annual mean, normalized with the annual mean. Chen et al. (2013) use the seasonality index to assess the interannual trend and the variability of wind speed in China, and they relate wind-speed IAVs to climate oscillations.
Alternative variability metrics emphasize the long-term trends via contrasting wind speeds of different periods. The “wind index”, used in Pryor et al. (2006) and Pryor and Barthelmie (2010), is a ratio of wind speeds of a reference period and an analysis period. An entirely different wind index evaluated in Watson et al. (2015) is a ratio of spatially averaged wind speeds during two different periods.
Despite the importance of long-term variability, the wind-energy industry lacks a systematic method to quantify this uncertainty. As various metrics to assess variability exist, a comprehensive comparison of measures is necessary. Therefore, the goal of this study is to evaluate various methods of estimating intermonthly variability and IAV in a reliable way using a long-term, consistent database. Specifically, our objective is to determine an optimal metric or metrics for relating wind-speed variability to energy-production variability. We describe the wind-speed and energy-generation data, the methodology, and the chosen variability metrics in Sect. 2. We evaluate different variability measures via two case studies in Sect. 3. We also contrast the results computed from monthly mean and annual-mean data, and we illustrate the spatial distribution of wind-speed variability in Sect. 3. We then recommend the best practice in using the ideal method in Sect. 4. We focus on the applicability of such metrics for quantifying the variabilities of wind speeds and wind-energy production.
2.1 Wind and energy data
In this study, we use a 37-year time series of monthly mean wind speed and monthly total wind-energy production in the contiguous United States (CONUS). For wind speed, we use hourly horizontal wind components in the National Aeronautics and Space Administration's Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), reanalysis data set (Gelaro et al., 2017; GMAO, 2015) from 1980 to 2016. We use these components to derive the monthly mean wind speed at 80 m above the surface, which represents hub height in this study, via the power law (Eq. 1) and the hypsometric equation (Eq. 2), written here in their standard forms with g denoting the gravitational acceleration:
$$u(z_{2}) = u(z_{1})\left(\frac{z_{2}}{z_{1}}\right)^{\alpha} \quad (1)$$
$$z_{2} - z_{1} = \frac{R_{d}\overline{T}}{g}\ln\left(\frac{p_{1}}{p_{2}}\right) \quad (2)$$
In Eq. (1), u(z_{1}) and u(z_{2}) are the horizontal wind speeds, at heights z_{1} and z_{2}, in which wind speeds are the square root of the sum of squared horizontal wind components, and α is the shear exponent. In Eq. (2), R_{d} is the dry air gas constant, $\overline{T}$ is the average temperature between levels z_{1} and z_{2}, and p_{1} and p_{2} are the atmospheric pressures at z_{1} and z_{2}. In most grid cells, we use the MERRA-2 meteorological output at 10 and 50 m above the surface to calculate α, so as to extrapolate the wind speed at 80 m. In mountainous regions, the heights at 850 or 500 hPa may be closer to 80 than 10 m above the surface; in that case, we use data at the next available level of 850 or 500 hPa to derive the heights of that level and thus to extrapolate the wind speed at 80 m.
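As a minimal sketch of this extrapolation, assuming the standard forms of the power law, u(z_{2}) = u(z_{1})(z_{2}/z_{1})^{α}, and of the hypsometric equation, z_{2} − z_{1} = (R_{d}T̄/g) ln(p_{1}/p_{2}), the steps could be written as follows. The function names, the constants' numerical values, and the sample inputs are illustrative assumptions and are not taken from the study.

```python
import math

R_D = 287.05   # dry-air gas constant in J kg^-1 K^-1 (assumed value)
G = 9.80665    # gravitational acceleration in m s^-2 (assumed value)

def wind_speed(u_comp, v_comp):
    """Horizontal wind speed from the two horizontal wind components."""
    return math.hypot(u_comp, v_comp)

def shear_exponent(u1, u2, z1, z2):
    """Shear exponent alpha obtained by inverting the power law."""
    return math.log(u2 / u1) / math.log(z2 / z1)

def extrapolate_power_law(u_ref, z_ref, z_target, alpha):
    """Power law: u(z_target) = u(z_ref) * (z_target / z_ref) ** alpha."""
    return u_ref * (z_target / z_ref) ** alpha

def hypsometric_thickness(p1, p2, t_mean):
    """Hypsometric equation: z2 - z1 for pressures p1 (lower) and p2 (upper)."""
    return (R_D * t_mean / G) * math.log(p1 / p2)

# Hypothetical hourly values at 10 m and 50 m above the surface.
u10 = wind_speed(3.0, 4.0)           # 5.0 m/s
u50 = 6.5                            # m/s
alpha = shear_exponent(u10, u50, 10.0, 50.0)
u80 = extrapolate_power_law(u50, 50.0, 80.0, alpha)
```

In the mountainous case described above, `hypsometric_thickness` would supply the height of the pressure level that replaces the 10 m level before the same power-law step is applied.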
The horizontal resolution of the MERRA-2 is 0.5^{∘} in latitude (about 56 km) and 0.625^{∘} in longitude (about 53 km). The MERRA-2 reanalysis interpolates the data and the metadata at the exact output latitude and longitude; hence the wind speed, air density, and elevation refer to the grid points with the particular sets of latitude and longitude (Bosilovich et al., 2016). Thus, the longest distance between a wind farm and the closest MERRA-2 grid-cell center is about 39 km.
For energy-production data, we use the net monthly energy production of wind farms in megawatt hours (MWh) from the US Energy Information Administration (EIA) between 2003 and 2016. Each of the wind farms has a unique EIA identification number. After we leave out about 300 wind sites with incomplete or substantially zero production data, a total of 607 wind farms in the CONUS are selected for this analysis. For simplicity, the CONUS in this analysis is defined as the area bounded by 127^{∘} W, 65^{∘} W, 24^{∘} N, and 50^{∘} N, and geographically includes the 48 states in CONUS and Washington, D.C. (Fig. 1).
2.2 Methodology
2.2.1 Linear regression and data postprocessing
We focus on the direct relationship between wind speed and energy production to investigate approaches for calculating long-term variability. Therefore, we must minimize the influence from other determinants of energy production, such as curtailment and maintenance. First, we eliminate data with zero values for monthly energy production, which is typical in the first months of a new wind farm. Next, we linearly regress the monthly total energy production on the monthly mean MERRA-2 80 m wind speed at the closest grid point to each wind farm from 2003 to 2016. In other words, each wind site is assigned its own regression equation. We then remove any production data below the 90 % prediction interval to exclude underproduction for reasons other than low wind speeds, and omit the data above the 99 % prediction interval, or potentially erroneous overproduction. Prediction intervals are calculated via the t values and the standard error of prediction (Montgomery and Runger, 2014). In other words, we define the outliers of energy production as data more than 1.64 standard errors below or more than 2.58 standard errors above the site-specific regression. We also apply a third-order polynomial fit (Archer and Jacobson, 2013), and it leads to very similar results to the linear model. Hence, we focus on presenting the results from the linear fit in this study.
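The filtering step above can be sketched as follows. This is only an illustration under stated assumptions: it uses the textbook standard error of prediction for simple linear regression, it fixes the multipliers at the 1.64 and 2.58 quoted in the text instead of computing exact t values, and all function names and sample data are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def prediction_standard_errors(x, y, slope, intercept):
    """Standard error of prediction at each x (simple linear regression)."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    return [s * math.sqrt(1 + 1 / n + (xi - x_bar) ** 2 / sxx) for xi in x]

def filter_production(speed, energy, k_low=1.64, k_high=2.58):
    """Drop zero-production months, then drop months outside the interval."""
    pairs = [(s, e) for s, e in zip(speed, energy) if e > 0]
    x = [p[0] for p in pairs]
    y = [p[1] for p in pairs]
    slope, intercept = fit_line(x, y)
    errors = prediction_standard_errors(x, y, slope, intercept)
    kept = []
    for xi, yi, se in zip(x, y, errors):
        fitted = intercept + slope * xi
        if fitted - k_low * se <= yi <= fitted + k_high * se:
            kept.append((xi, yi))
    return kept

# Hypothetical site: 21 months, one zero month and one strong underproduction.
speed = [5.0, 6.0, 7.0, 8.0, 9.0] * 4 + [6.0]
energy = [1000.0 * s + (20.0 if i % 2 == 0 else -20.0)
          for i, s in enumerate(speed)]
energy[7] = 4000.0   # underproduction unrelated to wind speed
energy[20] = 0.0     # zero-production month
kept = filter_production(speed, energy)
```

The asymmetric bounds reflect the stated intent: underproduction is removed more aggressively than overproduction.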
After regressing the outlier-free energy data on wind speed, we then filter the wind farms based on the coefficient of determination (R^{2}), which indicates how well the linear regression fits the data. We select the R^{2} threshold of 0.75: 349 of the original 607 wind farms pass this filter. Through this filter, we ensure that wind speed is the primary driver of energy production in the wind farms with high R^{2} values. Lunacek et al. (2018) also use a similar R^{2}-filtering method with a threshold of 0.7. Considering some farms lack years of complete generation data, we extend the monthly energy production to 37 years using the same site-specific linear models with the monthly MERRA-2 wind speed. In other words, we compute any missing energy-production data from 1980 to 2016 based on the linear fit from the years that do exist in the data set. Herein, we refer to this long-term extension of data as the predicted energy production. Of the 349 wind farms, 7.5 years is the median of the energy data that are derived via the linear fit, given the available EIA records between 2003 and 2016.
We then apply a second filter using the Pearson's correlation coefficient (r) between the predicted and actual monthly energy production, and we only choose the 195 wind farms with r larger than 0.8. As a result, for the r-filtered wind sites, we ensure wind speed is the primary driver of wind-power production, and we confirm the energy predictions match well with those observed.
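A compact sketch of the two filters and of the long-term extension is given below. It assumes that the slope and intercept of a site-specific fit are already available, that months are indexed from 0, and that observed months are kept as they are while missing months take the fitted value. The helper names and the sample numbers are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def passes_filters(r2, r, r2_min=0.75, r_min=0.8):
    """First filter on R^2, second filter on predicted-vs-actual r."""
    return r2 >= r2_min and r > r_min

def extend_production(speed_all, observed, slope, intercept):
    """Fill months without observations using the site-specific linear fit.

    observed maps a month index to the observed energy of that month.
    """
    return [observed.get(i, intercept + slope * s)
            for i, s in enumerate(speed_all)]

# Hypothetical four-month example with one month lacking observations.
speed_all = [5.0, 6.0, 7.0, 8.0]
observed = {1: 6100.0, 2: 6900.0, 3: 8050.0}
extended = extend_production(speed_all, observed, slope=1000.0, intercept=0.0)
```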
The nonfiltered, R^{2}-filtered, and r-filtered wind farms cover most of the popular wind farm regions across the CONUS (Fig. 1), even with the high r threshold of 0.8. Thus, the r-filtered samples provide a sufficient representation of the wind farms across the United States. To illustrate our analysis with examples, we select one site in Oregon (OR) and another site in Texas (TX) that demonstrate distinct wind-speed distributions. We choose the two sites to contrast the results of different variability metrics throughout the paper; both sites pass the r filter (Fig. 1).
Recognizing that the horizontal resolution of the MERRA-2 data could be perceived as undermining the linear regressions, we explore any possible role of the distance between the closest MERRA-2 grid point and the actual wind farm, but we find no statistical relationship. In particular, horizontal and vertical discrepancies between the model and the observations do not affect the resultant R^{2} in the linear regressions. More than half of the 607 wind farms pass the R^{2} filter, and more than half of those pass the r filter (Fig. 2a). Additionally, the correlation between R^{2} and the horizontal distance between the closest MERRA-2 grid point and the actual wind farm is close to zero (Fig. 2b); the correlation between R^{2} and the vertical difference between the modeled grid point and the actual wind site is also weak (Fig. 2c). In other words, the horizontal and vertical distances between the MERRA-2 grid points and the wind farms have no apparent impact on the representativeness of the wind farms in the linear regression.
Additionally, we analyze the uncertainty of the linear-regression method. We first test the influence of the error term in the regression, to account for the uncertainty associated with the input data. After a wind farm passes the R^{2} threshold of 0.75, we add a random value within 1 standard error to the predicted energy production of each month. This random error term introduces uncertainty to the regression process but does not affect the R^{2} of the site-specific regression. Furthermore, we also test the sensitivity of the R^{2} and r thresholds by analyzing the results after modifying those limits. Specifically, we loosen the R^{2} and r thresholds to 0.6 and 0.7, and we tighten the R^{2} and r thresholds to 0.85 and 0.9. Loosening these thresholds increases the sample sizes of the wind farms that pass the filters, and tightening the thresholds results in the opposite.
We test other factors that could undermine these regressions. We consider the hub-height air density extrapolated from MERRA-2 as another regressor in the regressions, but air density is a statistically insignificant predictor and thus is not discussed in the rest of this study. When we replace the prediction interval with the confidence interval, the sample sizes increase from 349 and 195 sites to 555 and 209 wind farms. However, at least 7 years of energy data are derived from the regression for 99 % of the samples, because confidence intervals are smaller than prediction intervals by definition. We also consider removing the long-term means and the impacts of annual cycles, yet the sample sizes decrease to 121 and 69 locations, and the regression fills at least some of the energy data for more than 99 % of the sites. Finally, to ensure these results are not specific to the MERRA-2 data set, we perform the same analysis on the ERA-Interim reanalysis data set (Dee et al., 2011). The results of the key variability parameters such as σ, CoV, and RCoV resemble the findings using MERRA-2; hence we focus on the MERRA-2 findings in this study.
Our analysis, although comprehensive, is constrained by the quality of our data. On the one hand, reanalysis data sets have errors and biases in wind-speed predictions from complexities in elevation and surface roughness (Rose and Apt, 2016). Reanalysis data sets also demonstrate long-term trends of surface wind speeds (Torralba et al., 2017). The MERRA-2 data set can also depict different meteorological environments than those at the wind farm locations, especially in complex terrain. The MERRA-2 data of coarse temporal and spatial resolutions may also represent a lower intermonthly variability or IAV than the wind sites actually experience. Thus, regressing actual energy production on reanalysis wind speed adds uncertainty to our analysis. On the other hand, constrained by the monthly total energy-production data from the EIA, our analysis ignores the signals finer than monthly cycles. The quality of the EIA data also varies across wind sites; therefore the filtering process via linear regression is necessary.
2.2.2 Variability metrics relating wind speeds and energy production
To evaluate the variabilities of both the wind speeds and the predicted energy generation from the filtered wind farms, we investigate a total of 27 combinations and variations of existing methods describing the spread of data. We categorize different variability metrics according to statistical robustness (insensitivity to assumptions about the data; for example, Gaussian distribution) and statistical resistance (insensitivity to outliers) (Wilks, 2011). Of the 27 variability methods tested, we select four representative measures to perform a comparison and discuss in detail, according to their robustness, resistance, and the nature of normalization by an average metric:

RCoV, defined as the MAD divided by the median (Gunturu and Schlosser, 2012; Watson, 2014), is a spread metric divided by an average metric and is both statistically robust and resistant.

Range (maximum minus minimum) divided by trimean (weighted average among quartiles) is a spread metric normalized by an average metric, and the numerator is not resistant.

CoV (Baker et al., 1990; Bodini et al., 2016; Hdidouan and Staffell, 2017; Krakauer and Cohan, 2017; Rose and Apt, 2015; Wan, 2004), defined as the σ divided by the mean, is a spread metric normalized by an average metric, and neither the denominator nor the numerator is robust or resistant.

σ is simply a spread metric that is not robust or resistant.
Among the four measures, only RCoV is completely statistically robust and resistant, and the first three methods are all normalized spread metrics. We further describe all the tested variability methods comprehensively in Table B1 in Appendix B. Each of these metrics is easy to implement via basic Python packages such as NumPy and SciPy with no more than a few lines of code. In addition, based on the exponential scaling relationship between power and wind speed developed by Bandi and Apt (2016), we also analyze the results from the exponential CoV and the exponential RCoV in this paper (Table B1).
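To make the four representative measures concrete, the sketch below implements them with the Python standard library only. It assumes that MAD means the median of the absolute deviations from the median, that quartiles follow the linear-interpolation ("inclusive") convention, that the trimean is (Q1 + 2 Q2 + Q3)/4, and that σ is the sample standard deviation; the paper's exact conventions are listed in its Table B1 and may differ. The function names are hypothetical.

```python
import statistics

def mad(x):
    """Median absolute deviation from the median."""
    med = statistics.median(x)
    return statistics.median([abs(v - med) for v in x])

def rcov(x):
    """Robust coefficient of variation: MAD divided by the median."""
    return mad(x) / statistics.median(x)

def trimean(x):
    """Weighted average of the quartiles: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = statistics.quantiles(x, n=4, method="inclusive")
    return (q1 + 2.0 * q2 + q3) / 4.0

def range_over_trimean(x):
    """Range (maximum minus minimum) divided by the trimean."""
    return (max(x) - min(x)) / trimean(x)

def cov(x):
    """Coefficient of variation: sample standard deviation over the mean."""
    return statistics.stdev(x) / statistics.mean(x)

def sigma(x):
    """Sample standard deviation, not normalized."""
    return statistics.stdev(x)

# Hypothetical monthly mean wind speeds (m/s), with and without an outlier.
clean = [6.0, 6.5, 7.0, 7.5, 8.0, 7.2, 6.8]
with_outlier = clean + [20.0]
```

Comparing `rcov(clean)` with `rcov(with_outlier)`, and `cov(clean)` with `cov(with_outlier)`, illustrates resistance: a single extreme month shifts the CoV far more than the RCoV.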
In addition to calculating variabilities with the spread measures, we evaluate other diagnostics that describe distribution characteristics. These diagnostics include averaging metrics, such as the arithmetic mean (not resistant) and median (the 50th percentile, which is resistant); symmetry metrics, such as skewness (involving the third moment, not robust or resistant) and the Yule–Kendall Index (YKI, robust and resistant); a tailedness metric, namely kurtosis (involving the fourth moment, not robust or resistant); the Weibull scale and shape parameters (not robust); and the autocorrelation with a 1-year lag to dissect the interannual cycles. We summarize the diagnostics evaluated in this analysis in Table B2. Along with the regression results, results from the four representative variability metrics and other distribution diagnostics demonstrate differences between the two selected sites (Table 2).
Herein, we quantify the variabilities of the 37-year extended time series of wind speed and energy production via different methods, using a range of time frames: 1 year, 2 years, and up to 37 years for each wind farm. A metric is considered useful when the resultant wind-speed variability correlates well with the resultant energy-production variability across wind farms, even when random errors are added and the thresholds R^{2} and r are changed. In this analysis, we compare results with three correlation metrics: Pearson's r, Spearman's rank correlation coefficient (r_{s}), and Kendall's rank correlation coefficient (τ) (Table 1).
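For reference, the three correlation coefficients can be sketched without external dependencies as below. The tie handling (average ranks for Spearman, the tau-b variant for Kendall) is an assumption, since the text does not state which variants are used; SciPy offers equivalent routines as `scipy.stats.pearsonr`, `scipy.stats.spearmanr`, and `scipy.stats.kendalltau`.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson's r: linear association between two sequences."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's r_s: Pearson's r applied to the ranks."""
    return pearson(average_ranks(a), average_ranks(b))

def kendall_tau_b(a, b):
    """Kendall's tau-b from concordant and discordant pairs (O(n^2))."""
    concordant = discordant = tied_a_only = tied_b_only = 0
    for i, j in combinations(range(len(a)), 2):
        da, db = a[i] - a[j], b[i] - b[j]
        if da == 0 and db == 0:
            continue                      # tied in both: excluded
        if da == 0:
            tied_a_only += 1
        elif db == 0:
            tied_b_only += 1
        elif da * db > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = math.sqrt((pairs + tied_a_only) * (pairs + tied_b_only))
    return (concordant - discordant) / denom
```

The rank-based coefficients reach 1 for any strictly increasing relationship, whereas Pearson's r reaches 1 only for a linear one, which is why agreement among the three is informative.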
To assess the applicable time frames of various variability metrics, we evaluate the asymptote period of correlations for each method. In most cases, the correlation coefficients approach the 37-year value after a certain analysis time frame. Using RCoV as an example, the Pearson's r's of shorter analysis periods (1-year, 2-year, etc.) gradually converge to the 37-year value at 0.856 as the RCoV-calculation time frame expands (Fig. 5a). Hence, for each metric, assuming the 37-year correlation coefficient represents the long-term correlation, we calculate the normalized differences between the correlation coefficients and the 37-year value in each time frame, starting from 1 year. When the absolute mean of the normalized differences drops below 0.05 in a particular year, we determine that year as the length of data required for reliable results via that variability method. In other words, the asymptote year of a certain metric illustrates that the error of the resultant correlation between wind-speed and energy-production variability via that data length is less than 5 % from the long-term value. For example, the asymptote period of RCoV correlations is 3 years according to Pearson's r (Table 3).
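One possible reading of this procedure is sketched below. It assumes that each time frame contributes a list of correlation coefficients (for example, one per window of that length), that the longest time frame provides the long-term reference, and that the first time frame meeting the 5 % criterion is reported. The function name, the data layout, and the sample correlations are hypothetical.

```python
import statistics

def asymptote_period(corr_by_years, threshold=0.05):
    """Return the shortest time frame whose mean normalized difference
    from the long-term correlation is below the threshold in magnitude.

    corr_by_years maps a time frame in years to the list of correlation
    coefficients computed with that many years of data.
    """
    longest = max(corr_by_years)
    reference = statistics.mean(corr_by_years[longest])
    for years in sorted(corr_by_years):
        normalized = [(r - reference) / reference
                      for r in corr_by_years[years]]
        if abs(statistics.mean(normalized)) < threshold:
            return years
    return longest

# Made-up correlations converging towards a 37-year value of 0.856.
example = {
    1: [0.60, 0.70],
    2: [0.78, 0.80],
    3: [0.84, 0.85],
    37: [0.856],
}
period = asymptote_period(example)
```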
To relate the IAVs between wind speed and energy production, we also perform the same analysis for annual-mean data. Strictly speaking, calculating the variabilities using monthly mean data yields intermonthly variabilities, because the results account for monthly, seasonal, and annual signals. To isolate the signals from interannual variations, we also examine the metrics and their correlations between the annual means of hub-height wind speeds and energy production, after linear regressing and filtering via monthly data. However, the samples from each site are then limited to 37 data points of annual wind speed and energy production. Besides, selecting detrended data from long-term means to calculate variabilities and their correlations leads to trivial results because of the small sample sizes and hence is omitted in this study.
2.2.3 Investigation of wind-speed RCoV
After we demonstrate that RCoV is the most systematic approach in linking wind-speed and energy-generation variabilities in Sect. 3.2, we further examine the details of using RCoV, specifically determining the minimum length of wind-speed data necessary to quantify variability effectively. We use 37 years of wind speed in every MERRA-2 grid cell in the CONUS (a total of 5049 grid points), and we calculate the RCoVs with 1 to 37 years of data for each grid cell. Because the RCoVs calculated using data between 1980 and 2016 are only samples of the true long-term wind-speed variability and hence the results involve uncertainty, we select a confidence interval approach.
We assume that the distribution of RCoV is Gaussian with infinite years of wind speed. Hence, we use a chi-square (χ^{2}) distribution to set bounds for the σ's from samples of RCoV. In other words, because the derived RCoVs differ with the years of wind speeds sampled, we use the χ^{2} distribution to quantify the confidence intervals of RCoV for each sample size. To determine the minimum data required for RCoV calculation, we use the following criterion (Montgomery and Runger, 2014):
where σ_{37} is the predetermined 37-year σ of RCoV; n_{i} is the sample size of n years in year i, which is between 1 and 36 years; $\sigma_{i}^{2}$ is the variance of the sample of RCoVs in year i; and $\chi^{2}_{\alpha/2,\,n_{i}-1}$ is the percentage point of the χ^{2} distribution given the confidence level of α and the degrees of freedom of n_{i}−1. We select a pair of α levels, 90 % and 95 %; hence we use four percentage points of the χ^{2} distribution at 0.025, 0.05, 0.95, and 0.975 to construct the respective confidence intervals. Because the 37-year RCoV is an estimate of the truth, which is the wind-speed RCoV of infinite years, its singular value does not yield any variance or possess any distribution shape. Thus, to construct the confidence interval of the σ of the truth, we set the predetermined σ_{37} as a fraction of the 37-year RCoV. Particularly, the σ_{37}'s are 10 % and 5 % of the 37-year RCoV for the 90 % and 95 % confidence levels, respectively.
In summary, for each grid point, we first determine an uncertainty bound based on the 37-year wind-speed RCoV of the location: we assign a 37-year σ, which is 10 % of the 37-year RCoV for the 90 % confidence level or 5 % of the 37-year RCoV for the 95 % confidence level. For each year i, from 1 to 37 years, we calculate the pairs of χ^{2}-derived σ's of year i, which represent the lower and upper bounds of the confidence interval. When both of the χ^{2}-derived σ's become smaller than the predetermined 37-year σ, year i becomes the minimum length of data required to calculate RCoV effectively at the specific confidence level. We analyze the wind-speed RCoV via both monthly mean and annual-mean wind speeds. We label the resultant minimum length of wind-speed data based on the χ^{2} method as the convergence year, in contrast to the asymptote period, which determines the asymptote year of correlation coefficients.
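For orientation, the textbook two-sided confidence interval on the variance of a Gaussian population, on which a criterion of this kind presumably builds, is written out below. This is the standard interval and not necessarily the exact form of the criterion used in the study. Here χ^{2}_{a, ν} denotes the upper-a percentage point with ν degrees of freedom, and the significance level a equals 0.10 or 0.05 for the 90 % or 95 % confidence level (so the percentage points fall at 0.05 and 0.95, or at 0.025 and 0.975). The bound symbols σ_{i,lower} and σ_{i,upper} are introduced here only for illustration.

```latex
\frac{(n_i - 1)\,\sigma_i^{2}}{\chi^{2}_{a/2,\;n_i - 1}}
\;\le\; \sigma^{2} \;\le\;
\frac{(n_i - 1)\,\sigma_i^{2}}{\chi^{2}_{1 - a/2,\;n_i - 1}},
\qquad
\sigma_{i,\mathrm{lower}} = \sqrt{\frac{(n_i - 1)\,\sigma_i^{2}}{\chi^{2}_{a/2,\;n_i - 1}}},
\quad
\sigma_{i,\mathrm{upper}} = \sqrt{\frac{(n_i - 1)\,\sigma_i^{2}}{\chi^{2}_{1 - a/2,\;n_i - 1}}}
```

Under this reading, the convergence year would be the smallest i for which both σ_{i,lower} and σ_{i,upper} are smaller than the predetermined σ_{37}; because σ_{i,upper} is the larger of the two, the upper bound is the binding condition.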
3.1 Case studies: Oregon and Texas sites
We select two sites from two different geographical regions with considerable wind-energy deployment, the southern Plains and the Pacific Northwest in the United States, to contrast the results of various variability metrics. Based on the site-specific regressions, we extend the monthly energy-production time series to 37 years (Fig. 3a and b) for the two sites. Both sites pass the R^{2} filter at 0.75 and the r filter at 0.8. Although the OR site is farther from the closest MERRA-2 grid point in a region with more complex terrain, the resultant R^{2} (0.87) and predicted–actual-energy Pearson's r (0.91) are larger than those of the TX site (0.79 and 0.81, respectively) (Table 2). The 37-year-average wind speed of about 7.6 m s^{−1} at the TX site is larger than that of the OR site at about 6.8 m s^{−1} (Table 2). Additionally, the 12-month-lag autocorrelations demonstrate that the annual cycle of monthly wind speeds of the TX site is stronger than that of the OR site, yet the autocorrelations of the sites, 0.53 and 0.32, are still lower than the CONUS median of 0.58 (Table 2).
None of the monthly and annual wind-speed distributions of the sites are perfectly Gaussian. According to the kurtosis, skewness, and YKI values of the monthly mean wind speeds (Table 2), the monthly wind-speed distribution at the OR site skews towards lower wind speeds with more and stronger extremes (Fig. 3c). The skewed distribution at the OR site leads to 71.2 % of the monthly wind speeds located within 1σ from the mean, compared to the classic Gaussian of 68.3 %. Nevertheless, although the TX site monthly wind-speed distribution is very close to symmetric with fewer outliers (Fig. 3d), which is supported by near-zero skewness and YKI (Table 2), only 64.6 % of monthly data fall within 1σ from its mean. For annual-mean wind speeds, the averaging with a 12-month time span at both sites reduces the ranges and thus leads to kurtosis close to −1 (Table 2). Although the skewness and YKI are close to 0 (Table 2), only 59.5 % and 56.8 % of the annual-mean wind speeds fall within 1σ from the means of the OR and TX sites, respectively.
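The within-1σ comparison above amounts to a simple count, sketched here with the sample standard deviation (an assumption, as the text does not say whether the sample or population form is used). The function names are hypothetical, and the Gaussian reference follows from the error function.

```python
import math
import statistics

def fraction_within_one_sigma(x):
    """Share of values lying within one standard deviation of the mean."""
    mu = statistics.mean(x)
    sd = statistics.stdev(x)
    return sum(1 for v in x if abs(v - mu) <= sd) / len(x)

def gaussian_fraction_within(k=1.0):
    """Probability mass of a Gaussian within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))
```

`gaussian_fraction_within()` evaluates to about 0.683, the 68.3 % benchmark against which the site percentages are compared.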
The four selected variability methods yield monthly variabilities that are close to the respective CONUS medians based on the 37-year monthly data. For variabilities of monthly wind speeds, the differences between the two sites are slight, and the four metrics do not consistently rank one site above the other (Table 2). However, results from the normalized spread metrics (RCoVs, range divided by trimean, and CoV) using the 37-year and the observed energy production illustrate that the OR site generates more variable wind power than the TX site (Table 2). The magnitudes of the variabilities between the 37-year and the actual monthly energy production are also comparable, although the discrepancies between them are larger at the TX site than the OR site. Nonetheless, the predicted and the observed monthly energy production of the two sites demonstrate similar variability characteristics overall.
Moreover, when we apply the four selected methods to the annual-mean data, the metrics describe IAV in the strict sense. For both variables, wind speed and energy generation, nearly all metrics illustrate that the OR site has stronger IAV than the TX site, except for using σ to quantify energy-production IAV (Table 2). Echoing the results of the monthly data mentioned previously, the use of normalized metrics suggests the energy production at the OR site varies more than that at the TX site, intermonthly and interannually. Note that all the IAVs are smaller than the variabilities calculated using monthly data (Table 2), because the annual averaging collapses variations in the data.
Additionally, the magnitudes of energy variabilities and IAVs are also nearly or more than twice as large as those of wind speed (Table 2). The reason is the nature of the power curve: wind-power generation is a function of wind speed cubed at wind speeds below rated. Therefore, small wind-speed variations propagate into large energy-production fluctuations that are discernible in monthly and yearly data.
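A first-order propagation makes this amplification explicit. The derivation below assumes an idealized cubic relationship below rated wind speed, with a constant of proportionality c; it is a sketch and not a model of any specific turbine.

```latex
P = c\,u^{3}
\;\;\Rightarrow\;\;
\frac{\mathrm{d}P}{P} = 3\,\frac{\mathrm{d}u}{u},
\qquad\text{e.g.}\quad
\frac{P(1.05\,u)}{P(u)} = 1.05^{3} \approx 1.158 .
```

A 5 % change in wind speed thus maps to a change of roughly 16 % in power under the cubic assumption. A monthly or annual amplification somewhat below the factor of 3 would be plausible because the power curve flattens at and above rated wind speed.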
3.2 Variability metrics comparisons
Matching the wind-speed and energy variabilities over 37 years at each r-filtered site, RCoV, as a statistically robust and resistant metric, yields the highest Pearson's r (0.86) among the four highlighted methods as well as all the variability metrics evaluated (Fig. 4 and Table B1). A perfect variability measure would link wind-speed and wind-power variations closely together with a correlation of unity, and so RCoV, with the highest Pearson's r, performs best among the metrics tested. On the one hand, a strong correlation between the wind-speed RCoV and the energy-production RCoV implies that the high wind-speed variability at a wind farm translates to high energy-generation variability, and vice versa (Fig. 4a). For instance, the moderate 37-year wind-speed RCoVs of the OR and TX sites indicate modest fluctuations in energy production between months (Fig. 4a). On the other hand, a nonresistant method, range divided by trimean, leads to a lower r (0.64) and suggests the OR site has variable wind speed and energy production (Fig. 4b). For the other two nonrobust and nonresistant methods, the CoV results in a modest r (0.70) with a similar scatter as the RCoV (Fig. 4c); the σ, not normalized by an average metric, does not relate wind-speed and energy variabilities effectively (Fig. 4d). The positions of the two wind farms relative to the rest of the sites in Fig. 4 illustrate that the TX site experiences average variabilities in wind resource and energy production, whereas the OR site has above-average energy-generation variability. Overall, the four methods lead to different representations of energy variability at the OR site.
As more years are included in the variability calculations using monthly data, the resultant correlations of most metrics vary less and gradually converge to their 37year values, although the asymptote periods differ between metrics. The 37year Pearson's r values from the four selected metrics between windspeed and energyproduction variabilities in Fig. 4 correspond to the 37year marks in Fig. 5, and we use a 5 % threshold of normalized deviation to determine the asymptote periods. In particular, the r's from RCoV and CoV (Fig. 5a and c) reach their respective asymptotes steadily with longer records, whereas the r's from range divided by trimean do not (Fig. 5b). Although the r's using σ approach the 37year benchmark (Fig. 5d), that benchmark is so low (0.2) that the metric is ineffective. Paired with a high longterm r, the asymptote period of a metric indicates the appropriate time span of windspeed data required to represent the variability of windenergy production. For example, the resultant r's using RCoV approach a high value after just 3 years, meaning one needs 3 years of windspeed data to estimate the windspeed variability so as to adequately infer the energyproduction variability of an existing or potential wind farm via RCoV.
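The asymptote-period idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `asymptote_period`, the use of a single representative r per record length, and the exact form of the normalized deviation (absolute difference from the long-term r, divided by the long-term r) are our assumptions rather than the procedure documented in the study. The r values are invented.

```python
def asymptote_period(r_by_years, threshold=0.05):
    """Smallest record length after which r stays within `threshold` of the final value.

    r_by_years[i] is the correlation obtained with i + 1 years of data, and the
    last entry is treated as the long-term benchmark (assumed to be nonzero).
    """
    benchmark = r_by_years[-1]
    period = len(r_by_years)
    # Walk backwards from the full record while the normalized deviation stays small.
    for n_years in range(len(r_by_years), 0, -1):
        deviation = abs(r_by_years[n_years - 1] - benchmark) / abs(benchmark)
        if deviation > threshold:
            break
        period = n_years
    return period


# Hypothetical correlations for 1 to 7 years of data (not values from the study).
r_values = [0.60, 0.75, 0.83, 0.85, 0.84, 0.86, 0.86]
period = asymptote_period(r_values)
print(f"asymptote period: {period} years")
```

Walking backwards from the longest record means that a short record that happens to land near the benchmark by chance does not count unless every longer record also stays within the threshold.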
The three correlation coefficients (Pearson's r, Spearman's r_{s}, and Kendall's τ) yield consistent results among all variability metrics tested; hence we primarily present the results using Pearson's r here. Table 3 summarizes the 37year correlations (r, r_{s}, and τ) between the windspeed variabilities and the energyproduction variabilities using the rfiltered data, and the respective asymptote periods of the methods. The r and τ of RCoV are the largest (0.86 and 0.67, respectively) among all variability metrics, and the associated asymptote periods are also relatively short (2 to 3 years) (Table 3). Another normalized, robust, and resistant spread metric, interquartile range (IQR) divided by median, results in the highest r_{s}, and the r_{s} of RCoV is the second largest (Table 3). More importantly, the asymptote periods of RCoV are the smallest of all, regardless of the choice of correlation coefficient. In other words, fewer years of data are necessary to calculate RCoV to effectively relate windspeed and energy variabilities than any other metric. Overall, when a spread metric yields strong correlations between variabilities of wind speed and energy generation, the correlation metrics agree with each other (Table 3). Therefore, the results in this paper focus on Pearson's r, which is a commonly used correlation coefficient.
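For readers who want to see how the three coefficients differ, the following standard-library sketch computes each one for a handful of site-level variabilities. The numbers are invented for illustration and the function names are ours. The Kendall variant shown is tau-a, which ignores tie corrections; we do not know which variant the study used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance normalized by the two standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Average 1-based ranks, so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman_rs(x, y):
    """Spearman's r_s: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical wind-speed and energy-production RCoVs at five sites.
wind_rcov = [0.05, 0.07, 0.08, 0.09, 0.12]
energy_rcov = [0.11, 0.13, 0.20, 0.17, 0.30]

r = pearson_r(wind_rcov, energy_rcov)
rs = spearman_rs(wind_rcov, energy_rcov)
tau = kendall_tau(wind_rcov, energy_rcov)
print(f"r = {r:.3f}, r_s = {rs:.3f}, tau = {tau:.3f}")
```

Pearson's r measures linear association, whereas r_{s} and τ depend only on the ordering of the sites, so a single swapped pair lowers the rank-based coefficients in fixed steps regardless of how large the swapped values are.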
In addition to the spread metrics, other distribution diagnostics also yield strong correlations between the 37year monthly wind speed and energy production. For example, kurtosis and skewness result in r and r_{s} above 0.9. Because we determine the asymptote periods based on normalized deviations, when the 37year correlation benchmark of a metric is high, the respective asymptote period tends to be shorter. Therefore, only 1 year of monthly data is required to compute kurtosis and skewness adequately, except for using r_{s} in kurtosis, where those r_{s}'s of the smaller number of years are low (Table 3). Moreover, the symmetry and the shape of the energyproduction distribution can be characterized using windspeed data, given the moderately strong correlations of YKI and the Weibull shape parameter (Table 3).
We also perform the same correlation and asymptote analyses on the data from changing the R^{2} and r filter thresholds as well as the data with random error, and RCoV again yields the strongest correlations and the shortest asymptote periods among all methods. We adjust the R^{2} and r requirements in the linearregression process, thus changing the filtered sample sizes. On the one hand, reducing the R^{2} threshold to 0.6 and the r threshold to 0.7 increases the respective sample sizes to 461 and 306 wind farms, but weakens the correlations between windspeed and energy variabilities for all methods (Table B3). On the other hand, increasing the R^{2} threshold to 0.85 and the r threshold to 0.9 strengthens the windspeed–energy correlations of all the metrics and shrinks the sample sizes to 212 and 83 wind farms, respectively (Table B3). Modifying the filtering thresholds leads to different r's yet similar asymptote periods among all metrics. Moreover, we test the rigor of our findings by introducing an error term, randomized based on the standard error, in predicting the 37year energy production. The error term adds uncertainty to resemble the reality of noisy windspeed and powerproduction data. We introduce the error term to the predicted energy production for each of the 349 wind farms that pass the original R^{2} threshold of 0.75. This approach weakens the correlations and lengthens the asymptote periods for most metrics (Table B3). Overall, according to the results from the R^{2}–r threshold and the random error tests, RCoV yields the highest r's among all methods, and its asymptote periods remain reasonably short.
Further, normalized and simple spread metrics yield different relative windspeed variabilities between wind sites. On the one hand, the correlation coefficients between 37year monthly mean windspeed RCoV and CoV, two spread metrics that are normalized by average metrics, are nearly unity (Fig. 6a). The comparison between two simple spread metrics, MAD and σ, also results in correlation coefficients close to 1 (Fig. 6d). The relative positions of the OR site highlight the differences between Fig. 6a and d: compared to other wind farms, the OR site has moderate windspeed RCoV and CoV, but small MAD and σ. Compared to Fig. 6a, the lower r_{s} and τ in Fig. 6d illustrate that MAD and σ can misrepresent the relative windspeed variabilities of a wind site. On the other hand, the results between a normalized spread metric (RCoV and CoV) and the respective simple spread metric (MAD and σ), which is also the numerator of the normalized spread metric, lead to weaker correlations (Fig. 6b and c). The r, r_{s}, and τ between 37year monthly windspeed RCoV and σ are 0.684, 0.738, and 0.579, respectively (not shown). The wind sites with slower average wind speeds and thus disproportionately larger normalized spread results cause the deviations from perfect correlations in Fig. 6b and c. Therefore, normalized spread metrics, which account for the differences in windspeed magnitude, become advantageous over simple spread metrics in comparing variabilities of wind sites. Note that we demonstrate similar comparisons between windspeed spread metrics via annualmean data in Fig. A2 (Appendix A).
Meanwhile, using annualmean data to compute IAVs can lead to misleading interpretations. Scatterplots of the 37year windspeed and energy IAVs similar to Fig. 4 are illustrated in Fig. A1, via the same 195 rfiltered sites. The correlations via yearly averages are generally weaker except for a few metrics, including range divided by mean, which yields the largest r of all (Table B4). However, the 37year correlations do not adequately represent the longterm values (Table B4), so even though the resultant asymptote periods are longer than those using monthly data, the asymptote analysis method is unsuitable for annual data. Moreover, using annual averages greatly limits the sample size at each site even with 37 years of hourly windspeed data. Statistically, a smaller sample leads to a smaller spread of that distribution. Accordingly, with few years of data, small spreads in annualmean wind speeds result in a tight cluster of IAVs among all the wind farms. Therefore, the compact collection of windspeed and energyproduction IAVs causes strong correlations, solely because of the small number of annual averages used in the IAV calculation. Thus, the correlations via annual means demonstrate a downward trend with increasing length of data, regardless of the variability metrics chosen (Fig. 7). Although the correlations approach the 37year values, the weakening correlations with more years included in the IAV calculations imply that using less data is preferred in connecting the two IAVs. Note that the spread cannot be computed with one data point and hence the correlations between windspeed IAVs and energy IAVs do not exist with a single year of data (Fig. 7). Overall, the asymptote analysis causes deceptive results, and, given the nature of the annual data, we cannot determine the sufficient length of data to effectively link the IAVs of wind speed and energy production. In other words, relating windspeed IAV and energygeneration IAV with annualmean data is flawed.
3.3 Windspeed RCoV calculation and spatial distribution
Now that we have established that RCoV is a powerful and accurate way to relate windspeed and energygeneration variations, we assess the required amount of data to calculate the RCoV of wind speed. We compute the sitespecific RCoVs using different spans of monthly mean wind speeds, including the OR and the TX sites (Fig. 8). The variations of RCoVs decrease as more years are included in the calculations, and for each location we use the 37year windspeed RCoV as the longterm benchmark. For example, the 37year windspeed RCoV of 0.082 at the OR site means that the median among the absolute deviations from the median is 8.2 % of the median monthly mean wind speed (Fig. 8a and Table 2). We determine the 37year σ's as 10 % and 5 % of the 37year RCoV, and we apply the χ^{2} approach at 90 % and 95 % confidence levels, respectively, to derive the convergence years, or the minimum length of windspeed data required to calculate RCoV effectively. The convergence years of the OR and TX sites are 12 and 25 years with a 90 % confidence, and 20 and 31 years with a 95 % confidence, respectively (Table B5). In other words, for the OR site, one needs 12 years of monthly mean wind speeds to compute RCoV with a 90 % confidence that the resultant RCoV is within a 10 % deviation from the 37year RCoV.
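The RCoV interpretation above can be made concrete with a short standard-library sketch. The monthly wind speeds are invented for illustration and are not data from the study. No scaling constant is applied to the MAD here, following the plain definition given in the text (median absolute deviation from the median, divided by the median).

```python
from statistics import median


def rcov(values):
    """Robust coefficient of variation: MAD about the median, divided by the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return mad / med


# Hypothetical monthly mean wind speeds (m/s) for one year.
monthly_speeds = [6.0, 6.5, 7.0, 7.5, 8.0, 7.0, 6.0, 6.5, 7.0, 7.5, 8.0, 7.0]

value = rcov(monthly_speeds)
print(f"median = {median(monthly_speeds):.1f} m/s, RCoV = {value:.3f}")
```

For this toy series the median is 7.0 m/s and the MAD is 0.5 m/s, so the RCoV is 0.5/7.0, or about 7.1 % of the median monthly mean wind speed.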
To quantify the intermonthly variability of wind speed at a wind farm, RCoV requires 10 years of monthly windspeed records with a 90 % confidence. In general, the σ's of windspeed RCoVs across the CONUS decrease with more years included in the RCoV calculation (Fig. 9a). For each grid point, the sample size of RCoV also becomes smaller, from 37 RCoVs of 1 year of data to 1 RCoV of 37 years of data, and hence the σ of RCoV decreases as the length of the analysis period of wind speed increases (Fig. 9a). With the σ's of RCoVs across 37 years, we determine the convergence years via the χ^{2} method. For a certain confidence level, the cumulative fraction of the CONUS grid cells that exceed the associated threshold of χ^{2}derived confidence intervals increases with the length of data (Fig. 9b). Among all of the MERRA2 grid cells in the CONUS, the median convergence year is 10 years and the associated MAD is 3 years at a 90 % confidence level (Fig. 9b and Table B5). In other words, to assess the windspeed variability via RCoV with a maximum of 10 % error from the longterm value and a 90 % confidence, one needs 10±3 years of monthly mean windspeed records.
Moreover, raising the confidence level extends the minimum length of windspeed data to compute RCoV. At the 95 % confidence level, the median convergence year is 20 years, and 2.5 % of grid points in the CONUS require more than 37 years of monthly mean data to calculate RCoV (Fig. 9b and Table B5). Additionally, using yearly mean wind speeds instead of monthly data to calculate RCoV requires much longer time to reach convergence. At a 95 % confidence, 33 years of annualmean data is the average required length, and half of the CONUS grid points have convergence years of more than 37 years (Fig. 9b and Table B5). We also perform the same analysis on CoV and σ of wind speeds (Table B5). Although CoV and σ need fewer years to attain convergence, these nonrobust and nonresistant methods yield worse correlations between windspeed and energyproduction variabilities than RCoV, and hence we focus on demonstrating the RCoV results.
Spatial distributions of windspeed RCoVs across the CONUS identify locations with reliable wind resources. Based on the sitespecific convergence years at a 90 % confidence level (Fig. 10a), we calculate the RCoVs with monthly mean wind speeds of the particular time spans at each grid point and normalize with the CONUS median (Fig. 10b). Regions requiring long windspeed records are irregularly scattered across the continent, such as the Northeast, the Dakotas, and Texas. The mountainous states generally illustrate high RCoVs, including the Appalachians and the Rockies. Given the strong correlations between the windspeed RCoV and energyproduction RCoV, Fig. 10b offers a realistic estimation of the general spatial pattern of the variability in windenergy production as well. Note that, qualitatively, Fig. 10b is similar to the maps of windspeed variability in Fig. 13a of Gunturu and Schlosser (2012) and in Fig. 3 in Hamlington et al. (2015), which also illustrate the variability of wind resources in the CONUS. In addition, using a 10year fixed length of windspeed data for all CONUS grid points to compute RCoV results in a nearly identical spatial distribution to the pattern in Fig. 10b.
Further, an ideal location for wind farms should exhibit ample wind speeds with low variability. We combine the spatial variations of the normalized RCoV and the longterm wind resource (Fig. 10b and c), and we differentiate regions according to the CONUS median RCoV and wind speed (Fig. 10d). Favorable candidates for wind farm development have aboveaverage wind speeds and belowaverage variabilities, such as the Plains, parts of the upper Midwest, spots in the Columbia River region, and pockets near the coasts of the Carolinas; poor places for wind power with weak winds and strong variabilities include the Appalachians and most of the Northeast.
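The four-way classification relative to the medians can be sketched as follows. The site names, wind speeds, and RCoVs are invented, and the label wording and the handling of values exactly equal to the median are our assumptions, not the scheme used for Fig. 10d.

```python
from statistics import median


def classify_sites(sites):
    """Label each site relative to the median wind speed and median RCoV.

    `sites` maps a site name to (long-term mean wind speed, wind-speed RCoV).
    """
    speed_median = median(speed for speed, _ in sites.values())
    rcov_median = median(rcov for _, rcov in sites.values())
    labels = {}
    for name, (speed, rcov) in sites.items():
        wind = "strong wind" if speed > speed_median else "weak wind"
        variability = "low variability" if rcov < rcov_median else "high variability"
        labels[name] = f"{wind}, {variability}"
    return labels


# Hypothetical grid points: (mean wind speed in m/s, RCoV).
sites = {
    "A": (8.5, 0.06),
    "B": (7.9, 0.11),
    "C": (5.2, 0.12),
    "D": (5.8, 0.07),
    "E": (6.9, 0.09),
}

labels = classify_sites(sites)
for name, label in labels.items():
    print(name, label)
```

In this toy set, site A falls in the favorable quadrant (strong wind, low variability) and site C in the unfavorable one (weak wind, high variability).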
The convergence years in some CONUS grid points are beyond 37 years when we increase the confidence level from 90 % to 95 % (Fig. 9b and Table B5), and those grid points do not demonstrate any geographical pattern as in Fig. 10a. Additionally, when using RCoV to represent IAV, the spatial patterns of required data lengths and the resultant normalized RCoVs for annual data are notably different from the monthly mean results, and geographical features seem to be irrelevant (Fig. A3). Furthermore, the categorical features of CoV resemble those of RCoV for onshore wind resources in the CONUS, whereas using σ results in notably distinct classifications of CONUS wind resources (Figs. 10d and A4).
When using statistically robust and resistant variability metrics, higher correlations between variabilities of wind speed and energy production emerge. Statistically robust methods do not assume or require any underlying windspeed distributions, and statistically resistant methods are insensitive to windspeed extremes. Of all methods, three robust and resistant metrics, RCoV, MAD divided by trimean, and IQR divided by median, result in the largest three r's in Tables 3 and B1, which suggests that they are the most useful metrics to quantify longterm variability. Depending on the meteorological data availability, windspeed characteristics, and terrain complexity, different methods are appropriate in different conditions. Nevertheless, robust and resistant methods are best able to relate windspeed variability and energygeneration variability, and RCoV is the most effective of all the metrics.
Overall, of all the methods we considered, RCoV consistently yields the strongest correlations between windspeed and energy variabilities and exhibits reasonable asymptote periods (Tables 3 and B1), even after accounting for random standard errors and modifying the R^{2} and r thresholds (Table B3). In addition, assessing windspeed RCoV with a 90 % confidence requires 10±3 years of windspeed data (Fig. 9 and Table B5), which exceeds the asymptote periods of 2 to 3 years to yield strong windspeed and energyproduction correlations (Table 3). Even though different locations require various spans of data (Fig. 10a), the average of the resultant RCoVs using 10 years of wind speeds leads to nearly identical spatial distributions (Fig. 10b). Therefore, to effectively quantify windspeed variability and thus adequately derive energygeneration variability, we recommend using the RCoV with 10 years of monthly mean windspeed data.
Annualmean data are inadequate to relate windspeed and energyproduction IAVs or to represent windspeed IAVs. We cannot determine the minimum years of data to relate annual windspeed and energy IAVs because their correlations decline with the length of data (Fig. 7). Moreover, the coarse time resolution of annual averages smooths out the fluctuations of smaller timescales. Yearly mean wind speeds also possess different distribution characteristics, such as skewness and kurtosis, compared to those of finer temporal resolutions (Lee et al., 2018). The nonzero kurtosis and skewness in Table 2 and in Lee et al. (2018) illustrate that most of the distributions of annualmean wind speeds in the CONUS are nonGaussian. Hence, using nonrobust metrics, such as σ, to evaluate IAV with samples of annual means from nonGaussian distributions can lead to incorrect representations of variability.
Additionally, extended years of windspeed data are also necessary to compute RCoV and represent IAV (Fig. A3a), and the resultant IAVs (Fig. A3b) differ from the variabilities calculated via monthly wind speeds (Fig. 10b). For instance, the low IAVs in the Appalachians (Fig. A3b) calculated with yearly mean wind speeds contradict the pattern of high monthly mean windspeed RCoVs in mountainous areas (Fig. 10b) as well as the findings in past research (Gunturu and Schlosser, 2012; Hamlington et al., 2015). Furthermore, some of the grid points require more than 37 years of yearly mean data to calculate windspeed RCoV with statistical confidence (Fig. 9 and Table B5). Although RCoV does not yield the strongest 37year r in relating windspeed and energy IAVs, readers should be cautious when using a limited number of annualmean data to derive IAVs. In short, to effectively assess the longterm variability of wind farm productivity, one should use wind speeds finer than yearly mean data.
Regions with ample wind resources and low variability favor windenergy developments, coinciding with the locations of many existing wind farms in the CONUS (Fig. 10d). Wind farms in the Plains and parts of the upper Midwest benefit from the aboveaverage wind speeds and the belowaverage windspeed RCoVs. Other regions, such as parts of the Columbia River region and the Carolinas, also experience strong, consistent winds. The Northeast and the Appalachians are relatively unfavorable for producing a stable, onshore windenergy supply, whereas the area east of Cape Cod in Massachusetts and the sections along the West Coast exhibit a promising offshore wind resource. Wind farm developers should account for wind resource as well as its longterm variability in repowering existing turbines and building new wind farms.
Furthermore, mathematically, a normalized spread metric, namely a spread statistic divided by an average metric, is more useful than solely a spread metric in assessing variability, and a normalized spread metric should always be presented with the corresponding averaging metric. For example, RCoV and CoV between wind speed and energy yield larger r's than MAD and σ (Table 3 and Fig. A1), and the r's between windspeed RCoV and CoV are also higher than those comparisons involving MAD and σ (Fig. 6). For σ, the root mean square of the deviation from the mean is not statistically robust or resistant, and the interval within 1σ of the mean contains about 68.3 % of the data only when the distribution is Gaussian. Hence, CoV, or the σ divided by the mean, is the respective normalized uncertainty metric to σ. For instance, the windspeed CoVs of both the OR and TX sites are about 0.13 (Table 2), implying the σ is 13 % of the mean. In contrast, RCoV, or the MAD divided by the median, is a robust and outlierresistant metric of normalized uncertainty. For example, the windspeed RCoVs of the OR and TX sites are 0.08 and 0.09, respectively (Table 2), indicating the MADs are 8 % and 9 % of their median wind speeds. Even though RCoV is not as commonly used and not as intuitive as σ or CoV, RCoV is unrestricted by any underlying distribution assumptions. Overall, to correctly and effectively use the normalized spread metrics, both the normalized spread metric and the average value need to be stated clearly in pairs. In other words, the interpretation of “the variability is 2 %” oversimplifies the statistics of uncertainty quantification. Therefore, we recommend presenting both the RCoV and the median of a time series together in estimating variability.
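The difference in resistance between CoV and RCoV can be seen in a small experiment. The two series below are invented: the second is the first with one month replaced by an extreme value. The population standard deviation (`statistics.pstdev`) is used for σ to match the root-mean-square description above; whether the study used the population or the sample form is not stated here, so that choice is an assumption.

```python
from statistics import mean, median, pstdev


def cov(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


def rcov(values):
    """Robust coefficient of variation: MAD about the median, divided by the median."""
    med = median(values)
    return median(abs(v - med) for v in values) / med


# Hypothetical monthly mean wind speeds (m/s).
baseline = [6.0, 6.5, 7.0, 7.5, 8.0, 7.0, 6.0, 6.5, 7.0, 7.5, 8.0, 7.0]
with_extreme = baseline[:-1] + [14.0]  # last month replaced by an extreme value

for name, series in (("baseline", baseline), ("with extreme", with_extreme)):
    print(f"{name:>12}: CoV = {cov(series):.3f}, RCoV = {rcov(series):.3f}, "
          f"median = {median(series):.1f} m/s")
```

In this example the single extreme month more than doubles the CoV, whereas the RCoV and the median are unchanged. Printing the median next to the RCoV follows the recommendation to report the normalized spread together with its average metric.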
Distribution diagnostics, other than the variability metrics, are also effective in identifying the characteristics of windenergy production. We examine distribution parameters resulting in strong windspeed–energy correlations, including kurtosis and YKI (Tables 3 and B2), which assess the degree of deviations from a Gaussian distribution. For example, we confirm that the monthly and annual windspeed distributions for our case studies in OR and TX are not perfectly Gaussian because of their nonzero kurtosis and skewness values (Table 2), as well as their portions of data within 1σ. Moreover, a multimodal or an asymmetric windspeed distribution (Fig. 3c and d) also implies a nonGaussian energyproduction distribution. In general, a Gaussian distribution is invalid for wind speeds across averaging timescales (Lee et al., 2018). Hence, understanding the underlying distribution of wind resources can validate the applications and the legitimacy of Gaussian statistics, especially in quantifying P50 and the associated losses and uncertainties.
Windspeed variability is a crucial component in assessing the overall uncertainty of P50, which is the estimated average energy production of a wind farm. This study highlights the importance of using rigorous methods to estimate intermonthly and interannual variability. To search for suitable ways to quantify this uncertainty under different conditions, we investigate 27 combinations of spread metrics over 607 wind farms in the United States, with closer examination of two geographically distinct sites. We evaluate the methods for robustness to nonGaussian distributions and resistance to extreme values, in contrast to the common practice of using only standard deviation (σ). We calculate variabilities using monthly and annual mean wind speeds from the MERRA2 reanalysis data set and wind farm monthly net energy production from the EIA. We find that within the contiguous United States (CONUS), statistically robust and resistant methods predict variabilities more accurately, particularly in that windspeed variabilities strongly correlate with observed energyproduction variabilities.
We recommend using the robust coefficient of variation (RCoV) to quantify variabilities of wind resource and energy production. RCoV, defined as the median of the absolute deviations from the median wind speed divided by the median wind speed, is a robust and resistant spread metric, in contrast to σ. RCoV yields strong correlations consistently (a Pearson's correlation coefficient, or a Pearson's r, of 0.856 with 37 years of monthly means) in various sensitivity tests via different correlation coefficients, whereas σ does not. In other words, using RCoV, a wind farm with high windspeed fluctuations also possesses high variations in windenergy generation and vice versa, whereas other metrics do not reflect that relationship as effectively. RCoV, as a normalized spread metric, also leads to a more accurate depiction of windspeed variabilities than σ, a simple spread metric. Contrary to the custom of displaying uncertainty in one percentage value, we advise users to assess both the RCoV and the median in estimating intermonthly variability. Moreover, depending on the location, on average 10±3 years of monthly windspeed data are necessary to compute windspeed RCoV with a 90 % statistical confidence, such that the resultant RCoV deviates within 10 % of the longterm RCoV.
RCoV characterizes the spreads of the distributions of wind resources and windenergy production. The relatively low monthly mean windspeed RCoVs in the central United States indicate stable longterm wind resources, and the RCoV overall spatial distribution in the CONUS agrees with the findings from past research. Other distribution diagnostics, such as kurtosis and skewness, also result in strong correlations between monthly mean wind speed and energy generation, and thus they adequately represent energyproduction characteristics.
Because the longterm correlations between the windspeed and energyproduction interannual variabilities (IAVs) are weak (a Pearson's r of 0.668 for RCoV with 37 years of data) and decrease with the length of data, we cannot determine the minimum length of annual mean data required for skillful assessment of IAV. Hence, we do not recommend calculating IAVs with annualmean data. Although the concept of IAV has been essential in determining the annual energy production in the wind resource assessment process, annualmean wind speeds mask signals of finer temporal scales and thus lead to unreliable representations of longterm variability. Overall, uncertainty arises in the process of calculating IAVs based on limited samples, whereas RCoV yields credible intermonthly variabilities considering the adequate amount of monthly mean data.
Now that we have highlighted the preferred structure of using RCoV, we can assess finerscale variations using highresolution windspeed and energyproduction data. With data of different temporal scales, the autocorrelation of wind resources and its relationship with longterm energyproduction variations can also be quantified. The influence of climatic cycles on energy production can be explored. Furthermore, applying the concept of RCoV to reduce the uncertainty of P50 and assist financial decisions can be beneficial to the industry.
The MERRA2 data and the EIA data used in this study are publicly available at http://disc.sci.gsfc.nasa.gov/ (last access: 31 October 2017; Gelaro et al., 2017) and http://www.eia.gov/renewable (last access: 31 October 2017).
All authors formulated the research idea and designed the methodology together. JCYL performed the analysis; MJF and JKL provided critical feedback. JCYL prepared the manuscript with contributions from the two coauthors.
Julie K. Lundquist is an Associate Editor of Wind Energy Science. Joseph C. Y. Lee and M. Jason Fields have no conflict of interest.
This work was authored by the National Renewable Energy Laboratory, operated by the Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE), under contract no. DE-AC36-08GO28308. Funding was provided by the U.S. Department of Energy Office of Energy Efficiency and Renewable Energy's Wind Energy Technologies Office. The views expressed in the article do not necessarily represent the views of the DOE or U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
The authors would like to thank our collaborators, Vineel Yettella and Mark Handschy of the Cooperative Institute for Research in Environmental Sciences (CIRES) at the University of Colorado Boulder; our colleagues at NREL, especially Paul Veers; and Cory Jog at EDF Renewable Energy.
Edited by: Christian Masson
Reviewed by: two anonymous referees
Archer, C. L. and Jacobson, M. Z.: Geographical and seasonal variability of the global “practical” wind resources, Appl. Geogr., 45, 119–130, https://doi.org/10.1016/j.apgeog.2013.07.006, 2013.
Baker, R. W., Walker, S. N., and Wade, J. E.: Annual and seasonal variations in mean wind speed and wind turbine energy production, Sol. Energy, 45, 285–289, https://doi.org/10.1016/0038-092X(90)90013-3, 1990.
Bandi, M. M. and Apt, J.: Variability of the Wind Turbine Power Curve, Appl. Sci., 6, 262, https://doi.org/10.3390/app6090262, 2016.
Bett, P. E., Thornton, H. E., and Clark, R. T.: European wind variability over 140 yr, Adv. Sci. Res., 10, 51–58, https://doi.org/10.5194/asr-10-51-2013, 2013.
Bodini, N., Lundquist, J. K., Zardi, D., and Handschy, M.: Year-to-year correlation, record length, and overconfidence in wind resource assessment, Wind Energ. Sci., 1, 115–128, https://doi.org/10.5194/wes-1-115-2016, 2016.
Bosilovich, M. G., Lucchesi, R., and Suarez, M.: MERRA2: File Specification, GMAO Office Note No. 9 (Version 1.1), available at: https://gmao.gsfc.nasa.gov/pubs/docs/Bosilovich785.pdf (last access: 1 August 2017), 2016.
Brower, M. C.: Wind resource assessment: a practical guide to developing a wind project, Wiley, Hoboken, New Jersey, USA, 2012.
Cannon, D. J., Brayshaw, D. J., Methven, J., Coker, P. J., and Lenaghan, D.: Using reanalysis data to quantify extreme wind power generation statistics: A 33 year case study in Great Britain, Renew. Energy, 75, 767–778, https://doi.org/10.1016/j.renene.2014.10.024, 2015.
Chen, L., Li, D., and Pryor, S. C.: Wind speed trends over China: quantifying the magnitude and assessing causality, Int. J. Climatol., 33, 2579–2590, https://doi.org/10.1002/joc.3613, 2013.
Clifton, A., Smith, A., and Fields, M.: Wind Plant Preconstruction Energy Estimates: Current Practice and Opportunities, NREL/TP500064735, National Renewable Energy Laboratory, Golden, Colorado, USA, available at: http://www.nrel.gov/docs/fy16osti/64735.pdf (last access: 19 July 2017), 2016.
Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V, Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., MongeSanz, B. M., Morcrette, J.J., Park, B.K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.N., and Vitart, F.: The ERAInterim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteor. Soc., 137, 553–597, https://doi.org/10.1002/qj.828, 2011.
Gelaro, R., McCarty, W., Suárez, M. J., Todling, R., Molod, A., Takacs, L., Randles, C. A., Darmenov, A., Bosilovich, M. G., Reichle, R., Wargan, K., Coy, L., Cullather, R., Draper, C., Akella, S., Buchard, V., Conaty, A., da Silva, A. M., Gu, W., Kim, G.-K., Koster, R., Lucchesi, R., Merkova, D., Nielsen, J. E., Partyka, G., Pawson, S., Putman, W., Rienecker, M., Schubert, S. D., Sienkiewicz, M., and Zhao, B.: The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1, 2017.
GMAO (Global Modeling and Assimilation Office): MERRA2 tavg1_2d_slv_Nx: 2d, 1Hourly, TimeAveraged, SingleLevel, Assimilation, SingleLevel Diagnostics V5.12.4, Greenbelt, MD, USA, 2015.
Gunturu, U. B. and Schlosser, C. A.: Characterization of wind power resource in the United States, Atmos. Chem. Phys., 12, 9687–9702, https://doi.org/10.5194/acp-12-9687-2012, 2012.
Hamlington, B. D., Hamlington, P. E., Collins, S. G., Alexander, S. R., and Kim, K.Y.: Effects of climate oscillations on wind resource variability in the United States, Geophys. Res. Lett., 42, 145–152, https://doi.org/10.1002/2014GL062370, 2015.
Hdidouan, D. and Staffell, I.: The impact of climate change on the levelised cost of wind energy, Renew. Energ., 101, 575–592, https://doi.org/10.1016/j.renene.2016.09.003, 2017.
Justus, C. G., Mani, K., and Mikhail, A. S.: Interannual and Month-to-Month Variations of Wind Speed, J. Appl. Meteorol., 18, 913–920, https://doi.org/10.1175/1520-0450(1979)018<0913:IAMTMV>2.0.CO;2, 1979.
Klink, K.: Trends and Interannual Variability of Wind Speed Distributions in Minnesota, J. Climate, 15, 3311–3317, https://doi.org/10.1175/1520-0442(2002)015<3311:TAIVOW>2.0.CO;2, 2002.
Krakauer, N. and Cohan, D.: Interannual Variability and Seasonal Predictability of Wind and Solar Resources, Resources, 6, 29, https://doi.org/10.3390/resources6030029, 2017.
Lackner, M. A., Rogers, A. L., and Manwell, J. F.: Uncertainty Analysis in MCP-Based Wind Resource Assessment and Energy Production Estimation, J. Sol. Energy Eng., 130, 31006–31010, https://doi.org/10.1115/1.2931499, 2008.
Leahy, P. G. and McKeogh, E. J.: Persistence of low wind speed conditions and implications for wind power variability, Wind Energy, 16, 575–586, https://doi.org/10.1002/we.1509, 2013.
Lee, J. C.-Y., Fields, M. J., Lundquist, J. K., and Lunacek, M.: Determining variabilities of non-Gaussian wind-speed distributions using different metrics and timescales, J. Phys. Conf. Ser., 1037, 072038, https://doi.org/10.1088/1742-6596/1037/7/072038, 2018.
Li, X., Zhong, S., Bian, X., and Heilman, W. E.: Climate and climate variability of the wind power resources in the Great Lakes region of the United States, J. Geophys. Res., 115, D18107, https://doi.org/10.1029/2009JD013415, 2010.
Lunacek, M., Jason Fields, M., Craig, A., Lee, J. C. Y., Meissner, J., Philips, C., Sheng, S., and King, R.: Understanding Biases in Pre-Construction Estimates, J. Phys. Conf. Ser., 1037, 062009, https://doi.org/10.1088/1742-6596/1037/6/062009, 2018.
Montgomery, D. C. and Runger, G. C.: Applied statistics and probability for engineers, 6th Edn., Wiley, Hoboken, New Jersey, USA, 2014.
Pryor, S. C. and Barthelmie, R. J.: Climate change impacts on wind energy: A review, Renew. Sust. Energ. Rev., 14, 430–437, https://doi.org/10.1016/j.rser.2009.07.028, 2010.
Pryor, S. C., Barthelmie, R. J., and Schoof, J. T.: Interannual variability of wind indices across Europe, Wind Energy, 9, 27–38, https://doi.org/10.1002/we.178, 2006.
Pryor, S. C., Barthelmie, R. J., Young, D. T., Takle, E. S., Arritt, R. W., Flory, D., Gutowski, W. J., Nunes, A., and Roads, J.: Wind speed trends over the contiguous United States, J. Geophys. Res., 114, D14105, https://doi.org/10.1029/2008JD011416, 2009.
Rose, S. and Apt, J.: What can reanalysis data tell us about wind power?, Renew. Energ., 83, 963–969, https://doi.org/10.1016/j.renene.2015.05.027, 2015.
Rose, S. and Apt, J.: Quantifying sources of uncertainty in reanalysis derived wind speed, Renew. Energ., 94, 157–165, https://doi.org/10.1016/j.renene.2016.03.028, 2016.
Torralba, V., Doblas-Reyes, F. J., and Gonzalez-Reviriego, N.: Uncertainty in recent near-surface wind speed trends: a global reanalysis intercomparison, Environ. Res. Lett., 12, 114019, https://doi.org/10.1088/1748-9326/aa8a58, 2017.
Walsh, R. P. D. and Lawler, D. M.: Rainfall seasonality: description, spatial patterns and change through time, Weather, 36, 201–208, https://doi.org/10.1002/j.1477-8696.1981.tb05400.x, 1981.
Wan, Y.-H.: Wind Power Plant Behaviors: Analyses of Long-Term Wind Power Data, NREL/TP-500-36551, National Renewable Energy Laboratory, Golden, Colorado, USA, available at: https://www.nrel.gov/docs/fy04osti/36551.pdf (last access: 19 July 2017), 2004.
Watson, S.: Quantifying the variability of wind energy, WIREs Energy Environ., 3, 330–342, https://doi.org/10.1002/wene.95, 2014.
Watson, S. J., Kritharas, P., and Hodgson, G. J.: Wind speed variability across the UK between 1957 and 2011, Wind Energy, 18, 21–42, 2015.
Wilks, D. S.: Statistical methods in the atmospheric sciences, Academic Press, Amsterdam, the Netherlands, 2011.