Measure–correlate–predict (MCP) approaches are often used to correct wind measurements to the long-term wind conditions on-site. This paper investigates systematic errors in MCP-based long-term corrections which occur if the measurement on-site covers only a few months (seasonal biases). In this context, two common linear MCP methods, namely variance ratio (VR) and linear regression with residuals (LR), are tested and compared with regard to accuracy in mean, variance, and turbine energy production. Wind measurement data from 18 sites with different terrain complexity in Germany are used (measurement heights between 100 and 140 m). Six different reanalysis data sets serve as the reference (long-term) wind data in the MCP calculations. All these reanalysis data sets showed an overpronounced annual course of wind speed (i.e., wind speeds too high in winter and too low in summer). Despite the mathematical similarity of the two MCP methods, these errors in the data resulted in very different seasonal biases depending on whether the VR or the LR method was used for the MCP calculations. In general, the VR method produced overestimations of the mean wind speed when measuring in summer and underestimations in the case of winter measurements. The LR method, in contrast, predominantly led to opposite results. An analysis of the bias in variance did not show such a clear seasonal variation. Overall, the variance error plays only a minor role for the accuracy in energy compared to the error in mean wind speed. Besides the experimental analysis, a theoretical framework is presented which explains these phenomena. This framework enables us to trace the seasonal biases to the mechanics of the methods and the properties of the reanalysis data sets.
In summary, three aspects are identified as the main influential factors for the seasonal biases in mean wind speed: (1) the (dis-)similarity of the real wind conditions on-site in correlation and correction period (representativeness of the measurement period), (2) the capability of the reference data to reproduce the seasonal course of wind speed, and (3) the regression parameter

An extensive measurement campaign generally constitutes an essential part of wind resource assessment and, therefore, of a successful wind energy project. In most cases, these measurements provide around 1 year of wind data at the site of interest

For this purpose, reference data are needed, which should be available for a long-term period of one to two decades

Over the recent past, reanalysis data gained more and more popularity in the wind industry and are now used extensively in wind resource assessment

A statistical procedure relating the reference data to the measured data is performed to derive a correction function. In this context measure–correlate–predict (MCP) approaches have evolved to become a standard tool for wind farm developers

Numerous MCP methods are used in modern wind resource assessment applications. They range from simple linear models

In order to enable a precise determination of the relationship between measurement and reference data, a sufficient amount of measurement data is necessary; that is, the concurrent period needs to be long enough. Various studies have addressed the question of how long the measurement period should be. In general, it is recommended to be at least 1 year

From an economic perspective, though, there is a strong desire to reduce the duration of the measurement in order to save time and money

However, seasonal effects occur when the measurement does not cover all seasons

Several studies have investigated the accuracy of a long-term correction (LTC) of short-term wind measurements as a function of the measurement duration

Table

Details of the measurement sites. The duration of the individual measurements is exactly 1 year. The measurements were carried out between May 2013 and April 2019.

The following six different reanalysis data sets serve as reference data in the MCP calculations.

It should be noted that both the anemosM2 and anemosE5 models generally provide a temporal resolution of 10 min. In order to guarantee comparability of the results, these were averaged to 1 h, ensuring the same temporal resolution for all reanalysis data sets.
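The averaging step can be sketched as follows. This is a minimal illustration, assuming a gap-free 10 min series that starts at a full hour; the function name `to_hourly` and the sample values are hypothetical and not taken from the study.

```python
import numpy as np

def to_hourly(ten_min_values):
    """Average a gap-free 10 min series to hourly means (6 samples per hour)."""
    values = np.asarray(ten_min_values, dtype=float)
    n_hours = values.size // 6              # an incomplete trailing hour is dropped
    return values[: n_hours * 6].reshape(n_hours, 6).mean(axis=1)

# Hypothetical wind speeds in m/s for two hours
hourly = to_hourly([5.0, 5.2, 5.4, 5.6, 5.8, 6.0,
                    7.0, 7.0, 7.0, 7.0, 7.0, 7.0])
print(hourly)
```

Real data usually contain gaps, so a timestamp-based resampling would be more robust than the reshape used here.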

In general, reanalysis data are modeled for different locations on a geographical grid. In this study, data were selected from the grid point closest to the respective site. For data sets 3–6, data at more than one height level were provided. In these cases, the data at the height closest to the measurement were used (i.e., 100 and 150 m for EMD-ConWx and EMD-WRF Europe

This study compares wind speed statistics as observed over different periods in the investigated data – namely short-term data and long-term data. For this purpose, the convention is applied that capital letters are used for long-term variables (e.g., the long-term corrected wind speed), while parameters in lowercase letters represent data from the short-term period. The subscript labels “meas”, “rea”, and “corr” refer to measurement, reanalysis, and corrected data, respectively.

In this study, short-term periods with a duration of 90 consecutive days are investigated. These periods are selected with a sliding window algorithm with an increment of 3 d; i.e., the first 90 d period starts on 1 January, the second on 4 January, etc. When the sliding window reaches the end of the original measurement campaign, the data from the beginning of the data set are appended. This ensures that all seasons are considered equally. In this way, 122 measurement periods of 90 d each were investigated for every site. The procedure is applied equally to measurement and reanalysis data, guaranteeing that the respective time series values match consistently.
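The window selection described above can be sketched as follows. The sketch assumes a 365 d year indexed by day; the function name `sliding_window_indices` is hypothetical. The wrap-around at the end of the campaign is implemented with a modulo operation.

```python
import numpy as np

def sliding_window_indices(n_days=365, window=90, step=3):
    """Return one array of day indices per window, wrapping around the year end."""
    starts = range(0, n_days, step)
    return [np.arange(start, start + window) % n_days for start in starts]

windows = sliding_window_indices()
print(len(windows))        # 122
print(windows[-1][:3])     # last window starts on day 363 and wraps to day 0
```

The same index arrays would be applied to the measured and the reanalysis series so that the time stamps stay aligned.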

Illustration of the general procedure used in this study regarding the MCP predictions. After extracting the short-term data of measured (

In a first step, the data in each of the 90 d periods are investigated with respect to, e.g., mean and variance of wind speed (Sect.

Secondly, MCP predictions are performed. Applying the linear MCP methods described below in Sect.

The results, therefore, do not represent the overall errors (or uncertainty) of an LTC in general, which is usually performed over a period of 10 years or more

The procedure as depicted in Fig.

It should be noted that in practical applications, a sectorwise regression is often performed for an LTC of measurement data comprising a whole year. This means that the regression parameters are calculated separately for different wind direction bins, which allows taking the effects of terrain on wind flow into account. This can be important especially in a complex environment

When a correction is performed, the MCP methods may generate a few negative wind speed values. In this study, these values were set to zero.
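The clipping step can be illustrated with a hedged sketch of the two linear methods. It assumes the common textbook forms: a variance ratio slope equal to the ratio of standard deviations, and an ordinary least squares fit with zero-mean Gaussian residuals for the LR method. The exact formulation used in this study may differ, and all function names and synthetic data below are hypothetical.

```python
import numpy as np

def fit_vr(x, y):
    """Variance ratio fit: slope = std(y)/std(x); the line passes through the means."""
    slope = np.std(y) / np.std(x)
    return slope, np.mean(y) - slope * np.mean(x)

def fit_lr(x, y):
    """Least squares fit plus the standard deviation of its residuals."""
    slope, intercept = np.polyfit(x, y, 1)
    resid_std = np.std(y - (slope * x + intercept))
    return slope, intercept, resid_std

def predict(x_long, slope, intercept, resid_std=0.0, seed=0):
    """Apply the linear correction, optionally add Gaussian residuals
    (an assumption), and set negative wind speeds to zero."""
    y = slope * np.asarray(x_long, dtype=float) + intercept
    if resid_std > 0.0:
        y = y + np.random.default_rng(seed).normal(0.0, resid_std, size=y.shape)
    return np.clip(y, 0.0, None)

# Synthetic stand-ins for short-term reference/measured data and long-term reference data
rng = np.random.default_rng(42)
x_short = 7.0 * rng.weibull(2.0, 2000)
y_short = np.clip(0.9 * x_short + 0.5 + rng.normal(0.0, 1.0, 2000), 0.0, None)
x_long = 7.0 * rng.weibull(2.0, 20000)

a_vr, b_vr = fit_vr(x_short, y_short)
a_lr, b_lr, s_lr = fit_lr(x_short, y_short)
y_vr = predict(x_long, a_vr, b_vr)
y_lr = predict(x_long, a_lr, b_lr, resid_std=s_lr)
```

In this formulation the LR slope equals the VR slope multiplied by the correlation coefficient, so it is never larger than the VR slope for positively correlated data.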

As mentioned in the introduction, the correlation coefficient of site and reference data should be evaluated before a long-term correction is performed. It is obvious that the correlation coefficient is lower when considering short-term periods (this will be addressed briefly in Sect.

In this section, a brief overview of the two MCP methods used in this study is given. Both implement a linear model to derive a relation between measurement (

Probably the most widely used linear model is simple linear regression. In this approach the respective regression parameters

A model which addresses this shortcoming and further develops the simple linear regression approach is the linear regression with residuals (LR) method discussed in

In

In

For each MCP calculation according to Sect.

To derive this error score, the theoretical 1-year energy production of a wind turbine is calculated using the power curve of a 3.2 MW wind turbine
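A minimal sketch of such an energy calculation is given below. The power curve values are hypothetical and simplified; they are not the curve of the turbine used in this study. The error score is assumed here to be the relative deviation between the energy from corrected and measured data, which is an assumption about its exact definition.

```python
import numpy as np

# Hypothetical, simplified power curve of a 3.2 MW turbine (illustration only)
CURVE_SPEED = np.array([0.0, 3.0, 5.0, 7.0, 9.0, 11.0, 12.5, 25.0])                # m/s
CURVE_POWER = np.array([0.0, 0.0, 300.0, 900.0, 1900.0, 2900.0, 3200.0, 3200.0])   # kW

def energy_mwh(hourly_speed):
    """Energy in MWh from hourly mean wind speeds; zero output above cut-out."""
    v = np.asarray(hourly_speed, dtype=float)
    power = np.interp(v, CURVE_SPEED, CURVE_POWER)
    power[v > CURVE_SPEED[-1]] = 0.0
    return power.sum() / 1000.0

def relative_energy_error(v_corr, v_meas):
    """Relative deviation of the energy from corrected data to that from measured data."""
    return energy_mwh(v_corr) / energy_mwh(v_meas) - 1.0

print(energy_mwh([12.5] * 10))   # ten hours at rated power
```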

Before the experimental analysis is presented, this section discusses theoretical aspects. It should be noted that these theoretical considerations are, to some extent, also valid for a long-term assessment which is based on an entire year of measurement data (i.e., as most commonly done in wind resource assessment today). In this case, the inter-annual variations of the wind conditions represent the key factor. However, these are usually smaller than the seasonal variations during the year, which are discussed below.

Both mean and variance of the predicted wind speed distribution have an impact on the power production of a wind turbine which is, eventually, the main target value of a wind resource assessment. In this section, the importance of an error in each of the two statistical metrics is investigated.

It is known that the power in wind is proportional to the cube of the wind speed

Applying the (simplified) formula of the Taylor series method for propagation of error

The available 1-year measurement data (see Sect.

Note that simplifications were applied (e.g., neglect of the skewness of the distribution) and that the output of Eq. (

Following these considerations, the sections below address the question of which factors influence the accuracy of the estimation of the mean and the variance when a long-term correction is performed based on one of the two linear MCP approaches.
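The relative importance of errors in mean and variance can be illustrated numerically. The sketch below assumes a skewness-free wind speed distribution, for which the third moment is mean^3 + 3 * mean * variance, and uses first-order sensitivities. The chosen values of mean and variance are illustrative and not taken from the measurement data; the function names are hypothetical.

```python
def third_moment(mean, var):
    """E[v^3] of a skewness-free distribution: mean^3 + 3 * mean * var."""
    return mean**3 + 3.0 * mean * var

def relative_sensitivities(mean, var):
    """First-order factors mapping relative errors in mean and variance
    to the relative error in E[v^3]."""
    m3 = third_moment(mean, var)
    s_mean = (3.0 * mean**2 + 3.0 * var) * mean / m3   # d ln E[v^3] / d ln mean
    s_var = 3.0 * mean * var / m3                      # d ln E[v^3] / d ln var
    return s_mean, s_var

mean = 6.5                  # m/s, illustrative
var = 0.2 * mean**2         # illustrative ratio k = var / mean^2 = 0.2
s_mean, s_var = relative_sensitivities(mean, var)
ratio = s_mean / s_var      # equals (1 + k) / k, i.e. about 6 for k = 0.2
print(s_mean, s_var, ratio)
```

Under these assumptions the ratio depends only on k = var / mean^2, so a site with a broader wind speed distribution would give more weight to the variance error.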

In both cases of the VR and the LR method, the mean value of the corrected wind speed data is given by

Using the definition of

For the (absolute) bias in mean wind speed this results in

From Eq. (

This part of Eq. (

Similarly to term (1) but related to the reanalysis data, this term reflects the differences of wind conditions in the measurement and long-term period given by the reanalysis data.

The regression parameter

Note that Eq. (

Similarly to the considerations on mean wind speed above, in this section a theoretical perspective on the accuracy in variance is given.
For the variance of the corrected data

In the case of the LR method, the respective formula reads (cf. Eqs.

(1) the accuracy of the reanalysis data in reproducing the annual variability of variance (similarly as discussed for the VR method);

(2) the correlation coefficient; and

(3) the residuals determined in the measurement period or, more specifically, the representativeness of their measured standard deviation

It should be noted that, from a mathematical point of view, factors (2) and (3) are strongly connected (e.g., a lower correlation coefficient implies a higher scatter around the linear fit and, hence, a higher variance of the residuals). Therefore, in the experimental section, the analysis is focused on factors (1) and (2).
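The connection between correlation coefficient and residual variance can be checked numerically. The sketch below assumes an ordinary least squares fit with intercept, for which the residual variance equals (1 - r^2) times the variance of the dependent variable. The data are synthetic and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = 7.0 * rng.weibull(2.0, 5000)                 # synthetic reference wind speed
y = 0.9 * x + rng.normal(0.0, 1.2, x.size)       # synthetic measured wind speed

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
r = np.corrcoef(x, y)[0, 1]

var_resid = residuals.var()
var_from_r = (1.0 - r**2) * y.var()              # identity for least squares fits
var_explained = slope**2 * x.var()               # variance carried by the linear term

print(var_resid, var_from_r)
print(var_explained + var_resid, y.var())
```

For such a fit, the variance of the fitted values plus the residual variance reproduces the variance of the short-term measured data.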

In the theoretical analysis, different factors were identified which have an impact on the accuracy in mean and variance when an LTC is performed. In the following sections, these are investigated experimentally. Afterwards, MCP calculations are presented. Systematic biases are described and discussed. In a last section, the variation of the results between the different sites is explicitly considered.

In Fig.

Average annual course of (normalized) wind speed in reanalysis and measurement data. Normalization was done by dividing the mean wind speeds observed in the 90 d periods by the respective annual mean. The individual results obtained at the 18 sites were then averaged arithmetically.

As the diagram shows, the annual course of wind conditions is marked by significantly lower mean wind speeds in summer and stronger winds in winter periods. This pattern typically prevails in Central Europe

In order to further analyze this aspect, a parameter

Figure

Deviation between reanalysis and measurement data in (normalized) mean wind speed (period of 90 d, arithmetically averaged over all sites).

Motivated by their relevance in Eq. (

Temporal variations of regression parameter

Comparing the respective definitions of

Normalized linear correlation coefficient between measurement and reanalysis data (periods of 90 d, arithmetically averaged over all sites). In the context of normalization the curves were shifted to a mean of 1 to better identify the (relative) temporal variations during the year.

According to Eq. (

In order to further investigate the capability of the reanalysis data in reproducing the seasonal course of variance, a measure

Deviation from reanalysis to measurement data in (normalized) variance (period of 90 d, arithmetically averaged over all sites).

The differences in variance reach values of up to

MCP calculations based on 90 d of measurement are now presented. For each reanalysis data set, an average value of the individual error scores related to one measurement period is calculated by arithmetically averaging over all sites. First, the focus of the analysis is put on mean and variance of wind speed. Afterwards, seasonal biases in the (theoretical) energy production of a wind turbine are analyzed. In this context, the influence of the systematic biases in both mean and variance on the accuracy in energy production is investigated on an experimental level. The analysis in these sections is focused on systematic biases first. The variability of the results (standard deviation) is presented and discussed in a dedicated section afterwards (Sect.

Figure

Temporal variations during the year of the bias in mean wind speed using the

Strong differences compared to these observations and even contrary behavior can be found when the LR method is used (Fig.

In line with the theoretical considerations in Sect.

As was shown in Fig.

One further example is analyzed briefly here. The largest overestimation of the annual course of wind speed was found for the EMD-WRF Europe

In summary, it can be stated that the capability of the reanalysis data in reproducing the seasonal course of the true wind conditions on-site is an important aspect when considering the bias in mean wind speed. However, positive (or negative) deviations in the seasonal course do not translate directly into negative (or positive) biases. The regression parameter

Note that the influence of the seasonality in

In a study of

With regard to an LTC of short-term wind measurements, the results of this work only partly agree with these findings from literature. It was shown both theoretically and experimentally that, concerning systematic, seasonal biases, a strong dependence on the selected MCP method occurs.
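The method dependence can be illustrated with a small numerical sketch. It assumes that the fitted line passes through the short-term means of measurement and reference data, which holds for least squares and for the common variance ratio form, so that the corrected long-term mean is the short-term measured mean plus the slope times the difference between long-term and short-term reference mean. All numbers and the function name `mean_bias` are hypothetical. Following the convention of this paper, capital letters denote long-term values.

```python
def mean_bias(u_meas, U_meas, u_rea, U_rea, slope):
    """Bias of the corrected long-term mean for a linear fit through the short-term means."""
    U_corr = u_meas + slope * (U_rea - u_rea)
    return U_corr - U_meas

# Hypothetical summer period: measured mean 10 % below the annual mean,
# reanalysis mean 12 % below (overpronounced annual course), values in m/s.
summer = dict(u_meas=5.40, U_meas=6.00, u_rea=5.28, U_rea=6.00)

bias_vr = mean_bias(slope=1.0, **summer)   # VR-like slope (ratio of standard deviations)
bias_lr = mean_bias(slope=0.8, **summer)   # LR-like slope (reduced by the correlation)
print(bias_vr, bias_lr)
```

With these illustrative numbers the larger slope yields an overestimation and the smaller slope an underestimation, which mirrors the opposite signs of the seasonal biases reported for the two methods.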

In a study of

In contrast to existing publications, therefore, this study delivers in-depth explanations of the seasonal biases and the differences when applying either the VR or the LR method for an LTC. The considerations in Sect.

In this section, the bias of the MCP predictions with respect to variance is presented and discussed. Figure

Temporal variations during the year of the bias in variance Err

The curves displayed in Fig.

As shown in Fig.

In summary, the amplitudes in Fig.

In Sect.

Err

Temporal variations during the year of the bias in the theoretical annual energy production of a wind turbine Err

Besides that, one specific characteristic of Err

To explain this observation, the regression parameters

Scatter plot of normalized measured and MERRA-2 data and regression lines to these data using either the VR or the LR method. Normalization was performed by dividing all wind speed values by the overall measured mean. The diagram was produced using the entire measurement data of the 18 sites (at the heights specified in Table

This aspect can be expected to average out when considering mean wind speeds. However, it apparently becomes important in the case of energy production estimation where the shape of the power curve leads to a different importance (or weighting) of wind speed values of different ranges. Any wind-speed-dependent errors of the reanalysis data can further contribute to this issue.

The variations between the sites can be regarded as an important measure of the reliability of the results. Furthermore, they give an indication of the uncertainty that would remain if the systematic, seasonal biases could be removed (e.g., by applying a correction function). Therefore, the standard deviations of Err

Bias variations between the sites (1 SD – standard deviation) with regard to the accuracy of predicting the mean wind speed

Similar to the biases, the variations (standard deviations) are significantly higher for Err

On average, the smallest values can be observed at the beginning of the year and in fall (i.e., measurement periods starting in January/February or September/October) for both Err

This study delivered an in-depth analysis of seasonal effects in the long-term correction of short-term wind measurements. The findings can contribute to the further development of reanalysis data and to improved MCP methods in this respect.

In a first step, the importance of the accuracy in mean and variance of wind speed was evaluated with regard to a precise estimate of the energy in the wind. It was shown on a theoretical level that the relative error in mean contributes 6 times as much as the relative error in variance in this context. Experimental analysis, in contrast, showed that much larger biases in variance than in mean prevail when MCP predictions are performed (absolute values of more than 15 % were obtained in comparison to values of

A formula was derived which delivered the explanation for the seasonal biases in mean wind speed when applying either the variance ratio or linear regression with residuals method. It was shown that the representativeness of the measurement period, i.e., the similarity of the wind conditions in correlation and correction period, is important. Moreover, the capability of the reference (here reanalysis) data to reproduce the seasonal course proved to be a decisive factor. Lastly, the regression parameter

The largest biases were obtained in the case of measurement periods with non-representative wind conditions (i.e., significantly lower or higher mean wind speeds compared to the annual mean – usually summer and winter periods in Central Europe). The magnitude was shown to depend on the reanalysis data set. Furthermore, a strong dependence on the MCP method was identified; very different, partly even contrary characteristics in the seasonal biases were found for the VR and LR methods.

In general, measurement periods in transitional seasons (spring, fall) not only resulted in the smallest biases but also gave the smallest variation between the sites and, thus, the highest reliability of the results. The amplitudes of seasonal bias and standard deviation of the results obtained at the individual sites were roughly of the same magnitude. If short-term wind measurements are used for wind resource assessments, it is, therefore, highly recommended to conduct these measurements in periods which are likely to be characterized by representative wind conditions (with respect to mean wind speed).

Further research is necessary on how the systematic biases and, finally, the uncertainty of the long-term correction of short-term wind measurements can be reduced in an efficient and expedient way. The authors suggest that this could be approached in different ways. On the one hand, a manual correction based on the findings described above would reduce the biases. However, the reliability (standard deviation) would not change. A statistics-based approach (e.g., averaging the results of different MCP approaches and/or reference data) as well as machine learning approaches (e.g., learning the seasonal effects from other data sets) might result in larger improvements.
On the other hand, the shortcomings of the reference (here reanalysis) data in reproducing the seasonal course could be addressed. Discrepancies regarding temporal changes in synoptic weather patterns or atmospheric stability processes can be named as possible examples for such weaknesses. The inclusion of further meteorological data reflecting these characteristics could form the basis of a physically motivated approach here. The usefulness of removing seasonal biases in, e.g., wind profile extrapolation by including additional parameters like relative humidity was demonstrated in

The software MATLAB was used to generate the results and figures. The codes can be requested from the corresponding author.

The ERA5 and MERRA-2 reanalysis data are freely available on the websites given in the reference list. All other data (including the measurement data) are confidential and/or commercial data and cannot be provided.

AB had the lead in writing the manuscript and developing the theoretical analysis and methodology for this study. AB also performed all data analysis and visualization. LP contributed to the conceptualization, development of the methodology, and to writing the manuscript. LP, DC, and AG had a supervisory role during the development of the methodology, data analysis, and the writing process. DC was also responsible for the funding acquisition and the project administration. AG performed valuable preliminary work. All authors revised and edited the manuscript.

The authors declare that they have no conflict of interest.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors would like to express their gratitude to GWU Umwelttechnik GmbH, Notus Energy, NES GmbH, the Meteorological Institute of the University of Hamburg, and the Karlsruhe Institute of Technology for providing measurement data. Furthermore, the authors thank EMD Deutschland GbR and anemos GmbH for providing mesoscale reanalysis data.

This research was funded by the Federal Ministry of Economic Affairs and Energy (Bundesministerium für Wirtschaft und Energie, BMWi) on the basis of a decision by the German Bundestag (grant no. 0324159E).

This paper was edited by Sara C. Pryor and reviewed by two anonymous referees.