the Creative Commons Attribution 4.0 License.
Determining the ideal length of wind speed series for wind speed distribution and resource assessment
Abstract. Accurate wind resource assessment depends on wind speed data that capture local wind conditions, crucial for energy estimates and site selection. The International Electrotechnical Commission (IEC) recommends at least one year of data collection, yet this duration may not fully account for interannual variability. While studies often maximize data length, guidance on the minimum duration required for reliable wind speed and power estimates remains limited. To address this gap, we propose a method to quantify the errors introduced by using wind speed series of different lengths for wind speed distribution fitting, relative to long-term data. This allows us to determine the minimum number of hourly observations needed for a given accuracy level. We apply our method to in-situ weather station observations and ERA5 reanalysis data at 10-meter and 100-meter heights. Our results show that key parameters, including mean, standard deviation, and Weibull parameters, stabilize with relatively short records (~1 month of hourly data), whereas skewness requires at least 1.6 years, and kurtosis requires 88.6 years to stabilize. ERA5 data stabilize with fewer observations but differ from in-situ measurements, requiring careful use. Moreover, combining available hourly data for distribution fitting produces parameters comparable to those obtained when controlling for diurnal and seasonal effects, suggesting discontinuous data can be viable under certain conditions. These findings offer a practical framework for optimizing data collection in wind resource assessments, balancing accuracy and cost-effectiveness.
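As a rough illustration of the idea of measuring error against a long-term reference, the sketch below draws random subsamples of different sizes from a synthetic hourly series and reports the 5th to 95th percentile band of the relative error of the mean. It is not the authors' implementation: the Weibull scale and shape (8.0 and 2.0), the five-year record length, the number of trials, and the function name `relative_error_band` are all assumptions chosen for illustration.

```python
import random
import statistics


def relative_error_band(series, n, trials=200, seed=0):
    """5th and 95th percentile of the relative error of the mean
    when n hours are drawn at random (without replacement) from series."""
    rng = random.Random(seed)
    long_term_mean = statistics.fmean(series)
    errors = sorted(
        (statistics.fmean(rng.sample(series, n)) - long_term_mean) / long_term_mean
        for _ in range(trials)
    )
    low = errors[int(0.05 * (trials - 1))]
    high = errors[int(0.95 * (trials - 1))]
    return low, high


# Synthetic "long-term" record: five years of hourly speeds (assumed Weibull).
rng = random.Random(42)
long_term = [rng.weibullvariate(8.0, 2.0) for _ in range(5 * 8760)]

# Sample sizes of roughly three days, one month, and one year of hourly data.
bands = {n: relative_error_band(long_term, n) for n in (72, 720, 8760)}
for n, (low, high) in bands.items():
    print(f"n={n:5d}  90% band of relative error: [{low:+.3f}, {high:+.3f}]")
```

The band narrows as the number of sampled hours grows; a minimum record length for a chosen accuracy level could then be read off as the smallest size whose band fits inside the tolerance.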
Status: final response (author comments only)
RC1: 'Comment on wes-2025-25', Anonymous Referee #1, 20 Apr 2025
I am not completely sure of your distinction between good and fair in the grading. I have indicated the level "fair", meaning that the paper meets the expected requirements for a scientific paper.
My comments regarding the revisions:
The authors have quantified the levels of accuracy of using wind speed series of different lengths regarding estimates of various statistical parameters related to wind energy. The results are however - as they also state - limited to Southern Norway using coastal weather stations only. The authors invite more studies for other areas.
Although not a request, it would have added extra value if it somehow had been included in this paper.
Given that the paper is focusing on wind energy it is a bit surprising that the Wind Atlas approach is not discussed in the paper. Not even mentioned. I would like to see a discussion of wind atlas and the wind energy intensity estimates.
They compare their analysis of observed wind speed with an equivalent analysis of reanalysis data using ECMWF data, ERA5.
They don’t discuss the topography issue. This is important, as one must expect the height of the surface in the nearest gridpoint to be crucial to the interpretation of reanalyzed wind fields very near the surface or close to the surface.
One of the objectives in the paper is addressing the possibility that one can use randomly (in the time series) selected data to obtain the necessary distribution parameters. They state that this method is working for both near the surface and at elevated levels. And they then use this method for their following analyses.
Page 8 and page 10 contain very detailed graphical illustrations of this point. They are however not easily grasped: 5 stations and 7 parameters giving 35 small plots. Even in an A4 print it is not a simple exercise to see every point discussed. Especially in Figure 3 the blue colors are not easily identified.
Page 10, Figure 3, contains an extra column with a1) - a6) superimposed on the middle column of small plots. It should be removed.
The authors distinguish between statistical parameters that are quickly obtained, like the mean, standard deviation and Weibull parameters, and other parameters like skewness and kurtosis that require much longer records, 1.6 years and 88 years of data respectively.
I cannot easily see where these numbers are coming from.
A statement about an 88-year time scale based on a time series no longer than 16 years is a bit strange. How is this calculation done? It must be based on some assumptions, but which ones? And what about non-stationary features of the time series, such as the effects of climate change?
Citation: https://doi.org/10.5194/wes-2025-25-RC1
RC2: 'Comment on wes-2025-25', Anonymous Referee #2, 08 May 2025
In my view, the paper should be subject to major revision.
The study sets out an objective of very high practical importance for wind resource analysis: investigate whether short-term wind speed data from WMO weather stations realistically represent the wind speed statistics. Observations frequently cover a limited time span, which may introduce substantial errors in the estimation of several important parameters of the wind distribution, like its time variability. Therefore, guidance for selecting adequate datasets of adequate time spans is highly relevant.
In my view, the applied method is not adequate for this aim. From the introduction and the description of the objectives of the study, it seems that datasets with different lengths (720 hours to 6 years) would correspond to series of consecutive hours, so that a 720-hour dataset would correspond to 1 month of consecutive hourly data. In this way, actual observations of different lengths would be replicated.
But from the description of the method (random sampling) it seems that the authors randomly select 720 separate hours within the full 16-year time series. Random sampling over a long time span of 16 years has much less practical relevance, due to the fact that actual station observations will be based on continuous (or nearly continuous) datasets, not on random measurements with random and large gaps between them. The small data gaps typically found in useful observations do not change this, as the full time span of the observations will still be limited.
For example, a continuous (or nearly continuous) dataset of a few months will not adequately cover the seasonal variability of the wind, but if the data are randomly selected over a long period the seasonal variability may be well represented, as indicated by the results of the study. From a practical point of view, a dataset of 720 hourly values randomly selected from a multiyear full time series is not equivalent to a continuous (or nearly continuous) 720-hour dataset covering one month in total.
The result of the paper indicating that mean, standard deviation, and Weibull parameters, stabilize with relatively short records (~1 month of randomly selected hourly data) may not hold for 1-month continuous data.
The study builds on the work of Barthelmie and Pryor (2003), but that paper has a fundamental difference, as their sampling is not random. They use conditional samples to replicate data that could be obtained from remote sensing tools, which gives that paper a high practical relevance.
If there is some misunderstanding of the method on my side, please clarify the description of the method. If I have correctly understood the method, I would suggest two possibilities:
- Change the objectives of the study to adapt them to the method. The practical relevance of the study would be substantially smaller, but it is in any case an interesting investigation about the characteristics of wind distributions. The comparison of random sampling to diurnal-cycle-retained and seasonality-retained data is really interesting in itself. Several aspects of the study offer new insights on wind distributions, like the different error margins analysed, and the comparative analysis of ERA data.
- Change the method to adapt it to the objectives. This could be done by selecting for example many different continuous datasets of the same length (e.g. 720 hours) within the full 16-year time series. The random sampling calculations already done could be retained for comparison purposes. A possibility to take advantage of the work already done, which is technically well performed, would be to use the results from the random sampling calculations to select the lengths of the continuous datasets to analyse. That is, not all lengths until 6 years would be analysed with continuous datasets, only lengths from 720 hours to a threshold determined from the errors obtained in the random sampling calculations.
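The difference between the two sampling schemes can be sketched on synthetic data. The example below is only an illustration under stated assumptions, not the authors' method or data: it builds a hypothetical four-year hourly series with an assumed 30 % seasonal modulation of Weibull-distributed speeds, then compares the spread of the relative error of the mean for 720 randomly selected hours against 720 consecutive hours. The function names and all numerical values are invented for the sketch.

```python
import math
import random
import statistics

HOURS_PER_YEAR = 8760


def synthetic_series(years=4, seed=1):
    """Hourly speeds: Weibull noise scaled by an assumed annual cycle."""
    rng = random.Random(seed)
    return [
        (1.0 + 0.3 * math.cos(2 * math.pi * h / HOURS_PER_YEAR))
        * rng.weibullvariate(8.0, 2.0)
        for h in range(years * HOURS_PER_YEAR)
    ]


def error_spread(series, n, contiguous, trials=200, seed=2):
    """Standard deviation of the relative error of the sample mean."""
    rng = random.Random(seed)
    reference = statistics.fmean(series)
    errors = []
    for _ in range(trials):
        if contiguous:
            # One continuous window of n hours with a random start.
            start = rng.randrange(len(series) - n + 1)
            sample = series[start:start + n]
        else:
            # n separate hours scattered over the whole record.
            sample = rng.sample(series, n)
        errors.append((statistics.fmean(sample) - reference) / reference)
    return statistics.pstdev(errors)


series = synthetic_series()
spread_random = error_spread(series, 720, contiguous=False)
spread_block = error_spread(series, 720, contiguous=True)
print(f"random 720 hours:      {spread_random:.3f}")
print(f"consecutive 720 hours: {spread_block:.3f}")
```

In this synthetic setting the consecutive windows show a much larger spread, because a single month sees only one phase of the seasonal cycle while scattered hours sample all of it. How large the gap is for real station data would depend on the actual seasonal and interannual variability.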
A minor point that could be improved is the presentation of the results in the figures. There are too many frames in every figure, which makes it difficult to see the most relevant results. A couple of meteorological stations that are representative of the whole set could perhaps be selected. The full graphs could be moved to the supplementary information section. Also, the 3 curves limiting the 90% confidence intervals are difficult to distinguish, as they largely overlap.
Citation: https://doi.org/10.5194/wes-2025-25-RC2