Methods for high-accuracy wind resource assessment to support distributed wind turbine siting

Menear, Kevin; Shaik, Sameer; Sheridan, Lindsay; Duplyakin, Dmitry; Phillips, Caleb

doi:10.5194/wes-2026-75

Preprints

https://doi.org/10.5194/wes-2026-75

20 May 2026

Status: this preprint is currently under review for the journal WES.

Abstract. Public wind resource datasets are central to wind energy planning, particularly for distributed wind installations where it may be infeasible to collect on-site measurements or run bespoke simulations. Yet despite their broad use, the site-level accuracy at hub height remains only partially quantified. This work addresses this gap in two steps: (i) we develop a unified, observation-based benchmark to evaluate the performance of the most common models used in industry, and (ii) we propose a new machine-learned ensemble approach that leverages multiple models to synthesize improved estimates that address the shortcomings of individual models. For each dataset and observation series we form long-term empirical wind speed quantiles. This quantile representation allows us to compare products with different periods of record without requiring temporal overlap and evaluate both wind speed distribution errors and site-level mean biases. Results show that the ensemble method reduces quantile-dependent mean bias to near zero across the distribution and lowers mean absolute bias in long-term mean wind speed by roughly one-third relative to the best-performing individual dataset. Finally, we use the trained model to produce a national, gridded set of wind speed quantiles for the publicly accessible WindWatts platform. Together, the benchmark, ensemble model, and deployment dataset demonstrate that machine learning can meaningfully correct and combine existing public datasets, providing more reliable, distributional wind resource information for early-stage assessment and planning.
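
As a rough illustration of the quantile comparison described in the abstract, the following Python sketch forms long-term empirical quantiles for two series of different length and takes their difference. The series are synthetic Weibull draws, and the function name `empirical_quantiles`, the probability grid, and the scale parameters are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def empirical_quantiles(speeds, probs):
    """Long-term empirical wind speed quantiles, ignoring missing values."""
    speeds = np.asarray(speeds, dtype=float)
    return np.quantile(speeds[~np.isnan(speeds)], probs)

rng = np.random.default_rng(0)
probs = np.linspace(0.05, 0.95, 19)  # hypothetical probability grid

# Synthetic stand-ins with different record lengths and no shared timestamps.
obs = 6.0 * rng.weibull(2.0, size=20_000)     # "observations"
model = 6.5 * rng.weibull(2.0, size=80_000)   # "model product", biased high

q_obs = empirical_quantiles(obs, probs)
q_model = empirical_quantiles(model, probs)

quantile_bias = q_model - q_obs               # distribution error per quantile
mean_bias = float(model.mean() - obs.mean())  # site-level mean bias
```

Because both series are reduced to the same probability grid before differencing, no timestamp alignment is needed; the cost is that any real difference between the two periods of record is folded into the bias.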

Received: 23 Apr 2026 – Discussion started: 20 May 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on wes-2026-75', Anonymous Referee #1, 02 Jun 2026
The paper addresses an important problem: whether machine learning can improve public wind-resource products for wind assessment by combining multiple model datasets with observational data. The observational archive is potentially valuable, and the quantile-based framework is interesting. However, I do not think the current validation is sufficient to support the main claims, especially regarding hub-height performance and practical usefulness for wind-energy siting. My main concerns are listed below.
A central methodological issue is the relationship between 10 m wind and hub-height wind-resource assessment. Observed 10 m winds are strongly influenced by local exposure, roughness, and obstacles. In models such as WRF, 10 m wind is a diagnostic quantity and may behave differently from winds on other vertical levels. Therefore, good performance at 10 m does not necessarily imply good performance at hub height. Because this paper combines ASOS 10 m observations with GS measurements, it needs to demonstrate explicitly that the method improves hub-height estimates, not merely near-surface winds.

From a different angle, I would like to reflect on the relationship between an observation, which is a point measurement, and model output, which represents the average wind speed over a grid cell. This distinction matters less in flat terrain, where the wind speed is spatially similar, but it is very important in complex terrain and coastal regions. In addition, measurements are not randomly distributed within a grid cell: people carrying out measurements put met masts on top of hills, not in valleys.
Major comments:
My main concern is that the final verification is performed on the full GS dataset without sufficient stratification by measurement height. From Fig. 3, a large fraction of the GS measurements appears to be below typical wind-energy hub heights, and in my estimation 85% are below 50 m. Therefore, the aggregate performance metrics may be dominated by lower-height measurements. This is problematic because the main application is wind-energy assessment at hub height. For example, Fig. 17 indicates that WEM improves over GWA in approximately 65 % of cases, but without height-stratified metrics it is impossible to know whether this improvement occurs at hub-relevant heights or primarily at lower levels. The authors should report WEM versus GWA performance separately for height bins, for example <20 m, 20–50 m, 50–80 m, and >80 m.
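
The height-stratified comparison requested here could be computed along the following lines. This is a minimal sketch: the bin edges follow the bins suggested above, while the function name `stratified_metrics`, the dictionary layout, and the example numbers are hypothetical, and `err_a` / `err_b` stand for per-site errors (for instance WEM and GWA minus observation).

```python
import numpy as np

HEIGHT_EDGES_M = [20.0, 50.0, 80.0]
BIN_LABELS = ["<20 m", "20-50 m", "50-80 m", ">80 m"]

def stratified_metrics(heights_m, err_a, err_b):
    """Per-height-bin MAE of two products and share of sites where A beats B."""
    heights_m = np.asarray(heights_m, dtype=float)
    abs_a = np.abs(np.asarray(err_a, dtype=float))
    abs_b = np.abs(np.asarray(err_b, dtype=float))
    # digitize returns 0 for h < 20, 1 for 20 <= h < 50, 2 for 50 <= h < 80, 3 otherwise
    bin_idx = np.digitize(heights_m, HEIGHT_EDGES_M)
    out = {}
    for i, label in enumerate(BIN_LABELS):
        mask = bin_idx == i
        n = int(mask.sum())
        if n == 0:
            out[label] = {"n": 0, "mae_a": None, "mae_b": None, "a_better_frac": None}
            continue
        out[label] = {
            "n": n,
            "mae_a": float(abs_a[mask].mean()),
            "mae_b": float(abs_b[mask].mean()),
            "a_better_frac": float((abs_a[mask] < abs_b[mask]).mean()),
        }
    return out

# Made-up example values, for illustration only.
example = stratified_metrics(
    heights_m=[10, 15, 30, 60, 100],
    err_a=[0.1, -0.2, 0.5, 1.0, -0.3],
    err_b=[0.4, 0.1, 0.6, 0.5, 0.3],
)
```

Reporting the sample count `n` alongside each bin makes it visible when an aggregate improvement rests mostly on the lowest bins.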

To reiterate, the question of how 10 m winds can be used to improve wind assessment at hub height is quite a complicated one. However, because the GS dataset bins together many measurements close to 10 m (one third of the measurements are below 20 m), the methodology completely sidesteps this question. The authors argue that GS and ASOS data should be balanced (Section 4.2.3). Why shouldn’t the GS dataset be similarly balanced between low and high measurement heights? It is quite easy to improve the 10 m wind speeds in wind energy datasets because they are of limited direct relevance for wind-energy applications and are provided mostly for reference.

Another important issue present throughout the paper is the idea that combining different model datasets is useful because the datasets contain independent information (“uncorrelated error structure”). It is not obvious to me that using different equations should lead to uncorrelated error structures. In my experience, mesoscale models inherit biases from the global reanalysis. Microscale models explicitly use a model chain: GWA is based on ERA5, which is then downscaled using WRF. If the authors want to stress model independence, they should show it explicitly, using the data available. The results already shown in the paper suggest quite the opposite: Figure 3 shows that adding more models decreases the performance metrics.
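
One way to test the independence claim directly is to correlate the per-site errors of the individual datasets. The sketch below uses synthetic data in which two hypothetical models share a common bias (as if inherited from the same parent reanalysis) and a third does not; the function name `error_correlation` and all numbers are assumptions for illustration.

```python
import numpy as np

def error_correlation(model_values, observed):
    """Pearson correlation between the per-site errors of each model column."""
    model_values = np.asarray(model_values, dtype=float)  # shape (n_sites, n_models)
    observed = np.asarray(observed, dtype=float)          # shape (n_sites,)
    errors = model_values - observed[:, None]
    return np.corrcoef(errors, rowvar=False)

rng = np.random.default_rng(1)
n_sites = 5000
observed = rng.uniform(4.0, 9.0, n_sites)
shared_bias = rng.normal(0.0, 1.0, n_sites)  # error common to models A and B

models = np.column_stack([
    observed + shared_bias + rng.normal(0.0, 0.5, n_sites),  # model A
    observed + shared_bias + rng.normal(0.0, 0.5, n_sites),  # model B, same parent
    observed + rng.normal(0.0, 1.0, n_sites),                # model C, independent
])

corr = error_correlation(models, observed)
```

Off-diagonal entries near zero would support the "uncorrelated error structure" argument; entries near one would indicate that the datasets mostly repeat the same error.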

Similarly, I think that the reasoning for omitting interannual variability is not well established (Section 3.5). First of all, the time series lengths in the GS dataset are not reported, even in summarized form. In my experience, tall-mast campaigns tend to be quite short, and I am not sure I would trust an interannual variability estimate from two or three years of data, given the open questions about how representative the data are and possible autocorrelation. I find Figure 4 unconvincing. Only the first 50 stations are of interest, because the measurements there are above 50 m. More importantly, the spatial distribution is largely irrelevant here, because nobody is arguing that the whole of CONUS has the same wind speed. To show improvement, interannual variability should be compared to the difference between models and observations or a similar metric.

L204: “The empirical quantile representation also provides a consistent way to recover long-term mean wind speed for each site and height.” For a complete empirical sample, the mean computed from the empirical distribution should be equivalent, up to numerical approximation, to the arithmetic mean of the time series. If there are significant differences, I would suspect a programming error. The word “smooth” is also unclear in this context, since the result is a single scalar mean for one site and height. I am not sure that spatial smoothness is a desirable result for this application.
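
The equivalence noted here can be checked numerically. The sketch below assumes the mean is recovered by averaging quantiles on an evenly spaced midpoint probability grid, which is one plausible reading and not necessarily the manuscript's procedure; `mean_from_quantiles` and the synthetic Weibull sample are hypothetical.

```python
import numpy as np

def mean_from_quantiles(speeds, n_quantiles):
    """Approximate the mean as the average of quantiles on a midpoint grid."""
    probs = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return float(np.mean(np.quantile(speeds, probs)))

rng = np.random.default_rng(2)
speeds = 6.0 * rng.weibull(2.0, size=10_000)  # synthetic time series

arithmetic_mean = float(speeds.mean())
coarse = mean_from_quantiles(speeds, 10)      # few quantiles: visible discretization error
fine = mean_from_quantiles(speeds, 1000)      # many quantiles: close to the arithmetic mean
```

Under this assumption, any remaining gap between the two means comes from the coarseness of the probability grid, mainly in the upper tail, so it should shrink as the number of quantiles grows.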

I have raised a number of theoretical objections to the methodology used. However, “all models are wrong, but some of them are useful”. Is the WEM dataset useful? I would argue that this is not established in the paper because of the GS binning over heights (Point (1)). In addition, in Figure 18 WEM looks indistinguishable from GWA. If I had to construct the worst possible interpretation based on Points (1) and (2), I would say that WEM is basically the same as GWA at upper levels, with all improvement over GWA (Figure 17) concentrated at lower levels. The authors could persuade me otherwise by changing Figure 17 to height-stratified metrics and showing how the WEM dataset differs from GWA at upper levels. On top of that, WEM is provided on the ERA5 grid (30 km), which is a significant “upscaling”, i.e., a lowering of resolution relative to the 250 m of GWA. I am not sure that the increase in WEM quality (which remains to be shown for higher levels) offsets the representativeness error in complex terrain introduced by this drastic loss of resolution.

Minor comments:
Regarding the “de-quantization step” (L152): I understand the need to smooth the wind speed distribution; however, the uniform distribution seems to me a poor approximation. I would expect wind speeds to follow a Weibull distribution. For distributions with small median wind speeds, the density is quite steep at low wind speeds. I hypothesize that using a uniform distribution could result in unphysical or distorted distributions.
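
The hypothesis above can be probed with a small synthetic experiment. The sketch assumes that de-quantization means adding uniform jitter within the reporting interval and clipping at zero; that reading, the function name `dequantize_uniform`, the 1.0 reporting resolution, and the Weibull parameters are all assumptions for illustration and may differ from what the manuscript does.

```python
import numpy as np

def dequantize_uniform(quantized, resolution, rng):
    """Spread each reported value uniformly over its reporting interval (assumed scheme)."""
    quantized = np.asarray(quantized, dtype=float)
    jitter = rng.uniform(-0.5, 0.5, size=quantized.shape) * resolution
    return np.clip(quantized + jitter, 0.0, None)  # wind speed cannot be negative

rng = np.random.default_rng(3)
resolution = 1.0                                    # hypothetical reporting step
true_speeds = 2.0 * rng.weibull(2.0, size=50_000)   # synthetic low-wind site
reported = np.round(true_speeds / resolution) * resolution
smoothed = dequantize_uniform(reported, resolution, rng)

# Share of samples in the lowest range, before quantization and after de-quantization.
low_true = float(np.mean(true_speeds < 0.25))
low_smoothed = float(np.mean(smoothed < 0.25))
```

In this synthetic case the de-quantized series places more mass near zero than the underlying Weibull sample does, which is the kind of low-speed distortion the comment anticipates; whether it matters in practice depends on the actual reporting resolution and procedure.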

The authors find that slope and aspect are useless as ML input parameters (Figure 11 (b)). I would argue that this represents a missed opportunity from a methodological point of view. An improvement is more likely to be found by comparing the “real terrain” parameters with what the model “thinks” the parameters are at that grid cell. I am not sure how well that would work for large grid cells like those of ERA5.

Conclusion:
The paper claims improved wind-energy-relevant hub-height assessment, but the validation may be dominated by lower-height measurements and the final WEM product may be too close to GWA, and too coarse spatially, to justify the strength of the claimed practical improvement.
To support the main claims, I would expect the authors to provide:

(1) height-stratified performance metrics specifically above 50 m and above 80 m, especially WEM versus GWA metrics;

(2) a clearer justification for ignoring temporal mismatch and interannual variability, including information about the time series lengths in the GS dataset;

(3) more support for the independence of the models, preferably based on actual statistics comparing model data with observations;

(4) a discussion of whether the ERA5-grid WEM product can improve on the 250 m GWA product for complex-terrain siting.
Citation: https://doi.org/10.5194/wes-2026-75-RC1
RC2: 'Comment on wes-2026-75', Anonymous Referee #2, 13 Jun 2026

This paper presents a comprehensive framework for improving wind resource assessment for distributed wind applications. The authors first construct a large benchmark that evaluates several widely used public wind datasets across the United States using a quantile-based representation. Building on this benchmark, they develop the WindWatts Ensemble Model (WEM), a machine learning approach that combines multiple datasets, topographic features, and site characteristics to predict the wind speed distribution at site level. The results demonstrate that the ensemble approach significantly reduces bias and MAE and produces a national gridded dataset.
This is a strong and well-written paper that addresses an important and practical problem in distributed wind energy. The work is novel and valuable both to researchers and the distributed wind energy community. The idea of combining multiple public datasets into a machine-learning ensemble that predicts wind speed distributions (rather than only mean values) addresses a known limitation in current practice, where users often rely on a single dataset. The framework has clear potential for extension to other applications in renewable resource assessment.

The authors assembled an impressive observational archive; the number of tower measurements is especially noteworthy given the scarcity of hub-height observations in most other studies.
The delivery of a national gridded dataset and integration into the WindWatts platform makes the work directly useful for distributed wind siting and planning.
Minor Concerns and Suggestions:

The manuscript is quite long and could benefit from significant shortening. Some sections, particularly methodological descriptions and supporting details, could be condensed without sacrificing clarity. I suggest moving the details of the ML algorithm and section 6.1 to an appendix. This would improve the readability of the main text and help maintain focus on the key contributions and results.
While the paper is written in a scientific manner, it may be overly dense for the distributed wind energy audience. In particular, some equations and derivations in the main text could be simplified or moved to an appendix.

Citation: https://doi.org/10.5194/wes-2026-75-RC2
RC3: 'Comment on wes-2026-75', Anonymous Referee #3, 22 Jun 2026

The comment was uploaded in the form of a supplement: https://wes.copernicus.org/preprints/wes-2026-75/wes-2026-75-RC3-supplement.pdf

Citation: https://doi.org/10.5194/wes-2026-75-RC3

Viewed

Total article views: 310 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
221	78	11	310	13	11

Views and downloads (calculated since 20 May 2026)

Month	HTML	PDF	XML	Total
May 2026	120	51	8	179
Jun 2026	67	17	3	87
Jul 2026	34	10	0	44

Viewed (geographical distribution)

Total article views: 277 (including HTML, PDF, and XML) Thereof 277 with geography defined and 0 with unknown origin.

Latest update: 23 Jul 2026

Short summary

Wind energy projects use public wind maps to estimate power potential, but these maps have errors that vary by location. We compared major maps against real measurements across the United States and found that each has distinct weaknesses. We built a machine learning model that combines multiple maps into more accurate estimates, cutting typical errors by a third. These results are freely available online to help communities make better decisions when planning small-scale wind energy projects.

