the Creative Commons Attribution 4.0 License.
A decision tree-based measure-correlate-predict approach for peak wind gust estimation from a global reanalysis dataset
Sukanta Basu
Simon J. Watson
Abstract. Peak wind gust (Wp) is a crucial meteorological variable for wind farm planning and operations. However, for many wind farm sites, there is a dearth of on-site measurements of Wp. In this paper, we propose a machine-learning approach (called INTRIGUE) that utilizes numerous inputs from a public-domain reanalysis dataset and, in turn, generates long-term, site-specific Wp series. Through a systematic feature importance study, we also identify the most relevant meteorological variables for Wp estimation. Even though the proposed INTRIGUE approach performs very well for nominal conditions compared to specific baselines, its performance for extreme conditions is less than satisfactory.
Serkan Kartal et al.
Status: open (until 04 Jun 2023)
RC1: 'Comment on wes-2023-30', Anonymous Referee #1, 10 May 2023
The comment was uploaded in the form of a supplement: https://wes.copernicus.org/preprints/wes-2023-30/wes-2023-30-RC1-supplement.pdf
RC2: 'Comment on wes-2023-30', Anonymous Referee #2, 21 May 2023
This paper introduces a decision tree-based application of the measure-correlate-predict methodology, which is commonly employed in the wind industry, to estimate wind gusts. The manuscript is very well written and provides interesting historical context. The machine learning approaches outperform ERA5 in wind gust representation at three observational locations in West Texas and predictor variables are ranked in order of importance to the algorithms.
The manuscript, while quite interesting, would benefit from some discussion about the potential applicability of these gust estimation techniques to the wind energy audience of this journal. To elaborate, the tests performed in this work are at 10 m above ground level, while typical turbine hub heights are 80+ m above ground level. Do the authors have any speculation or insight into how the performance of their models and the importance scores of the various parameters might change for a higher height?
Additionally, it would be helpful to understand the ultimate goal of this research to support the wind community. Is the aim to improve gust estimates in wind farm forecasts? Or to provide long-term assessment of the frequency and magnitude of gusts at a proposed wind farm? Lines 198-201 hint at the latter; however, an explicit goal statement would be advantageous to the text.
My one concern with the analysis is the method selected for comparison of timeseries of varying temporal resolution (Lines 182-187). Taking the maximum gust from the 5-minute observations within each hour and assigning it to the top of the hour for evaluation of hourly models seems an unfair comparison, which is particularly noticeable in Figure 5. Was there a reason you did not choose the observed wind gust closest to the top of the hour for your comparisons?
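To make the two options in this comment concrete, here is a minimal Python sketch that aggregates a 5-minute gust series to hourly values in both ways. It is an illustration only, not the authors' code: the function names, the synthetic gust values, and the choice of the window [HH:00, HH+1:00) for the hourly maximum are all assumptions, since the manuscript's exact windowing is not quoted here.

```python
from datetime import datetime, timedelta


def hourly_max(obs):
    """Maximum gust within each hour, keyed by the hour's start time.

    Assumption: the window is [HH:00, HH+1:00); the manuscript may use a
    different convention (e.g. the hour preceding the timestamp).
    """
    out = {}
    for t, gust in obs:
        hour = t.replace(minute=0, second=0, microsecond=0)
        out[hour] = max(out.get(hour, gust), gust)
    return out


def nearest_to_hour(obs):
    """Gust sample closest in time to each top of the hour."""
    best = {}
    for t, gust in obs:
        floor = t.replace(minute=0, second=0, microsecond=0)
        # Assign the sample to its nearest hour (HH:30 goes to the later hour).
        hour = floor + timedelta(hours=1) if t.minute >= 30 else floor
        gap = abs(t - hour)
        if hour not in best or gap < best[hour][0]:
            best[hour] = (gap, gust)
    return {hour: gust for hour, (gap, gust) in best.items()}


# Synthetic 5-minute gusts in m/s (hypothetical values): calm at 12:00,
# with a short-lived peak at 12:35.
start = datetime(2020, 1, 1, 12, 0)
gusts = [5.0, 5.5, 6.0, 6.5, 7.0, 8.0, 9.0, 14.0, 9.5, 8.0, 7.0, 6.0]
obs = [(start + timedelta(minutes=5 * i), g) for i, g in enumerate(gusts)]

by_max = hourly_max(obs)
by_nearest = nearest_to_hour(obs)
print(by_max[start], by_nearest[start])
```

In this synthetic case the hourly maximum picks up the 12:35 peak while the nearest-sample value at 12:00 does not, which is the kind of gap between the two evaluation targets that the comment asks about.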
Other comments:
Line 102: “Quasi-universal” is a bit of a strong statement given the small sample size.
Line 177: The authors should consider elaborating on why ERA5 was selected, including citing wind studies that employ it. Additionally, I think a literature review discussion would be valuable on the accuracy of some of the ERA5 variables employed as predictors in the analysis, in particular the ERA5 instantaneous wind gust, friction velocity, and boundary layer height.
Line 180: What is the distance between each observation station and its nearest ERA5 point?
Section 6.4: I encourage adding bias to the list of performance metrics, as it would be valuable for the wind community to understand whether the evaluated models tend to overestimate or underestimate observed gusts.
Line 268/Tables 3-5 (and 7): “From Tables 3-5, it is clear…” would be clearer if these were figures instead of tables. For example, each one could be a line plot with subplots a, b, c for the different stations.
Line 278: “In Tables 3–5, all the scores of the ML models are averaged over ten years. Thus, the inter-annual variability of all these models is much lower in comparison to the ERA5 baseline.” These sentences are confusing. Is the inter-annual variability lower because all of the scores are averaged or because of the model performance?
Section 7.2: This section is quite interesting, given the challenges of observational coverage. I encourage investigation of geographic range of applicability of the discussed techniques for a future paper.
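Two of the requests above (Line 180 and Section 6.4) can be illustrated with a short Python sketch: the great-circle distance from a station to its nearest grid point, and the mean bias of a model against observations. Everything here is an assumption made for illustration: the station coordinates and gust values are hypothetical, the grid is taken to be a regular 0.25° latitude/longitude grid aligned to multiples of 0.25°, and bias is defined as the mean of (estimate minus observation), so negative values indicate underestimation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_grid_point(lat, lon, step=0.25):
    """Snap a location to a regular grid (assumed spacing and alignment)."""
    return round(lat / step) * step, round(lon / step) * step


def mean_bias(estimates, observations):
    """Mean of (estimate - observation); negative means underestimation."""
    diffs = [e - o for e, o in zip(estimates, observations)]
    return sum(diffs) / len(diffs)


# Hypothetical station location (degrees), not one of the paper's sites.
station = (33.6, -101.9)
grid = nearest_grid_point(*station)
distance = haversine_km(*station, *grid)

# Hypothetical gust estimates and observations in m/s.
estimated = [10.0, 12.0, 8.0]
observed = [11.0, 15.0, 9.0]
bias = mean_bias(estimated, observed)
print(grid, round(distance, 1), round(bias, 2))
```

Reporting a signed bias alongside error magnitudes would show directly whether a model tends to under- or overestimate gusts, which is the point of the Section 6.4 comment.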
Citation: https://doi.org/10.5194/wes-2023-30-RC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
551 | 45 | 12 | 608 | 4 | 5 |