the Creative Commons Attribution 4.0 License.
Machine Learning Methods to Improve Spatial Predictions of Coastal Wind Speed Profiles and Low-Level Jets using Single-Level ERA5 Data
Christoffer Hallgren
Jeanie A. Aird
Stefan Ivanell
Heiner Körnich
Ville Vakkari
Rebecca J. Barthelmie
Sara C. Pryor
Erik Sahlée
Abstract. Observations of the wind speed at heights relevant for wind power are sparse, especially offshore, but with emerging aid from advanced statistical methods, it may be possible to derive information regarding wind profiles using surface observations. In this study, two machine learning (ML) methods are developed for predictions of (1) coastal wind speed profiles and (2) low-level jets (LLJs) at three locations of high relevance to offshore wind energy deployment: the U.S. Northeastern Atlantic Coastal Zone, the North Sea, and the Baltic Sea. The ML models are trained on multiple years of lidar profiles and utilize single-level ERA5 variables as input. The models output spatial predictions of coastal wind speed profiles and LLJ occurrence. A suite of nine ERA5 variables is considered for use in the study due to their physics-based relevance in coastal wind speed profile genesis, and the possibility to observe these variables in real time via measurements. The wind speed at 10 m a.s.l. and the surface sensible heat flux are shown to have the highest importance for both wind speed profile and LLJ predictions. Wind speed profile predictions output by the ML models exhibit similar root mean squared error (RMSE) with respect to observations as is found for ERA5 output. At typical hub heights, the ML models show lower RMSE than ERA5, a reduction of approximately 5 %. LLJ identification scores are evaluated using the Symmetric Extremal Dependence Index (SEDI). LLJ predictions from the ML models outperform predictions from ERA5, demonstrating markedly higher SEDIs. However, optimization utilizing the SEDI results in a higher number of false alarms when compared to ERA5.
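For readers unfamiliar with the SEDI, the following is a minimal sketch of how the index can be computed from a 2x2 contingency table, using the standard definition in terms of the hit rate H and the false alarm rate F. The function name, argument names, and the counts in the usage note are hypothetical and for illustration only; the preprint's own implementation is not shown on this page.

```python
import math


def sedi(hits, misses, false_alarms, correct_negatives):
    """Symmetric Extremal Dependence Index from contingency-table counts.

    Assumes the standard definition based on the hit rate H and the
    false alarm rate F. The index is undefined when H or F equals
    0 or 1, so those cases raise an error here.
    """
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("SEDI is undefined when H or F is exactly 0 or 1")

    ln_f = math.log(false_alarm_rate)
    ln_h = math.log(hit_rate)
    ln_1mf = math.log(1.0 - false_alarm_rate)
    ln_1mh = math.log(1.0 - hit_rate)

    numerator = ln_f - ln_h - ln_1mf + ln_1mh
    denominator = ln_f + ln_h + ln_1mf + ln_1mh
    return numerator / denominator
```

With made-up counts, `sedi(90, 10, 10, 90)` gives a score close to 1 (skilful), while `sedi(50, 50, 50, 50)` gives 0 (no skill, since H equals F). Because the score depends only on H and F, a model tuned to maximize it may accept additional false alarms in exchange for a higher hit rate.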
Christoffer Hallgren et al.
Status: final response (author comments only)
RC1: 'Comment on wes-2023-122', Anonymous Referee #1, 25 Oct 2023
The manuscript proposes machine-learning methods to estimate the wind speed profile and low-level jet occurrence from ERA5 single-level data at three coastal sites. The concept is interesting and novel and the authors carried out thorough validation. The text is well-written and, for the most part, clearly presents the background, hypothesis, methods, and results. Some aspects of the results could be elucidated more (see comments and questions below) and the direct usefulness of the methods for wind energy applications could also be highlighted better. My recommendation is to accept with minor revisions.
Specific comments and questions
- P1L17: The electricity generation number is the capacity, I presume. This should be spelled out.
- P8L186: Were the chosen ERA5 grid points the nearest water points, or were land points used for any of the sites?
- P11L260: I assume the standardization mentioned here is subtracting the mean and dividing by the standard deviation to get z-scores. Please clarify.
- P21L407: You write "In particular, both the RF and the NN generally result in lower RMSE than ERA5 at typical offshore hub heights when compared to the ground truth lidar profiles.", but that does not seem true to me at Utö. Looking at Fig. 6, it seems to me that the RF model is on par with ERA5 at relevant hub heights, and the NN is worse than ERA5. The same can be said for the NN method at ASIT.
- You did a good job of splitting the data into training, testing, and validation data and taking care of the data stratification (seasonality) and correlation (excluding data between sets). However, you could do more to investigate the sensitivity of the results to the splits, e.g. you could have done a k-fold cross-validation of the training+testing data (adjusting exclusion data each time) and presented the robustness across folds. This would also give you a better sense of the robustness of your predictor importance analysis.
- In the same spirit as the question above, it would also be interesting to know more about how testing and validation results differed.
- You evaluate the wind speed profile with the RMSE, the same metric your ML models are optimized for. For the methodology to be truly and generally valuable, I would consider it important to see improvements, or at least no major degradation, for other statistics as well. For example, you spend some time in the introduction emphasizing the importance of correctly modeling the shear affecting the rotor. Why not show how well the shear is estimated by your ML approach? A second thing you could also consider is how well distributions are modeled. These ML methods are good mean-finding methods for different conditions. This may give you a low RMSE; however, it can come at the cost of narrower distributions that greatly underestimate the average power density.
- How did you treat/encode the circularity of wdir10 in the ML models? It is not clear to me if you used any kind of encoding, e.g. a sin and cos transformation.
- You state in your introduction that an important mechanism for LLJ-genesis is warm air advected over cold water so I would assume wind direction to be of some importance (given that the sites are coastal). However, wdir10 appears to be of minor importance (perhaps a bit more important at ASIT). I think it would be good to explain or at least discuss this more. I'm wondering if it's connected to the question of encoding of circularity and the fact that at least ASIT and Utö clearly have land to the North. In both cases, encoding wind direction from 0° to 360° clockwise from North would make it difficult, e.g. for the decision trees, to split land directions from sea directions (since land directions are at both tail-ends).
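The circularity concern raised in the last two comments can be illustrated with a minimal sketch, assuming the common sin/cos transformation the referee mentions. The function names are hypothetical, and nothing here reflects how the authors actually handled wdir10.

```python
import math


def encode_direction(deg):
    """Map a wind direction in degrees to a (sin, cos) pair on the unit circle."""
    rad = math.radians(deg % 360.0)
    return math.sin(rad), math.cos(rad)


def encoded_distance(deg_a, deg_b):
    """Euclidean distance between two directions after sin/cos encoding."""
    sin_a, cos_a = encode_direction(deg_a)
    sin_b, cos_b = encode_direction(deg_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# Two near-northerly directions sit at opposite ends of the raw 0-360 scale ...
raw_gap = abs(350.0 - 10.0)
# ... but are neighbours once encoded on the unit circle.
encoded_gap = encoded_distance(350.0, 10.0)
print(raw_gap, encoded_gap)
```

With raw degrees, a decision tree needs two thresholds (for example below 45° or above 315°) to isolate a northerly land sector. With the encoded pair, the same sector can be isolated with a single split on the cosine component.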
Technical corrections
- P6L143: "compare" should be "compared"
- P8L196: I assume you mean correlation greater than 0.5, so shouldn't it be ">" here?
- P9: Predictor importance (PI) should not be italicized
- P12L305: SEDI should not be italicized here.
- P17, Fig. 6 caption: It says ERA5 (dotted), but I see no dots.
Citation: https://doi.org/10.5194/wes-2023-122-RC1
RC2: 'Comment on wes-2023-122', Anonymous Referee #2, 30 Nov 2023
Summary:
This paper investigates the development of machine learning (ML) models to predict coastal wind speed profiles and LLJ occurrence from single-level meteorological variables. Data from three locations of high relevance to offshore wind energy deployment (the U.S. Northeastern Atlantic Coastal Zone, the North Sea, and the Baltic Sea) were used. The ML models are trained on multiple years of lidar profiles and utilize single-level ERA5 variables as input. The models output spatial predictions of coastal wind speed profiles and LLJ occurrence.
General Comment:
The study is interesting and valuable for the offshore wind energy community. The article is well-written. The authors have used a variety of locations with different wind characteristics to apply the methods, which increases the applicability of the study.
Specific comments:
ERA5 data is quite coarse for wind applications in coastal areas. Could the authors comment on whether there is a potential advantage to using higher-resolution wind data for both LLJs and coastal wind speed profiles?
It would be helpful if the authors provided some numbers regarding the computational time and power used by the different methods.
Citation: https://doi.org/10.5194/wes-2023-122-RC2