Introducing a data-driven approach to predict site-specific leading edge erosion
- 1Department of Wind and Energy Systems, Technical University of Denmark (DTU), 4000 Roskilde, Denmark
- 2Wind Power LAB, 1150 Copenhagen, Denmark
- 3Danish Meteorological Institute (DMI), 2100 Copenhagen, Denmark
Abstract. Modeling leading edge erosion has been a challenging task due to its multidisciplinary nature, involving several variables such as weather conditions, blade coating properties, and operational characteristics. While the process of wind turbine blade erosion is often described by engineering models that rely on the well-known Springer model, there is a clear need for modeling approaches supported by field data. This paper presents a data-driven framework for modeling erosion damage based on blade inspections from several wind farms in Northern Europe and mesoscale numerical weather prediction (NWP) models. The outcome of the framework is a machine-learning-based model that can be used to predict and/or forecast leading edge erosion damage based on weather data/simulations and user-specified wind turbine characteristics. The model is based on feed-forward artificial neural networks and utilizes ensemble learning for robust training and validation. The model output fits directly into the damage terminology used by industry and can therefore support site-specific planning and scheduling of repairs as well as budgeting of operation and maintenance costs.
Jens Visbech et al.
Status: closed
-
RC1: 'Comment on wes-2022-55', Anonymous Referee #1, 13 Jul 2022
1) Define the NEA and DKA domains more appropriately in the text.
2) The reason for the \alpha = 0.16 adopted in the power-law extrapolation is unclear. This extrapolation procedure should be further explained, and, furthermore, the sensitivity of the extrapolated quantities with respect to \alpha should be investigated.
3) The main input to the data-driven model is the accumulated rain impingement. Although it combines the effect of the amount of rain and the wind speed, it does not consider a third important parameter: the rotational speed of the wind turbine. In the paragraph around line 260, please expand the discussion to take this argument into account. What would be the implication of including the rotational speed in the prediction models?
4) The authors state that a feed-forward neural network with a 2-layer architecture, 5 neurons per layer, and ReLU activations was sufficient and outperformed other methods such as PCE and support vector machines. The ability to learn non-linear relationships is not exclusive to FFNNs, and the piecewise-linear description enabled by ReLUs can be emulated by other models based on moving averages or any other piecewise form of approximation, for instance. The authors are encouraged to improve the discussion surrounding the choice of the FFNN, also exposing its main limitations and the results of the PCE, which was investigated at length.
5) Have the authors considered the use of classical surrogate models such as the Kriging method?
6) The authors should present one application case that applies the developed framework to predict damage in a new site within the covered region, focusing mainly on the workflow that would need to be followed.
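The power-law extrapolation questioned in point 2 above can be sketched briefly. This is a minimal illustration only: the reference height, hub height, and wind speed values below are invented for the example and are not taken from the paper under review.

```python
# Power-law wind speed extrapolation, with a simple sensitivity
# sweep over the shear exponent alpha, as the referee requests.
def extrapolate_wind_speed(u_ref, z_ref, z_hub, alpha=0.16):
    """Extrapolate wind speed from height z_ref to z_hub via the power law."""
    return u_ref * (z_hub / z_ref) ** alpha

u10 = 8.0  # wind speed at 10 m (illustrative value)

# How sensitive is the hub-height speed to the choice of alpha?
for alpha in (0.10, 0.14, 0.16, 0.20):
    u_hub = extrapolate_wind_speed(u10, z_ref=10.0, z_hub=100.0, alpha=alpha)
    print(f"alpha = {alpha:.2f} -> u_hub = {u_hub:.2f} m/s")
```

Because accumulated rain impingement scales with wind speed, even a modest change in \alpha propagates into the damage predictor, which is why the referee asks for this sensitivity to be quantified.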
-
AC1: 'Reply on RC1', Jens Visbech, 28 Sep 2022
The comment was uploaded in the form of a supplement: https://wes.copernicus.org/preprints/wes-2022-55/wes-2022-55-AC1-supplement.pdf
-
RC2: 'Comment on wes-2022-55', Anonymous Referee #2, 28 Jul 2022
The paper proposes a data-driven approach for predicting leading edge erosion damages. The main contribution of the paper is building a prediction model using realistic data. Data-driven methods for predicting edge erosion damages already exist, however according to the authors, they rely on data that are hard and/or expensive to acquire. The prediction model is an ensemble of feedforward neural networks that is trained using leave-p-out cross-validation to better utilize the limited amount of training data.
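The ensemble scheme summarized above can be sketched with the standard library alone. This is a hedged, minimal illustration: the leave-p-out splitting follows the summary, but the mean-predictor "model" is a placeholder standing in for the paper's FFNN weak learners, and the data values are invented.

```python
# Sketch of leave-p-out ensemble training: every size-p holdout
# defines one train/validation split, one weak learner is fit per
# split, and the ensemble prediction averages all learners.
from itertools import combinations
from statistics import mean

def leave_p_out_splits(n_samples, p):
    """Yield (train_idx, val_idx) index pairs for every size-p holdout."""
    indices = set(range(n_samples))
    for held_out in combinations(range(n_samples), p):
        yield sorted(indices - set(held_out)), list(held_out)

def fit(train_x, train_y):
    # Placeholder weak learner: predicts the mean of its training targets.
    avg = mean(train_y)
    return lambda x: avg

x = [0.0, 1.0, 2.0, 3.0, 4.0]   # invented inputs
y = [0.1, 0.9, 2.1, 3.0, 3.9]   # invented targets

# One weak learner per leave-2-out split, averaged at prediction time.
models = [fit([x[i] for i in tr], [y[i] for i in tr])
          for tr, _ in leave_p_out_splits(len(x), p=2)]
ensemble_prediction = mean(m(2.5) for m in models)
```

With 5 samples and p = 2 this yields C(5, 2) = 10 splits, so even a small dataset produces many train/validation partitions — the motivation the summary gives for using leave-p-out with limited data.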
1) A comparison between the minimal feature approach followed in this paper and more expensive data-driven approaches mentioned in related work is needed to understand how the “data minimization” affects the quality of predictions.
2) The choice of the ensemble model should be better justified, and a comparison to other types of models, from interpretable ones like decision trees and linear regression models to black-box models, should be added. Interpretability was mentioned as a requirement when discussing dimensionality (around line 255), so an interpretable model might reveal further insights regarding the important features for different damage types.
3) Several points need further clarifications:
- Does sparsity (Section 2.3) refer to missing data or to missing labels? Please expand the discussion on "sparsity and limitations in data availability" and provide, e.g., the percentage of missing data.
- It is not clear what the cardinalities of the training, validation, and testing sets are.
- The input features should be clear from the very beginning. This information comes too late and becomes crystal clear only in Section 4. A better idea would be to summarize the features in a table, including their value domains.
4) The organization of the paper needs improvement. The introduction section is too long and also covers related work, part of which is also discussed in Section 4; I suggest a separate related-work section. Section 2.3 is also too long and could be better organized into, e.g., data, model, and parameter tuning. The discussion section is very interesting but too long to follow; I suggest you split it into subsections regarding, e.g., feature/data choices, model choices, and experimental findings. Also, since the discussion includes suggestions for future extensions, its title should be changed accordingly.
5) I believe the novelty of this work is not the ML model but rather the minimal data approach that is followed to train such a model. The title of the paper should therefore be updated.
6) Figure 1 needs improvement; for example, the input features could be clearly indicated. Also, I find the current "ensemble splitting", "model selection", "training and validation", and "ML trained model" components not very informative; I believe the input data should be clearly depicted. The training and testing parts are confusing: it seems that only the weather data are used during testing.
7) Statements like "simple neural networks are also well suited as weak learners for ensemble modeling, whereas simple linear regression models are not" (line 315) and "they are able to interpolate and, to some extent, extrapolate, which is not the case for other machine learning classes such as support vector machines or those based on decision trees" (line 315) should be supported by appropriate references.
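The piecewise-linear behaviour of ReLU networks, which Referee #1 notes could be emulated by other piecewise approximations, is easy to demonstrate concretely. The weights below are hand-picked for illustration and have nothing to do with the trained model in the paper.

```python
# A one-hidden-layer ReLU network with fixed, hand-picked weights
# realises a continuous piecewise-linear function of its input:
# each hidden unit contributes one "kink" where its pre-activation
# crosses zero.
def relu(v):
    return max(0.0, v)

def tiny_relu_net(x):
    h1 = relu(1.0 * x + 0.0)   # kink at x = 0
    h2 = relu(1.0 * x - 1.0)   # kink at x = 1
    return 0.5 * h1 + 0.5 * h2

# Resulting slopes: 0 for x < 0, 0.5 on [0, 1], 1.0 for x > 1.
```

This is the sense in which a small FFNN with ReLUs is "just" a learned piecewise-linear map, which is also why other piecewise approximators can, in principle, represent the same functions.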
-
AC2: 'Reply on RC2', Jens Visbech, 28 Sep 2022
The comment was uploaded in the form of a supplement: https://wes.copernicus.org/preprints/wes-2022-55/wes-2022-55-AC2-supplement.pdf
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 266 | 120 | 14 | 400 | 6 | 4 |