Classification of Leading Edge Erosion Severity via Machine Learning Surrogate Models
Abstract. As the number and size of wind turbines have increased, manual observation and maintenance of the turbines have become increasingly dangerous and time-consuming for human operators. One key form of turbine deterioration is leading-edge erosion, which degrades the blade laminate over time. This erosion is caused by environmental factors such as blowing sand, rain, and bug accumulation. Blade damage reduces aerodynamic efficiency and shortens the operational lifespan of wind turbines, motivating the need for structural health monitoring systems. Ideally one would like to use a digital twin, which couples a physical device (the turbine) with a computer model by bidirectional passage of information between the physical and digital twins. In a digital twin, sensor data from the turbine continually updates the computer model, which then predicts the state of the system for future maintenance and operation decisions, potentially eliminating the need for frequent manual inspections. Machine learning-based classifiers trained on simulation data accurately detect damage, but require large training data sets, highlighting the need for computationally efficient alternatives to full physics simulation. A Gaussian process (GP) surrogate model can be trained from a small set of full simulation datapoints. Once trained, GPs make predictions very quickly (1000 times faster than a simulator evaluation) while also providing information about the uncertainty in the emulator prediction relative to the full physical simulator. The GP emulator methodology we employ includes two extensions to the standard GP. First, the output quantity of interest is vector-valued (rather than a scalar). In our case the vector contains statistics of relevant outputs such as lift, drag, generator power, etc. Second, the range of the outputs is constrained to fit specifications of the blade (and so is not defined over the usual full-space domain required of Gaussian distributions).
In this work we test two random forest classifiers developed to quantify levels of leading-edge erosion. The classifiers differ in whether they are trained on full simulation data or data from the GP surrogate. We find that the classifier trained on surrogate data is as accurate as the classifier trained on full simulation data. Using the surrogate-generated dataset, the classifier distinguishes between five erosion severity levels with 87% accuracy, surpassing the simulation-trained classifier's accuracy of 83%. These results highlight the promise of using GP surrogates to train classifiers for leading-edge erosion, a key component of a digital twin for wind turbine maintenance.
The manuscript addresses an important problem in wind turbine monitoring by exploring the use of surrogate models to generate training data for erosion classification. The approach is interesting and the paper is generally well written, but several aspects of the data generation, methodology, and positioning of the contribution could be strengthened to better reflect the complexity of the real monitoring problem and to clarify the novelty of the proposed framework.
Major Points
- The overall contribution is not yet clearly positioned with respect to the existing literature on surrogate modeling and wind turbine condition monitoring. Gaussian-process surrogates, sensitivity analysis, and random forest classifiers are all well-established techniques. The manuscript should clarify more explicitly what methodological advance is introduced beyond applying these tools to a specific erosion-monitoring scenario.
- The claim of novelty regarding the PPzGP surrogate is not sufficiently demonstrated. While combining parallel partial emulation and range-censored Gaussian processes is technically interesting, the manuscript does not clearly show why this combination enables capabilities that standard GP surrogates would not provide for this problem.
- The erosion model used to generate the data is highly simplified. Blade erosion is represented through a parametric scaling of lift and drag coefficients across six blade regions. While this may be suitable for a proof-of-concept study, the manuscript should discuss the limitations of this representation and justify why it captures the key aerodynamic effects of real leading-edge erosion.
- The erosion process is modeled as discrete severity classes rather than a continuous degradation process. In reality, erosion evolves gradually and spatially across the blade surface. The use of five artificial classes may simplify the classification task and should be justified more clearly.
- The classification problem may be artificially easy because the erosion perturbations are directly embedded in the aerodynamic coefficients and the classifier is trained on outputs that are strongly linked to those coefficients (e.g., lift and drag sensor statistics). This raises the possibility that the model is learning the synthetic perturbation rather than identifying erosion signatures that would be observable in practice.
- The strong performance of a relatively simple random forest classifier suggests that the generated dataset may be too clean or too easily separable. In practice, leading-edge erosion detection is known to be challenging due to turbulence, operational variability, sensor noise, and confounding effects. The simulations appear to lack these disturbances, which may make the classification task unrealistically simple.
- The simulations assume uniform wind conditions rather than turbulent inflow. Since turbulence strongly influences turbine loads and vibration signals, the use of uniform wind fields likely underestimates the variability present in real monitoring data. Including turbulent wind realizations would significantly improve realism.
- The operational variability of the turbine is limited. Real turbines experience controller transitions, yaw adjustments, and varying operating regimes that influence measured signals. The current simulation setup may not capture these effects.
- The sensor configuration used in the study may not reflect practical monitoring systems. In particular, lift and drag pressure sensors are rarely available in operational wind turbines. The manuscript should discuss the feasibility of the assumed sensing setup or consider signals more commonly available in SCADA or structural monitoring systems.
- The feature extraction strategy reduces time-series signals to simple statistical moments (mean, standard deviation, skewness, kurtosis). This discards potentially important information contained in the temporal and spectral structure of the signals. The authors should justify this choice or explore richer feature representations.
- The surrogate model is trained on a relatively small number of simulations compared to the dimensionality and nonlinearity of the system. Although the reported prediction errors are moderate, the manuscript should discuss potential model bias and the limits of extrapolation.
- The reported classification improvement between simulator-trained and surrogate-trained models (83% vs 87%) is relatively modest. It would be helpful to provide statistical analysis or repeated experiments to assess whether this improvement is significant.
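To illustrate the kind of check requested in the last major point, a minimal sketch of a two-proportion z-test on the two reported accuracies is given below. The test-set size (`N_TEST = 500`) is a hypothetical placeholder, since the size used in the manuscript is not stated here, and the correct-prediction counts (415 and 435) are simply 83% and 87% of that assumed size. The function name `two_proportion_z_test` is illustrative, not taken from the manuscript. The sketch also assumes the two classifiers are evaluated on independent test sets; if both are scored on the same samples, a paired test such as McNemar's test would be more appropriate.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided pooled z-test for a difference between two accuracies.

    Assumes the two test sets are independent samples.
    Returns (z statistic, two-sided p-value).
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled accuracy under the null hypothesis of equal accuracy.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# HYPOTHETICAL test-set size; replace with the size used in the study.
N_TEST = 500
# 83% and 87% of the assumed test-set size.
z, p_value = two_proportion_z_test(415, N_TEST, 435, N_TEST)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under this assumed test-set size, the four-point gap falls short of the conventional 5% significance level, which is why the test-set size and results from repeated train/test splits should be reported alongside the accuracies.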
Minor Points
- The manuscript frequently refers to digital twins, but the work primarily demonstrates surrogate modeling and classification using simulated data. Since essential digital twin elements such as data assimilation, state estimation, or online updating are not included, the connection to digital twins should be described more cautiously.
- The introduction is somewhat lengthy and could be shortened. Several sections summarizing wind-energy background material could be condensed to focus more directly on the methodological contribution.
- Some terminology is used interchangeably throughout the manuscript (e.g., emulator, surrogate model). Consistent terminology would improve clarity.
- Figures illustrating emulator predictions are informative but could be improved in readability, particularly through larger axis labels and clearer legends.
- The manuscript would benefit from a clearer discussion of the gap between simulation-based validation and deployment on real turbine monitoring data.
- Minor typographical issues and formatting inconsistencies appear throughout the manuscript and should be corrected during revision.