Classification of Leading Edge Erosion Severity via Machine Learning Surrogate Models
Abstract. As the number and size of wind turbines have increased, manual inspection and maintenance of the turbines have become increasingly dangerous and time-consuming for human operators. One key form of turbine deterioration is leading-edge erosion, which degrades the blade laminate over time. This erosion is caused by environmental factors such as blowing sand, rain, and bug accumulation. Blade damage reduces aerodynamic efficiency and shortens the operational lifespan of wind turbines, motivating the need for structural health monitoring systems. Ideally, one would use a digital twin, which couples a physical device (the turbine) with a computer model through bidirectional exchange of information between the two. In a digital twin, sensor data from the turbine continually update the computer model, which then predicts the state of the system to inform future maintenance and operation decisions, potentially eliminating the need for frequent manual inspections. Machine learning-based classifiers trained on simulation data detect damage accurately but require large training data sets, highlighting the need for computationally efficient alternatives to full physics simulation. A Gaussian process (GP) surrogate model can be trained from a small set of full-simulation data points. Once trained, a GP makes predictions very quickly (1000 times faster than a simulator evaluation) while also providing information about the uncertainty in the emulator prediction relative to the full physical simulator. The GP emulator methodology we employ includes two extensions to the standard GP. First, the output quantity of interest is vector-valued rather than scalar. In our case, the vector contains statistics of relevant outputs such as lift, drag, and generator power. Second, the range of the outputs is constrained to fit the specifications of the blade, so the outputs are not defined over the unbounded domain that Gaussian distributions usually require.
In this work, we test two random forest classifiers developed to quantify levels of leading-edge erosion. The classifiers differ in whether they are trained on full simulation data or on data from the GP surrogate. We find that the classifier trained on surrogate data is at least as accurate as the classifier trained on full simulation data. Using the surrogate-generated data set, the classifier distinguishes between five erosion severity levels with 87% accuracy, surpassing the simulation-trained classifier's accuracy of 83%. These results highlight the promise of using GP surrogates to train classifiers for leading-edge erosion, a key component of a digital twin for wind turbine maintenance.
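The workflow described above can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes scikit-learn, a hypothetical analytic function `toy_simulator` standing in for the full physics simulator, and a standard multi-output GP without the constrained-output extension. All function names, coefficients, and sample sizes are illustrative assumptions, and the accuracies it prints are unrelated to the results reported here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)


def toy_simulator(severity, wind):
    """Hypothetical stand-in for the expensive simulator (an assumption).

    Maps erosion severity (0-4) and wind speed to a vector of outputs
    loosely named lift, drag, and power.
    """
    severity = np.asarray(severity, dtype=float)
    wind = np.asarray(wind, dtype=float)
    lift = 1.2 - 0.08 * severity + 0.002 * wind
    drag = 0.02 + 0.01 * severity + 0.0005 * wind
    power = (1.0 - 0.05 * severity) * wind**3 / 1000.0
    return np.column_stack([lift, drag, power])


def sample_inputs(n):
    """Draw n random (severity level, wind speed) input pairs."""
    severity = rng.integers(0, 5, size=n)
    wind = rng.uniform(5.0, 15.0, size=n)
    return np.column_stack([severity, wind])


# 1) Small design of "expensive" runs: 5 severity levels x 8 wind speeds.
sev_grid, wind_grid = np.meshgrid(np.arange(5), np.linspace(5.0, 15.0, 8))
X_design = np.column_stack([sev_grid.ravel(), wind_grid.ravel()])
Y_design = toy_simulator(X_design[:, 0], X_design[:, 1])

# 2) Fit a vector-valued GP surrogate to the 40 design points.
kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 3.0])
gp = GaussianProcessRegressor(
    kernel=kernel, alpha=1e-4, normalize_y=True, random_state=0
)
gp.fit(X_design, Y_design)

# 3) Build a large training set; the surrogate features cost only GP predictions.
X_train = sample_inputs(2000)
labels_train = X_train[:, 0].astype(int)
feat_surrogate = gp.predict(X_train)
feat_simulator = toy_simulator(X_train[:, 0], X_train[:, 1])

# 4) Train one random forest per data source.
rf_surrogate = RandomForestClassifier(n_estimators=100, random_state=0)
rf_surrogate.fit(feat_surrogate, labels_train)
rf_simulator = RandomForestClassifier(n_estimators=100, random_state=0)
rf_simulator.fit(feat_simulator, labels_train)

# 5) Score both classifiers on held-out simulator outputs.
X_test = sample_inputs(500)
labels_test = X_test[:, 0].astype(int)
feat_test = toy_simulator(X_test[:, 0], X_test[:, 1])
acc_surrogate = rf_surrogate.score(feat_test, labels_test)
acc_simulator = rf_simulator.score(feat_test, labels_test)

print(f"surrogate-trained accuracy: {acc_surrogate:.3f}")
print(f"simulator-trained accuracy: {acc_simulator:.3f}")
```

In this sketch the GP sees only 40 simulator runs, while both classifiers train on 2000 samples. The surrogate route therefore replaces most simulator evaluations with GP predictions. The predictive uncertainty mentioned above could be obtained by passing `return_std=True` to `gp.predict`.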