Inferring Wind Turbine Operational State and Fatigue from High-Frequency Acceleration using Self-Supervised Learning for SCADA-free Monitoring
Abstract. Wind-turbine operation is commonly described using Supervisory Control and Data Acquisition (SCADA) systems. While high-frequency SCADA data (e.g. 1 s resolution) exist, the vast majority of fleet-wide records available for analysis consist of 10-minute aggregates. These coarse aggregates obscure short transients and dynamic interactions. Additionally, access is often restricted by proprietary control systems, and the records frequently contain gaps. To address these limitations, a SCADA-free approach is developed in which operational states are inferred directly from high-frequency nacelle acceleration, measured by a sensor that is increasingly being installed across wind farms, e.g. to monitor loads. The proposed method is based on a denoising autoencoder, to which a Domain-Adversarial Neural Network (DANN) mechanism and a Deep Embedded Clustering (DEC) self-supervision objective are added. Compact eight-dimensional representations of one-minute vibration spectra between 0 and 3 Hz are learned. Turbine-specific signatures are suppressed through domain-adversarial regularization, leading to turbine-invariant embeddings that capture a generalized representation of turbine dynamics. The self-supervised DEC objective structures the latent space into discrete and physically meaningful operational regimes, which facilitates post-hoc analysis of the learned embedding. Training is performed on data from 22 out of the 44 turbines of an offshore wind farm, sampled at 31.25 Hz, while SCADA signals are used only for validation. Strong correspondence is observed between the learned embeddings and pitch, rotor speed, power, and wind speed, with normalized mutual information above 0.8. Turbine invariance is verified through mutual-information analysis between embeddings and turbine identity. This analysis also reveals clusters within the wind farm and indicates whether the learned representation can be consistently applied across different turbines.
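To make the model input concrete, the following is a minimal sketch of how one-minute spectra between 0 and 3 Hz might be computed from acceleration sampled at 31.25 Hz. Only the window length, band limit, and sampling rate come from the text. The function name `one_minute_spectra`, the non-overlapping windows, the Hann taper, the mean removal, and the log-amplitude scaling are illustrative assumptions, not the preprocessing actually used.

```python
import numpy as np

FS = 31.25      # sampling rate in Hz (from the text)
WINDOW_S = 60   # one-minute windows (from the text)
F_MAX = 3.0     # upper band edge in Hz (from the text)


def one_minute_spectra(acc, fs=FS, window_s=WINDOW_S, f_max=F_MAX):
    """Hypothetical helper: split a 1-D acceleration series into
    non-overlapping one-minute windows and return (freqs, spectra).
    Each row of `spectra` holds log-amplitudes for 0..f_max Hz."""
    n = int(round(fs * window_s))                 # 1875 samples per window
    n_windows = len(acc) // n                     # incomplete tail is dropped
    seg = np.asarray(acc[: n_windows * n], dtype=float).reshape(n_windows, n)
    seg = seg - seg.mean(axis=1, keepdims=True)   # assumed: remove static offset
    taper = np.hanning(n)                         # assumed: Hann taper against leakage
    amp = np.abs(np.fft.rfft(seg * taper, axis=1))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)        # bin spacing is 1/60 Hz
    keep = freqs <= f_max + 1e-9                  # tolerance guards float rounding at 3 Hz
    return freqs[keep], np.log10(amp[:, keep] + 1e-12)  # assumed: log scaling


# Usage on a synthetic 1 Hz test tone (not measured data)
t = np.arange(5 * 1875) / FS
freqs, spectra = one_minute_spectra(np.sin(2 * np.pi * 1.0 * t))
```

With these settings each one-minute window yields 181 frequency bins at a spacing of 1/60 Hz, which would be the vector an autoencoder of this kind compresses to eight dimensions.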
As an auxiliary validation, regression models were trained on the learned embeddings to predict 10-minute damage-equivalent moments (DEM). The regressors were fitted using data from only five strain-instrumented turbines and then applied fleet-wide. Accurate fatigue predictions were obtained across all turbines (R² = 0.96), surpassing SCADA-based baselines. This demonstrates that the learned embeddings generalize beyond operational description and contain sufficient load-related information to support fleet-wide fatigue estimation, enabling high-resolution monitoring without dependence on SCADA.
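The fit-on-five, apply-fleet-wide validation can be illustrated with a short sketch. Everything here is a stand-in: the data are synthetic, the ridge regressor is an assumption (the text does not name the regression model), and each row is assumed to be an embedding feature vector already aggregated to a 10-minute interval. The helper names `fit_ridge` and `r2_score` are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression with an unpenalised intercept
    (assumed model; the regressor type is not stated in the text)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]),
                        Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def r2_score(y_true, y_pred):
    """Coefficient of determination, R² = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data: NOT measurements from the wind farm.
rng = np.random.default_rng(0)
n_turbines, n_per_turbine, dim = 44, 200, 8
turbine_id = np.repeat(np.arange(n_turbines), n_per_turbine)
Z = rng.normal(size=(n_turbines * n_per_turbine, dim))   # 8-D embedding features
true_w = rng.normal(size=dim)                            # made-up linear relation
dem = Z @ true_w + 0.1 * rng.normal(size=len(Z))         # made-up DEM targets

# Fit on five "instrumented" turbines, evaluate on the remaining ones.
instrumented = np.isin(turbine_id, [0, 1, 2, 3, 4])
w, b = fit_ridge(Z[instrumented], dem[instrumented])
r2_fleet = r2_score(dem[~instrumented], Z[~instrumented] @ w + b)
```

Splitting by turbine rather than by sample is the relevant design choice: it tests whether a regressor fitted on a few instrumented turbines transfers to turbines it has never seen, which is what turbine-invariant embeddings are meant to enable.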