<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">WES</journal-id><journal-title-group>
    <journal-title>Wind Energy Science</journal-title>
    <abbrev-journal-title abbrev-type="publisher">WES</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Wind Energ. Sci.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">2366-7451</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/wes-11-1363-2026</article-id><title-group><article-title>Inferring wind turbine operational state and fatigue from high-frequency acceleration using self-supervised learning for SCADA (supervisory control and data acquisition)-free monitoring</article-title><alt-title>Accelerometer-derived operational embeddings for wind turbine monitoring</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1">
          <name><surname>Bel-Hadj</surname><given-names>Yacine</given-names></name>
          <email>yacine.bel-hadj@vub.be</email>
        <ext-link>https://orcid.org/0000-0001-6488-2979</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>de Nolasco Santos</surname><given-names>Francisco</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-1614-5442</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Weijtjens</surname><given-names>Wout</given-names></name>
          
        <ext-link>https://orcid.org/0000-0003-4068-8818</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Devriendt</surname><given-names>Christof</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-7041-9948</ext-link></contrib>
        <aff id="aff1"><label>1</label><institution>OWI-Lab, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Elsene, Belgium</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Yacine Bel-Hadj (yacine.bel-hadj@vub.be)</corresp></author-notes><pub-date><day>23</day><month>April</month><year>2026</year></pub-date>
      
      <volume>11</volume>
      <issue>4</issue>
      <fpage>1363</fpage><lpage>1382</lpage>
      <history>
        <date date-type="received"><day>14</day><month>November</month><year>2025</year></date>
        <date date-type="rev-request"><day>1</day><month>December</month><year>2025</year></date>
        <date date-type="rev-recd"><day>11</day><month>February</month><year>2026</year></date>
        <date date-type="accepted"><day>16</day><month>March</month><year>2026</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2026 Yacine Bel-Hadj et al.</copyright-statement>
        <copyright-year>2026</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026.html">This article is available from https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026.html</self-uri><self-uri xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026.pdf">The full text article is available as a PDF file from https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e104">Wind turbine operation is commonly described using supervisory control and data acquisition (SCADA) systems. While high-frequency SCADA data (e.g., 1 s resolution) exist, the vast majority of fleet-wide records available for analysis consist of 10 min averages. These coarse aggregates obscure short transients and dynamic interactions, access is often restricted by proprietary control systems, and the data frequently contain gaps. To address these limitations, a SCADA-free approach is developed in which operational states are inferred directly from high-frequency nacelle acceleration, a sensor that is increasingly being installed across wind farms, e.g., to monitor loads. The proposed method is based on a denoising auto-encoder, to which a domain-adversarial neural network (DANN) mechanism and a deep embedded clustering (DEC) self-supervision are added. Compact six-dimensional representations of 1 min vibration spectra between 0 and 3 Hz are learned. Turbine-specific signatures are suppressed through a domain-adversarial regularization, leading to turbine-invariant embeddings that capture a generalized representation of turbine dynamics. A self-supervised DEC objective structures the latent space into discrete and physically meaningful operational regimes, thereby facilitating post hoc analysis of the learned embeddings. Training is performed on data from 11 out of 44 turbines on an offshore wind farm sampled at 31.25 Hz, while SCADA signals are used only for validation. Strong correspondence is observed between the learned embeddings and pitch, rotor speed, power, and wind speed, with normalized mutual information above 0.8. Turbine invariance is verified through mutual-information analysis between embeddings and turbine identity. This analysis also reveals clusters within the wind farm and indicates whether the learned representation can be consistently applied across different turbines. 
As an auxiliary validation, regression models were trained on the learned embeddings to predict 10 min damage-equivalent moments (DEMs). The regressors were fitted using data from only five strain-instrumented turbines and then were applied fleet-wide. Accurate fatigue predictions were obtained across all turbines, with <inline-formula><mml:math id="M1" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.96</mml:mn></mml:mrow></mml:math></inline-formula>, surpassing SCADA-based baselines. This demonstrates that the learned embeddings generalize beyond operational description and contain sufficient load-related information to support fleet-wide fatigue estimation, enabling high-resolution monitoring without dependence on SCADA.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>Agentschap Innoveren en Ondernemen</funding-source>
<award-id>HBC.2024.0130</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e131">Recent years have seen offshore wind growing into a cornerstone of Europe's renewable energy expansion, with turbines steadily increasing in size and farms being installed at greater distances from shore <xref ref-type="bibr" rid="bib1.bibx43" id="paren.1"/>. This development has intensified the demand for reliable monitoring of the assets, which are subject to harsh environmental and operational loads <xref ref-type="bibr" rid="bib1.bibx47" id="paren.2"/>. Ensuring the long-term safety and efficiency of these assets requires not only tracking structural integrity but also attaining a clear understanding of their dynamic behavior under realistic operating conditions. Wind turbines are inherently time-varying systems whose responses depend on a wide range of factors, including wind speed, blade pitch angle, wind direction, and the interaction of rotating components such as rotor blades and the tower <xref ref-type="bibr" rid="bib1.bibx49" id="paren.3"/>. These influences give rise to a broad spectrum of operating dynamics, meaning that structural responses cannot be meaningfully interpreted without knowledge of the underlying operational state <xref ref-type="bibr" rid="bib1.bibx38" id="paren.4"/>. This becomes even more pertinent for modern wind farms where, due to design improvements <xref ref-type="bibr" rid="bib1.bibx12" id="paren.5"/>, structural reserves have been diminished and fatigue has become an operational concern. With fatigue – and, therefore, how long turbines may be operated – being inextricably linked with the turbine's operational state, accurate state description has become fundamental for operators.</p>
      <p id="d2e149">More broadly, when monitoring such assets, knowledge of their operational context is indispensable. It provides the basis not only for structural health monitoring (SHM) but also for performance analysis, fault detection, condition monitoring, and fatigue-life assessment, all of which underpin safer and more cost-effective wind energy production. The importance of operational state information is reflected in international standards. At the design stage, IEC 61400-1 <xref ref-type="bibr" rid="bib1.bibx28" id="paren.6"/> defines a catalog of design load cases (DLCs) that turbines must withstand under prescribed operating and environmental scenarios. For monitoring, IEC 61400-25-6 (2016) <xref ref-type="bibr" rid="bib1.bibx27" id="paren.7"/> introduces the concept of “operational state bins”: a grouping mechanism intended to ensure that signals are only compared under similar conditions. In practice, however, the proposed binning in <xref ref-type="bibr" rid="bib1.bibx27" id="text.8"/> is reduced to power alone, a simplification that is far too coarse for SHM where structural dynamics are more nuanced. For example, a rotor lock and an idling turbine may produce comparable power outputs yet represent fundamentally different dynamic states. In the specific case of damage-equivalent moment (DEM) estimation and farm-wide extrapolation, a wide range of approaches has been proposed, from physics-guided neural networks <xref ref-type="bibr" rid="bib1.bibx19" id="paren.9"/> to probabilistic models <xref ref-type="bibr" rid="bib1.bibx26 bib1.bibx2 bib1.bibx41" id="paren.10"/>. However, all of these studies presuppose the use of supervisory control and data acquisition (SCADA) data (along with acceleration, in some cases) for predicting fatigue loads. 
The SCADA dependency is so pronounced that, in <xref ref-type="bibr" rid="bib1.bibx17" id="text.11"/>, where a comparative study of model performance under different SCADA (10 min, 1 s) and accelerometer (low- and high-quality) instrumentation scenarios was conducted, an acceleration-only approach is not even evaluated. In <xref ref-type="bibr" rid="bib1.bibx18" id="text.12"/>, a farm-wide DEM estimation study on real data, the largest errors were traced to SCADA’s insufficient resolution: short transients were not captured, and the assumption of a constant yaw angle over 10 min often failed. Recent studies have therefore stressed the need to annotate operating conditions to make condition-monitoring results interpretable <xref ref-type="bibr" rid="bib1.bibx16" id="paren.13"/>. The reliability of such annotation is further linked to the ability to evaluate operational conditions consistently, which has been recognized as being central to the stable operation of wind farms and power grids <xref ref-type="bibr" rid="bib1.bibx13" id="paren.14"/>.</p>
      <p id="d2e180">Traditionally, operational state annotation relies on SCADA systems, where multiple variables (power, rotor speed, pitch angle, wind speed) are thresholded into categories such as operating, idling, or stopped. Alternatively, data-driven approaches have attempted to automate this process: <xref ref-type="bibr" rid="bib1.bibx13" id="text.15"/> used principal component analysis (PCA) to reveal dominant operational modes, while <xref ref-type="bibr" rid="bib1.bibx9" id="text.16"/> applied bisecting <inline-formula><mml:math id="M2" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means clustering to SCADA correlation matrices. Yet both thresholding and clustering remain limited by SCADA itself: access is often restricted, signals may be inconsistent across manufacturers, and 10 min averaging obscures transients such as load spikes or start–stop events <xref ref-type="bibr" rid="bib1.bibx30" id="paren.17"/>. In addition, SCADA annotation depends on multiple signals, and so the absence of a single variable can invalidate state classification – a common issue noted by <xref ref-type="bibr" rid="bib1.bibx24" id="text.18"/>. In contrast, acceleration-based approaches require only a single measurement modality and offer higher temporal resolution with fewer failure points. Such signals complement, rather than replace, SCADA by enabling finer detection of operational transients.</p>
      <p id="d2e202">The increasing deployment of accelerometers through IoT (Internet of Things) technologies now makes it possible to collect high-frequency vibration data across entire farms. These measurements embed signatures of both environmental forcing and structural dynamics, providing a powerful alternative to infer operational states directly from vibrations. When SCADA is unavailable or unreliable, vibration-derived annotations can fill the gap, offering insight into downtime, start–stop behavior, and fatigue-relevant transients. Leveraging these high-frequency signals for operational inference is a promising direction.</p>
      <p id="d2e206">Having established the need for SCADA-independent operational inference, the central challenge is to extract operational states directly from high-frequency vibration data without labeled examples. This requires identifying the essential structure within rich, high-dimensional measurements while maintaining their physical interpretability. Representation learning provides a natural framework for this task. Auto-encoders (AEs) <xref ref-type="bibr" rid="bib1.bibx25" id="paren.19"/> and other deep-representation-learning methods <xref ref-type="bibr" rid="bib1.bibx31 bib1.bibx8" id="paren.20"/> learn compact latent spaces that capture dominant patterns of variation while suppressing noise and incidental detail. When applied to physical sensor data, such embeddings often acquire semantic meaning that reflects the underlying system dynamics rather than the raw signal characteristics <xref ref-type="bibr" rid="bib1.bibx40 bib1.bibx46 bib1.bibx20 bib1.bibx6 bib1.bibx7 bib1.bibx5" id="paren.21"/>. Modern AEs extend this principle by incorporating design objectives that encourage disentanglement, hierarchical organization, and clusterability <xref ref-type="bibr" rid="bib1.bibx44" id="paren.22"/>. These inductive properties, often referred to as meta-priors <xref ref-type="bibr" rid="bib1.bibx8" id="paren.23"/>, are particularly valuable in vibration-based monitoring where a limited number of physical processes such as loading, resonance, and rotor interaction govern the measured response. At the core of these extensions lies the intrinsic meta-prior of the auto-encoder itself, which assumes that data can be efficiently represented through a lower-dimensional encoding that preserves the information required for reconstruction. In other words, the AE implicitly promotes representations that compress the signal while retaining its functional structure. 
Building on these principles, the present work introduces two additional priors tailored to wind turbine monitoring: a domain-adversarial regularization that enforces turbine-invariant embeddings and a clustering objective that structures the latent space into compact and interpretable operational regimes. These ideas have recently been applied within SHM. For example, convolutional auto-encoders have been used to distinguish train directions and axle counts from bridge measurements in an unsupervised setting <xref ref-type="bibr" rid="bib1.bibx6" id="paren.24"/>. Denoising variants improve robustness by reconstructing clean inputs from corrupted observations, which encourages embeddings that generalize across operating conditions <xref ref-type="bibr" rid="bib1.bibx45" id="paren.25"/>. Although contrastive self-supervised methods have also shown promise <xref ref-type="bibr" rid="bib1.bibx34 bib1.bibx39" id="paren.26"/>, auto-encoders remain a simple and effective approach for unsupervised operational-state inference in large-scale structural monitoring.</p>
      <p id="d2e234">While auto-encoder frameworks provide a means to derive compact and informative embeddings, such representations often retain individual turbine biases when transferred across different assets. In wind farms, for example, turbines exhibit subtle yet systematic variations in resonance, foundation stiffness, or sensor placement, which can be encoded in the latent space. This challenge is central to the emerging field of population-based structural health monitoring (PBSHM) <xref ref-type="bibr" rid="bib1.bibx10" id="paren.27"/>, where the objective is to transfer knowledge across a fleet of nominally identical structures while accounting for their inherent variability. In this context, the encoder–decoder can be interpreted as learning a population <italic>form</italic> (a unified functional representation that captures the essential operational dynamics shared across turbines while tolerating structured variability between them). Such a form provides a common reference against which future measurements can be assessed, enabling consistent operational inference across the fleet. One prominent solution for learning such a unified functional representation is domain-adversarial learning, which explicitly enforces invariance to domain differences. The domain-adversarial neural network (DANN) <xref ref-type="bibr" rid="bib1.bibx1" id="paren.28"/> extends the adversarial training paradigm of generative adversarial networks (GANs) to representation learning by coupling the main task with a domain classifier connected through a gradient reversal layer. This forces the encoder to produce embeddings that are expressive for the main task while remaining indistinguishable across domains (i.e., different turbines). 
Building on this principle, recent studies have demonstrated the versatility of DANN in vibration-based monitoring: <xref ref-type="bibr" rid="bib1.bibx35" id="text.29"/> achieved improved transfer performance in bearing fault diagnosis under variable working conditions with a structured DANN, <xref ref-type="bibr" rid="bib1.bibx33" id="text.30"/> proposed a partial conditional adversarial network to transfer damage knowledge from numerical models to full-scale structures, and <xref ref-type="bibr" rid="bib1.bibx32" id="text.31"/> applied DANN to bridge monitoring by aligning finite-element simulations with field data. Similarly, <xref ref-type="bibr" rid="bib1.bibx36" id="text.32"/> fused domain adaptation with feature engineering to classify unseen damage states in shake-table tests of real buildings. Collectively, these applications underscore the potential of adversarial domain adaptation for mitigating domain shifts in SHM tasks. However, its application to operational state inference in wind turbines – where turbine-specific biases are particularly pronounced – remains unexplored. Beyond adversarial approaches such as DANN, PBSHM has also explored alternative alignment strategies such as balanced distribution adaptation (BDA) <xref ref-type="bibr" rid="bib1.bibx22" id="paren.33"/>, although these methods are typically applied to the transfer of diagnostic knowledge, whereas our focus is solely on learning domain-invariant embeddings without transferring damage labels.</p>
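<p>To make the gradient reversal layer concrete, the following minimal numpy sketch shows its two passes. The weight λ and the array values are illustrative only; in an actual DANN this layer sits between the encoder and the domain classifier inside an automatic-differentiation framework.</p>

```python
import numpy as np

GRL_LAMBDA = 0.3  # adversarial weight lambda (illustrative value)

def grl_forward(z):
    # Forward pass: identity -- the domain classifier sees the embedding as-is.
    return z

def grl_backward(grad_from_domain_head):
    # Backward pass: flip and scale the domain-classifier gradient, so the
    # encoder is updated to make turbines *indistinguishable* rather than
    # easier to tell apart.
    return -GRL_LAMBDA * grad_from_domain_head

z = np.array([0.5, -1.2, 0.7])   # toy embedding
g = np.array([0.1, 0.4, -0.2])   # toy gradient from the domain classifier
z_out = grl_forward(z)
g_out = grl_backward(g)
```

Because the forward pass is the identity, the layer has no effect at inference time; only the training signal reaching the encoder is reversed.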
      <p id="d2e262">Complementing the DANN regularization, we incorporate deep embedded clustering (DEC) <xref ref-type="bibr" rid="bib1.bibx48" id="paren.34"/>, a self-supervised framework that jointly learns feature representations and cluster assignments, thereby structuring the latent space into compact and interpretable regions and facilitating post hoc analysis of the learned embedding. DEC has proven to be effective in other domains – for instance, convolutional auto-encoders coupled with DEC have been used to separate vibroseismic, highway traffic, and airport noise sources <xref ref-type="bibr" rid="bib1.bibx42" id="paren.35"/>. To the best of the authors' knowledge, DEC and its derivatives have seen limited application in SHM and wind turbine monitoring.</p>
      <p id="d2e271">Together, DANN and DEC act as complementary inductive priors on the latent space: DANN enforces turbine-invariant representations, while DEC promotes clusterability and interpretability aligned with physical operating regimes.</p>
      <p id="d2e274">Motivated by these developments, we ask the following: can wind turbine operational state be inferred directly from high-frequency acceleration without relying on SCADA during training? We investigate this question on a 44-turbine offshore wind farm, using acceleration sampled at 31.25 Hz. Our approach learns compact six-dimensional latent embeddings from 1 min spectrograms via a domain-adversarial auto-encoder that enforces turbine invariance while preserving operational structure, while DEC is used to improve the interpretability and clusterability of the latent space.</p>
      <p id="d2e277"><bold>Contributions.</bold> This work advances wind turbine monitoring by (i) introducing an <italic>acceleration-only operational-state inference</italic> framework that learns compact latent representations directly from vibration spectrograms; (ii) achieving <italic>cross-turbine generalization</italic> through domain-adversarial training, enabling fleet-wide deployment without per-turbine retraining; (iii) integrating <italic>deep embedded clustering (DEC)</italic> within the auto-encoder to jointly learn turbine-invariant and discretized latent spaces, yielding interpretable representations aligned with distinct operational regimes; and (iv) demonstrating <italic>practical utility</italic> through damage-equivalent moment estimation, illustrating how the learned embeddings support structural health monitoring and fatigue assessment.</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Materials and methods</title>
      <p id="d2e302">This section is organized as follows. First, the offshore wind farm dataset and its instrumentation are described to establish the sensing basis of the study. Next, the preprocessing applied to the raw acceleration data is outlined. The representation-learning framework is then introduced: vibration spectra are encoded through a denoising auto-encoder whose latent space is jointly structured and discretized through deep embedded clustering (DEC), while turbine-specific effects are suppressed via domain-adversarial regularization. This integrated architecture produces turbine-invariant, clusterable embeddings that correspond to distinct operational regimes. An auxiliary procedure for estimating 10 min damage-equivalent moments (DEMs) from sequences of embeddings is also presented. Finally, the evaluation protocol is detailed, employing information-theoretic metrics to assess turbine invariance and operational informativeness.</p>
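<p>For the information-theoretic evaluation mentioned above, normalized mutual information (NMI) between two discrete label sequences can be computed directly from their joint histogram. The sketch below assumes the square-root normalization; other variants (e.g., the mean of the two entropies) are also common.</p>

```python
import numpy as np

def nmi(x, y):
    """Normalized mutual information between two discrete label arrays."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)      # joint histogram of (x, y)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    mi = (pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])).sum()
    hx = -(px[px > 0] * np.log(px[px > 0])).sum()
    hy = -(py[py > 0] * np.log(py[py > 0])).sum()
    return mi / np.sqrt(hx * hy)             # sqrt normalization
```

Continuous quantities such as rotor speed or wind speed must be discretized (binned) before this estimator applies.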
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Site instrumentation and operational variability</title>
      <p id="d2e312">The study is based on operational data collected from an offshore wind farm comprising 44 monopile-supported turbines that are broadly similar in terms of structural dynamics. As noted by <xref ref-type="bibr" rid="bib1.bibx10" id="text.36"/>, such a fleet can be treated as a homogeneous population, though minor variability in resonance frequencies arises from differences in seabed depth, fabrication, and installation tolerances. The layout and sensing configuration are shown in Fig. <xref ref-type="fig" rid="F1"/>. All turbines are equipped with dedicated nacelle-mounted accelerometers that provide the high-frequency vibration data used in this study. Each nacelle unit contains <inline-formula><mml:math id="M3" display="inline"><mml:mrow><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula> channels (fore–aft, side–side, and vertical directions) sampled at <inline-formula><mml:math id="M4" display="inline"><mml:mn mathvariant="normal">31.25</mml:mn></mml:math></inline-formula> Hz. SCADA signals, by contrast, are recorded by the turbine control system at a low frequency of <inline-formula><mml:math id="M5" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>/</mml:mo><mml:mn mathvariant="normal">600</mml:mn></mml:mrow></mml:math></inline-formula> Hz (10 min averages) and are used solely for evaluation and interpretation. Strain gauges installed near the tower–transition piece interface on five “fleet leader” turbines provide fatigue reference data; because such instrumentation is costly, only this limited subset of turbines is equipped, as is common in offshore monitoring <xref ref-type="bibr" rid="bib1.bibx47" id="paren.37"/>. Farm-wide fatigue is typically extrapolated from these leaders using SCADA-based models <xref ref-type="bibr" rid="bib1.bibx17" id="paren.38"/>. In this study, the strain-gauge measurements will be utilized only in Sect. 
<xref ref-type="sec" rid="Ch1.S3.SS6"/> as the source of ground truth for 10 min damage-equivalent moments (DEMs).</p>
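<p>For reference, once rainflow cycle counts are available from the strain records, the 10 min DEM reduces to a single power mean. A hedged numpy sketch follows; the S–N slope <italic>m</italic> and the reference cycle count <italic>n</italic><sub>ref</sub> below are illustrative values, and the rainflow counting step itself is omitted.</p>

```python
import numpy as np

def dem_from_cycles(moment_ranges, cycle_counts, m=4.0, n_ref=600.0):
    """Damage-equivalent moment: the constant-amplitude moment range that,
    applied n_ref times, produces the same Miner-rule damage as the
    observed rainflow cycles (S-N slope m; both values illustrative)."""
    r = np.asarray(moment_ranges, dtype=float)
    n = np.asarray(cycle_counts, dtype=float)
    return (np.sum(n * r ** m) / n_ref) ** (1.0 / m)
```

As a sanity check, a single range repeated exactly n_ref times returns that range unchanged.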

      <fig id="F1" specific-use="star"><label>Figure 1</label><caption><p id="d2e362">Schematic of the offshore wind farm and sensing layout: nacelle accelerometers (blue) provide high-frequency vibration data used for learning operational embeddings, and SCADA signals (green) provide supervisory and control measurements used only for evaluation and interpretation. Tower/monopile strain gauges (orange) are installed on a small subset of turbines – so-called fleet leaders.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f01.png"/>

        </fig>

</sec>
<sec id="Ch1.S2.SSx1" specific-use="unnumbered">
  <title>Operational variability</title>
      <p id="d2e377">Wind turbine operation is traditionally classified from SCADA data using rule-based thresholds applied to variables such as rotor speed, blade pitch, power output, wind speed, and occasionally yaw. Typical operational states include the following: <list list-type="bullet"><list-item>
      <p id="d2e382">parked and rotor lock – rotor stopped (locked), no power production;</p></list-item><list-item>
      <p id="d2e386">ramp-down and ramp-up – controlled deceleration or acceleration of the rotor speed;</p></list-item><list-item>
      <p id="d2e390">idling and spinning – low rotor speed with negligible power;</p></list-item><list-item>
      <p id="d2e394">sub-rated generation – below-rated operation with increasing power and rotor speed;</p></list-item><list-item>
      <p id="d2e398">near or rated generation – high power production close to rated conditions;</p></list-item><list-item>
      <p id="d2e402">curtailed or derated – power limited by control actions or high-wind derating;</p></list-item><list-item>
      <p id="d2e406">high-wind storm control – reduced power with large pitch angles to limit loads;</p></list-item><list-item>
      <p id="d2e410">emergency stop or trip – abrupt shutdown due to protection triggers.</p></list-item></list></p>
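<p>A toy Python sketch of such rule-based thresholding is given below. Every variable name and limit is hypothetical; production schemes are manufacturer-specific, use more signals, and distinguish more states than shown here.</p>

```python
def scada_state(rotor_rpm, pitch_deg, power_kw, wind_ms,
                rated_kw=9500.0, idle_rpm=1.0, rated_frac=0.95):
    """Toy threshold classifier; all limits here are illustrative only."""
    if power_kw <= 0.0:
        # No production: rotor speed separates parked/locked from idling.
        return "parked" if rotor_rpm < idle_rpm else "idling"
    if pitch_deg > 20.0 and wind_ms > 25.0:
        return "storm control"        # high-wind derating via pitch
    if power_kw >= rated_frac * rated_kw:
        return "rated"
    return "sub-rated"
```

Even this simplified sketch shows why a single missing signal (e.g., pitch) can invalidate the classification, and why every boundary requires an expert-chosen threshold.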
      <p id="d2e413">Such SCADA-based classification requires expert-defined thresholds; for example, distinguishing parked from idling often involves checking both rotor speed and wind speed against predefined limits. These schemes also assume stationarity, i.e., that conditions remain constant over the 10 min window. While often reasonable, this assumption hides short-term dynamics such as rotor stops and restarts. Figure <xref ref-type="fig" rid="F2"/> illustrates this point. The spectrogram of nacelle acceleration, obtained with 65 s windows and 30 s overlap, reveals clear differences between idling and stops, as well as short-lived transitions that would not be visible in SCADA records. Restricting the spectrum to the 0–3 Hz band focuses on the dominant rotor dynamics. These patterns indicate that the 10 min stationarity assumption does not always hold.</p>
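<p>A band-limited spectrogram of this kind can be sketched with scipy. The window parameters below only approximate the 65 s / 30 s settings, since 65 s does not correspond to an integer number of samples at 31.25 Hz; the power-of-two window length is an assumption for illustration.</p>

```python
import numpy as np
from scipy import signal

FS = 31.25  # accelerometer sampling rate (Hz)

def rotor_band_spectrogram(acc, f_max=3.0):
    """Log-amplitude spectrogram restricted to the 0-3 Hz rotor band."""
    f, t, Sxx = signal.spectrogram(
        acc, fs=FS,
        nperseg=2048,            # ~65.5 s window (illustrative power of two)
        noverlap=int(30 * FS),   # ~30 s overlap
    )
    keep = f <= f_max            # discard everything above the rotor band
    return f[keep], t, np.log10(Sxx[keep] + 1e-12)

rng = np.random.default_rng(0)
acc = rng.standard_normal(int(600 * FS))  # 10 min of synthetic noise
f, t, S = rotor_band_spectrogram(acc)
```

Discarding the bins above 3 Hz both focuses the representation on rotor dynamics and sharply reduces the input dimensionality handed to the auto-encoder.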

      <fig id="F2"><label>Figure 2</label><caption><p id="d2e420">Log amplitude spectrogram (0–3 Hz) of turbine acceleration with state sequence inferred from vibrations.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f02.png"/>

        </fig>

      <p id="d2e429">The acceleration-based approach developed here addresses these shortcomings. By operating directly on high-frequency vibration signals, it enables inference of operational states and transient events at sub-10 min resolution without the need for threshold specification. This enables finer temporal resolution of state estimation and allows for event counting, complementing rather than replacing SCADA. In this study, SCADA signals are used solely for interpretation and validation of the acceleration-derived representations and not for training or direct state inference.</p>
</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Representation-learning model</title>
      <p id="d2e440">The objective of this work is to derive compact, expressive, and turbine-invariant descriptors of the acceleration signals that capture operational variability across the fleet. Such descriptors are generally referred to as <italic>representations</italic>, and, when expressed as numerical vectors produced by a neural network, they are referred to as <italic>embeddings</italic>. An embedding can be understood as a vectorized representation of a signal – a compressed summary of an input window that preserves the essential dynamical information while discarding redundancies. Conceptually, embeddings play a similar role to manually engineered statistical features (e.g., minimum, maximum, variance) but are learned automatically by the network in a data-driven manner.</p>
      <p id="d2e449">In the resulting embedding space, signals recorded under similar operational and environmental conditions are expected to map close together, while signals reflecting different dynamics should be located further apart. The structure of this space should yield well-separated clusters, whereas subtler variations (e.g., between adjacent load levels) should appear to be closer. To ensure that the embeddings remain physically meaningful, they are expected to exhibit strong mutual information with key supervisory variables such as rotor speed, wind speed, and blade pitch angle, the latter being particularly important as it directly defines the turbine’s control state.</p>
      <p id="d2e452">The dataset is composed of accelerometer measurements recorded in multiple directions (e.g., fore–aft, side–side, vertical). These signals can be ingested by the model in several ways: (i) a separate model may be trained for each direction, (ii) a shared architecture may be used while fitting independent model instances per direction, or (iii) a multi-channel architecture may be adopted in which all directions are processed jointly.</p>
      <p id="d2e455">In this work, the third strategy is adopted, with each direction being treated as an input channel, analogously to the color channels in image processing. The detailed multi-channel architecture is provided in Sect. <xref ref-type="sec" rid="Ch1.S2.SS5"/>. For clarity, the preprocessing pipeline is first described in the uni-variate (single-channel) case, and its extension to the three-channel setting is trivial.</p>
<sec id="Ch1.S2.SS2.SSS1">
  <label>2.2.1</label><title>Preprocessing of acceleration data</title>
      <p id="d2e468">Acceleration records are segmented into 1 min windows with a 30 s hop size, corresponding to a 50 % overlap. This duration is sufficient to capture the dominant low-frequency turbine dynamics while remaining short enough to assume approximate stationarity of the signal. Formally, let the raw acceleration signal be

              <disp-formula id="Ch1.E1" content-type="numbered"><label>1</label><mml:math id="M6" display="block"><mml:mrow><mml:mi mathvariant="bold">a</mml:mi><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi>T</mml:mi></mml:msub><mml:mo>]</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            from which overlapping windows of length <inline-formula><mml:math id="M7" display="inline"><mml:mi>L</mml:mi></mml:math></inline-formula> and hop size <inline-formula><mml:math id="M8" display="inline"><mml:mi>H</mml:mi></mml:math></inline-formula> are extracted. The <inline-formula><mml:math id="M9" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th window is denoted by

              <disp-formula id="Ch1.E2" content-type="numbered"><label>2</label><mml:math id="M10" display="block"><mml:mrow><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mo>[</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mi>L</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>]</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mi>H</mml:mi><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
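As an illustrative sketch (not part of the original study), the window extraction of Eq. (2) can be written in NumPy. The function name `segment_windows` and the synthetic 10 min signal are our own choices; the window length and hop follow the stated 1 min / 30 s values at the sampling rate given later in the text.

```python
import numpy as np

def segment_windows(a: np.ndarray, L: int, H: int) -> np.ndarray:
    """Extract overlapping windows a^(i) = a[t_i : t_i + L] with t_i = i * H (0-based)."""
    n_windows = 1 + (len(a) - L) // H
    return np.stack([a[i * H : i * H + L] for i in range(n_windows)])

# 1 min windows with a 30 s hop (50 % overlap) at f_s = 31.25 Hz:
fs = 31.25
L = int(60 * fs)   # 1875 samples per window
H = int(30 * fs)   # 937-sample hop
a = np.random.default_rng(0).standard_normal(int(10 * 60 * fs))  # 10 min of synthetic data
windows = segment_windows(a, L, H)
```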
      <p id="d2e613">To prepare the time series data for neural network input, each window of acceleration measurements is transformed into the frequency domain to capture dominant operational dynamics. A Hann window <inline-formula><mml:math id="M11" display="inline"><mml:mi mathvariant="bold">w</mml:mi></mml:math></inline-formula> is applied to reduce spectral leakage, followed by a fast Fourier transform (FFT) <xref ref-type="bibr" rid="bib1.bibx15" id="paren.39"/>. The log amplitude spectrum is then computed and truncated to the 0–3 Hz band, which covers the range of interest for tower and rotor dynamics. Only the magnitude is retained as phase information is typically less informative in this context. The transformation is defined in Eq. (<xref ref-type="disp-formula" rid="Ch1.E3"/>):

              <disp-formula id="Ch1.E3" content-type="numbered"><label>3</label><mml:math id="M12" display="block"><mml:mrow><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mi>log⁡</mml:mi><mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:mfenced close="|" open="|"><mml:mrow><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">FFT</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold">w</mml:mi><mml:mo>⊙</mml:mo><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:mfenced><mml:mspace width="0.125em" linebreak="nobreak"/></mml:mrow></mml:mfenced><mml:mo>+</mml:mo><mml:mi mathvariant="italic">ε</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">3</mml:mn><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi mathvariant="normal">Hz</mml:mi><mml:mo>]</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M13" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> denotes the <inline-formula><mml:math id="M14" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula>th signal window, <inline-formula><mml:math id="M15" display="inline"><mml:mi mathvariant="bold">w</mml:mi></mml:math></inline-formula> is the Hann window, and <inline-formula><mml:math id="M16" display="inline"><mml:mi mathvariant="italic">ε</mml:mi></mml:math></inline-formula> is a small constant ensuring numerical stability of the logarithm. With a sampling rate of <inline-formula><mml:math id="M17" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">s</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">31.25</mml:mn></mml:mrow></mml:math></inline-formula> Hz and a window length of <inline-formula><mml:math id="M18" display="inline"><mml:mrow><mml:mi>N</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">2048</mml:mn></mml:mrow></mml:math></inline-formula>, this procedure yields approximately <inline-formula><mml:math id="M19" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">200</mml:mn></mml:mrow></mml:math></inline-formula> frequency bins per channel.</p>
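The transformation of Eq. (3) can be sketched in NumPy as follows. This is an illustrative implementation only: the function name `log_amplitude_spectrum` and the value of <inline-formula><mml:math display="inline"><mml:mi mathvariant="italic">ε</mml:mi></mml:math></inline-formula> are our assumptions, and a 2048-sample input is assumed as stated for the FFT length.

```python
import numpy as np

def log_amplitude_spectrum(window: np.ndarray, fs: float = 31.25,
                           f_max: float = 3.0, eps: float = 1e-8) -> np.ndarray:
    """Phi(a^(i)): Hann-windowed FFT log-amplitude, truncated to the [0, f_max] Hz band."""
    w = np.hanning(len(window))                      # Hann window reduces spectral leakage
    spectrum = np.abs(np.fft.rfft(w * window))       # magnitude only; phase is discarded
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    return np.log(spectrum[freqs <= f_max] + eps)    # eps guards the logarithm

x = log_amplitude_spectrum(np.random.default_rng(1).standard_normal(2048))
```

With N = 2048 and f_s = 31.25 Hz the bin spacing is f_s / N ≈ 0.0153 Hz, so the 0–3 Hz band contains 197 bins, consistent with F ≈ 200.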
      <p id="d2e777">Before being fed into the neural network, the spectra <inline-formula><mml:math id="M20" display="inline"><mml:mrow><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> are scaled using min–max normalization. To avoid distortion by outliers, scaling is based on the 0.1th and 99.9th percentiles of the training distribution, computed element-wise across frequency bins. Denoting these percentiles by <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mn mathvariant="normal">0.1</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mn mathvariant="normal">99.9</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>, the normalized input is

              <disp-formula id="Ch1.E4" content-type="numbered"><label>4</label><mml:math id="M23" display="block"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>=</mml:mo><mml:mover accent="true"><mml:mi mathvariant="normal">Φ</mml:mi><mml:mo stretchy="true" mathvariant="normal">̃</mml:mo></mml:mover><mml:mfenced close=")" open="("><mml:mrow><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:mfenced><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">Φ</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:msup><mml:mi mathvariant="bold">a</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:mfenced><mml:mo>-</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn mathvariant="normal">0.1</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mn mathvariant="normal">99.9</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn mathvariant="normal">0.1</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            which maps the bulk of the data approximately into the <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula> interval while preserving contrast in the presence of occasional extreme values.</p>
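A minimal NumPy sketch of the percentile-based scaling of Eq. (4) follows; the function names and the toy training matrix are illustrative, not from the paper's dataset.

```python
import numpy as np

def fit_minmax_percentile(train_spectra, lo=0.1, hi=99.9):
    """Robust per-frequency-bin bounds from the training distribution."""
    q_lo = np.percentile(train_spectra, lo, axis=0)
    q_hi = np.percentile(train_spectra, hi, axis=0)
    return q_lo, q_hi

def normalize(spectrum, q_lo, q_hi):
    """Eq. (4): map the bulk of the data into [0, 1]; rare outliers fall outside."""
    return (spectrum - q_lo) / (q_hi - q_lo)

rng = np.random.default_rng(2)
train = rng.normal(size=(5000, 197))        # toy stand-in for training spectra
q_lo, q_hi = fit_minmax_percentile(train)
x = normalize(train[0], q_lo, q_hi)
```

Because the bounds are the 0.1th and 99.9th percentiles, roughly 99.8 % of training values land inside [0, 1], while extreme values are simply mapped slightly outside rather than compressing the useful dynamic range.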
</sec>
<sec id="Ch1.S2.SS2.SSS2">
  <label>2.2.2</label><title>Auto-encoder learning and domain-adversarial training</title>
      <p id="d2e916">We assume that each high-dimensional spectrum <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>M</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> (hundreds of frequency coefficients) is governed by a much smaller set of latent factors <inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>d</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:mi>d</mml:mi><mml:mo>≪</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:math></inline-formula>. While vibration spectra may appear to be complex, their variability is largely explained by a handful of physical drivers such as turbine load, control settings, and environmental conditions. For instance, increasing load raises the overall vibration energy, while rotor speed introduces harmonics at multiples of the blade-passing frequency (3 p, 6 p, etc.). Our objective is therefore to learn a mapping,

              <disp-formula id="Ch1.Ex1"><mml:math id="M28" display="block"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">enc</mml:mi></mml:msub><mml:mo>:</mml:mo><mml:mi mathvariant="bold-italic">x</mml:mi><mml:mo>↦</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            such that <inline-formula><mml:math id="M29" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula> captures the salient operational patterns in a compact form.</p>
</sec>
<sec id="Ch1.S2.SS2.SSS3">
  <label>2.2.3</label><title>Auto-encoder formulation</title>
      <p id="d2e997">Auto-encoders provide a natural framework for this task. A standard auto-encoder consists of an encoder <inline-formula><mml:math id="M30" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">enc</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, which compresses an input spectrum into a latent embedding <inline-formula><mml:math id="M31" display="inline"><mml:mi mathvariant="bold-italic">z</mml:mi></mml:math></inline-formula>, and a decoder, <inline-formula><mml:math id="M32" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">dec</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, which attempts to reconstruct the original signal:

              <disp-formula id="Ch1.E5" content-type="numbered"><label>5</label><mml:math id="M33" display="block"><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">enc</mml:mi></mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>;</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">enc</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mover accent="true"><mml:mi mathvariant="bold">x</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">dec</mml:mi></mml:msub><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>;</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">dec</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>

            Here, <inline-formula><mml:math id="M34" display="inline"><mml:mi mathvariant="bold">x</mml:mi></mml:math></inline-formula> denotes the input spectrum; <inline-formula><mml:math id="M36" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold">x</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover></mml:math></inline-formula> is its reconstruction; and <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">enc</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M38" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">dec</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the trainable parameters (weights and biases) of the encoder and decoder, respectively. The latent vector <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>d</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> (with <inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:mi>d</mml:mi><mml:mo>≪</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:math></inline-formula> when <inline-formula><mml:math id="M41" display="inline"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>M</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>) provides the compact embedding used in downstream analysis. The auto-encoder is trained by minimizing the mean squared error (MSE) between the input and its reconstruction,

              <disp-formula id="Ch1.E6" content-type="numbered"><label>6</label><mml:math id="M42" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">AE</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mo>|</mml:mo><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="bold">x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi mathvariant="bold">x</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>|</mml:mo><mml:msubsup><mml:mo>|</mml:mo><mml:mn mathvariant="normal">2</mml:mn><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
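For concreteness, the encoder–decoder pair of Eq. (5) and the MSE loss of Eq. (6) can be sketched in NumPy. This is a toy stand-in only: a single tanh layer with random, untrained weights replaces the actual multi-channel architecture described later, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
M, d = 197, 8                         # spectrum size M and latent dimension d << M

# One-layer encoder/decoder parameters (playing the role of theta_enc, theta_dec)
W_enc = rng.normal(scale=0.05, size=(M, d)); b_enc = np.zeros(d)
W_dec = rng.normal(scale=0.05, size=(d, M)); b_dec = np.zeros(M)

def f_enc(x):                         # z = f_enc(x; theta_enc)
    return np.tanh(x @ W_enc + b_enc)

def f_dec(z):                         # x_hat = f_dec(z; theta_dec)
    return z @ W_dec + b_dec

def mse_loss(X, X_hat):               # Eq. (6): mean squared reconstruction error
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))

X = rng.normal(size=(32, M))          # a mini-batch of normalized spectra
Z = f_enc(X)                          # compact embeddings used downstream
loss = mse_loss(X, f_dec(Z))
```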
</sec>
<sec id="Ch1.S2.SS2.SSS4">
  <label>2.2.4</label><title>Denoising criterion</title>
      <p id="d2e1246">To improve robustness, we adopt the <italic>denoising auto-encoder</italic> <xref ref-type="bibr" rid="bib1.bibx45" id="paren.40"/>, in which inputs are corrupted by additive Gaussian noise,

              <disp-formula id="Ch1.E7" content-type="numbered"><label>7</label><mml:math id="M43" display="block"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">x</mml:mi><mml:mo stretchy="false" mathvariant="normal">̃</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi mathvariant="bold">x</mml:mi><mml:mo>+</mml:mo><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="italic">ϵ</mml:mi><mml:mo>∼</mml:mo><mml:mi mathvariant="script">N</mml:mi><mml:mfenced close=")" open="("><mml:mrow><mml:mn mathvariant="normal">0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mi mathvariant="italic">σ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mi>I</mml:mi></mml:mrow></mml:mfenced><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>

            Here, the corruption <inline-formula><mml:math id="M44" display="inline"><mml:mi mathvariant="italic">ϵ</mml:mi></mml:math></inline-formula> represents synthetic perturbations, and its scale <inline-formula><mml:math id="M45" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> controls their strength. Choosing <inline-formula><mml:math id="M46" display="inline"><mml:mi mathvariant="italic">σ</mml:mi></mml:math></inline-formula> on the order of natural measurement noise encourages the model to focus on the meaningful structure of the spectra while ignoring irrelevant fluctuations. The encoder receives <inline-formula><mml:math id="M47" display="inline"><mml:mover accent="true"><mml:mi mathvariant="bold">x</mml:mi><mml:mo mathvariant="normal" stretchy="false">̃</mml:mo></mml:mover></mml:math></inline-formula>, while the decoder is trained to recover the clean <inline-formula><mml:math id="M48" display="inline"><mml:mi mathvariant="bold">x</mml:mi></mml:math></inline-formula>.</p>
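The corruption step of Eq. (7) amounts to a single line; the sketch below (with an illustrative noise scale, not the value used in the study) emphasizes that the encoder sees the corrupted input while the reconstruction target remains the clean spectrum.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 0.05                          # noise scale, illustrative; order of measurement noise

def corrupt(x, sigma, rng):
    """x_tilde = x + eps, eps ~ N(0, sigma^2 I)."""
    return x + rng.normal(scale=sigma, size=x.shape)

x = rng.normal(size=197)              # clean normalized spectrum
x_tilde = corrupt(x, sigma, rng)
# Training pair: encoder input is x_tilde, but the decoder's target is the clean x.
```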
      <p id="d2e1336">This inductive bias can be interpreted as a restoring mechanism: when noise perturbs the spectrum away from regions of physically plausible turbine data, the model learns to pull it back. In the small-noise limit, the reconstruction function approximates the <italic>score function</italic> <inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">∇</mml:mi><mml:mi mathvariant="bold">x</mml:mi></mml:msub><mml:mi>log⁡</mml:mi><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold">x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> <xref ref-type="bibr" rid="bib1.bibx45" id="paren.41"/>, which always points in the direction where the likelihood of real data increases most steeply. Estimating this score is important because it provides the model with a way to distinguish between meaningful operational patterns and incidental deviations. In practice, the network learns to suppress sensor noise or spurious fluctuations while retaining the stable vibration signatures that reflect turbine dynamics.</p>
</sec>
<sec id="Ch1.S2.SS2.SSS5">
  <label>2.2.5</label><title>Domain-adversarial regularization</title>
      <p id="d2e1374">While the denoising criterion ensures robustness, embeddings can still encode turbine-specific signatures (e.g., resonance frequencies or sensor placement). Such features would hinder generalization to unseen turbines and complicate the interpretation of the embedding. To address this, we employ a <italic>domain-adversarial mechanism</italic> <xref ref-type="bibr" rid="bib1.bibx21" id="paren.42"/>, where the domain corresponds to turbine identity. This can be interpreted as a <italic>turbine-adversarial mechanism</italic>, whose objective is to remove turbine-specific information from the embeddings.</p>
      <p id="d2e1386">In practice, a domain classifier <inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is attached to the encoder through a gradient reversal layer (GRL). For each embedding <inline-formula><mml:math id="M51" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, the classifier – implemented as a small neural network ending with a softmax layer – predicts the turbine of origin:

              <disp-formula id="Ch1.E8" content-type="numbered"><label>8</label><mml:math id="M52" display="block"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>d</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub><mml:mfenced close=")" open="("><mml:mrow><mml:mi mathvariant="normal">GRL</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>;</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M53" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>d</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the predicted turbine label; <inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> are the classifier parameters; and <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mi mathvariant="normal">GRL</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> in the forward pass but reverses the gradient during backpropagation, <inline-formula><mml:math id="M56" display="inline"><mml:mrow><mml:mstyle displaystyle="false"><mml:mfrac style="text"><mml:mrow><mml:mo>∂</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">GRL</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>∂</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="italic">γ</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:math></inline-formula>.</p>
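The two-sided behavior of the GRL can be made explicit with a small conceptual sketch (outside any autograd framework, so the backward pass is written by hand); the function names and the value of <inline-formula><mml:math display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> are illustrative.

```python
import numpy as np

gamma = 1.0                            # reversal strength, often scheduled during training

def grl_forward(z):
    """Forward pass: GRL(z) = z (identity)."""
    return z

def grl_backward(grad_wrt_output, gamma=gamma):
    """Backward pass: the gradient reaching the encoder is scaled by -gamma."""
    return -gamma * grad_wrt_output

z = np.array([0.2, -1.3, 0.7])         # an embedding z_i
g = np.array([0.5, 0.1, -0.4])         # dL_dom / dGRL(z_i) from the domain classifier
enc_grad = grl_backward(g)             # what the encoder actually receives
```

Because the encoder receives the negated gradient, a parameter update that reduces the classifier's loss for the classifier simultaneously increases it through the encoder, which is exactly the adversarial interaction described next.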
      <p id="d2e1544">The domain loss is defined as the cross-entropy between predicted and true turbine labels:

              <disp-formula id="Ch1.E9" content-type="numbered"><label>9</label><mml:math id="M57" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mn mathvariant="normal">1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mstyle><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>K</mml:mi></mml:munderover><mml:mn mathvariant="bold">1</mml:mn><mml:mo>[</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>]</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>log⁡</mml:mi><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mfenced open="(" close=")"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfenced><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M58" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:msub><mml:mi mathvariant="italic">θ</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mi>k</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> is the predicted probability that embedding <inline-formula><mml:math id="M59" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> originates from turbine <inline-formula><mml:math id="M60" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>, <inline-formula><mml:math id="M61" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the true turbine identity, and <inline-formula><mml:math id="M62" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> is the mini-batch size. During optimization, the classifier parameters are updated to <italic>minimize</italic> this loss, while the encoder receives the reversed gradient and thus learns to <italic>maximize</italic> it – encouraging domain invariance. This adversarial interaction ensures that the latent embeddings remain informative of operational dynamics while discarding turbine-specific biases.</p>
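The domain loss of Eq. (9) is an ordinary softmax cross-entropy over turbine labels; a NumPy sketch follows, with an illustrative number of turbines and random logits standing in for the classifier outputs.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))   # shift for stability
    return e / e.sum(axis=1, keepdims=True)

def domain_loss(logits, d_true):
    """Eq. (9): mean cross-entropy between predicted and true turbine labels."""
    p = softmax(logits)
    N = len(d_true)
    return -np.mean(np.log(p[np.arange(N), d_true] + 1e-12))

rng = np.random.default_rng(5)
K = 4                                   # number of turbines (domains), illustrative
logits = rng.normal(size=(16, K))       # f_dom(GRL(z_i)) outputs before the softmax
d_true = rng.integers(0, K, size=16)    # true turbine identities d_i
loss = domain_loss(logits, d_true)
```

A useful sanity check is that an uninformative classifier (uniform logits) yields a loss of log K, the entropy of a uniform guess over K turbines.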
</sec>
<sec id="Ch1.S2.SS2.SSS6">
  <label>2.2.6</label><title>Deep embedded clustering (DEC)</title>
      <p id="d2e1726">While the denoising and adversarial objectives produce embeddings that are robust and turbine-invariant, the latent space remains continuous, making it difficult to interpret in terms of discrete operational modes. To reveal such regimes, we adopt the deep embedded clustering (DEC) formulation <xref ref-type="bibr" rid="bib1.bibx48" id="paren.43"/>, which jointly refines the encoder and a set of cluster centroids so that embeddings belonging to similar operating conditions are pulled closer together while those representing distinct dynamics are pushed apart.</p>
      <p id="d2e1732">The underlying idea is that the model should first form <italic>compact clusters</italic> – bringing together latent points that correspond to consistent vibration patterns – and then <italic>separate</italic> these clusters sufficiently to produce interpretable operational regimes. To achieve this balance, DEC avoids hard assignments (which can lead to unstable optimization) and instead relies on <italic>soft associations</italic> that gradually sharpen over time.</p>
      <p id="d2e1744">For each embedding <inline-formula><mml:math id="M63" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>d</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, its similarity to each cluster centroid <inline-formula><mml:math id="M64" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is measured using a Student's <inline-formula><mml:math id="M65" display="inline"><mml:mi>t</mml:mi></mml:math></inline-formula> kernel:

              <disp-formula id="Ch1.E10" content-type="numbered"><label>10</label><mml:math id="M66" display="block"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mo>‖</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:msup><mml:mo>‖</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>/</mml:mo><mml:mi mathvariant="italic">α</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:msub><mml:msup><mml:mfenced close=")" open="("><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>+</mml:mo><mml:mo>‖</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:msub><mml:msup><mml:mo>‖</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>/</mml:mo><mml:mi mathvariant="italic">α</mml:mi></mml:mrow></mml:mfenced><mml:mrow><mml:mo>-</mml:mo><mml:mo>(</mml:mo><mml:mi mathvariant="italic">α</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mo>/</mml:mo><mml:mn 
mathvariant="normal">2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> denotes the soft-assignment probability of sample <inline-formula><mml:math id="M68" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> to cluster <inline-formula><mml:math id="M69" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>. Following <xref ref-type="bibr" rid="bib1.bibx48" id="text.44"/>, <inline-formula><mml:math id="M70" display="inline"><mml:mi mathvariant="italic">α</mml:mi></mml:math></inline-formula> is set to 1 so that the kernel is heavy-tailed: nearby points contribute strongly to a centroid, while distant ones still exert a small but non-vanishing pull. Attracting not only the nearest points to a cluster center in this way stabilizes cluster formation and promotes smooth cluster boundaries.</p>
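The soft assignment of Eq. (10) can be sketched in NumPy; the embedding and centroid matrices below are random placeholders, and the function name is our own.

```python
import numpy as np

def soft_assign(Z, mu, alpha=1.0):
    """Eq. (10): Student's t kernel soft assignments q_ij (each row sums to 1)."""
    sq_dist = ((Z[:, None, :] - mu[None, :, :]) ** 2).sum(-1)       # ||z_i - mu_j||^2
    num = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    return num / num.sum(axis=1, keepdims=True)                     # normalize over j'

rng = np.random.default_rng(6)
Z = rng.normal(size=(10, 8))        # N = 10 embeddings of dimension d = 8
mu = rng.normal(size=(3, 8))        # K = 3 cluster centroids
Q = soft_assign(Z, mu)
```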
      <p id="d2e1951">To make clusters progressively more distinct, DEC defines a sharpened <italic>target distribution</italic>:

              <disp-formula id="Ch1.E11" content-type="numbered"><label>11</label><mml:math id="M71" display="block"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>/</mml:mo><mml:msub><mml:mo>∑</mml:mo><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mo>∑</mml:mo><mml:mrow><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:msub><mml:mo>(</mml:mo><mml:msubsup><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>/</mml:mo><mml:msub><mml:mo>∑</mml:mo><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:msup><mml:mi>j</mml:mi><mml:mo>′</mml:mo></mml:msup></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            which amplifies confident assignments (large <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>) and down-weights uncertain ones. Intuitively, <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> expresses how much a point currently belongs to a cluster, while <inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> represents where it <italic>should</italic> belong as training refines the latent structure. For instance, consider a sample located between two neighboring regimes: if its current soft assignments are <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.6</mml:mn></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M76" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula>, the target distribution will become <inline-formula><mml:math id="M77" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">0.69</mml:mn></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn 
mathvariant="normal">2</mml:mn></mml:mrow></mml:msub><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">0.31</mml:mn></mml:mrow></mml:math></inline-formula> after sharpening (assuming the two clusters have equal overall soft frequencies). This numerical shift increases the weight of the more confident cluster, gently pulling the sample toward centroid 1. As training proceeds, each embedding migrates toward its most representative cluster.</p>
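The sharpening step and the numerical example above can be reproduced with a short NumPy sketch. Note that the quoted values 0.69/0.31 hold when the two clusters have equal soft frequencies, which the sketch makes explicit; names are illustrative:

```python
import numpy as np

def target_distribution(q, f=None):
    """Sharpened DEC target p_ij = (q_ij^2 / f_j) / normalization (Eq. 11).
    f defaults to the soft cluster frequencies sum_i q_ij."""
    if f is None:
        f = q.sum(axis=0)
    w = q ** 2 / f
    return w / w.sum(axis=1, keepdims=True)

# the borderline sample from the text, with equal cluster frequencies
q = np.array([[0.6, 0.4]])
p = target_distribution(q, f=np.array([1.0, 1.0]))
# p ≈ [[0.692, 0.308]], i.e. 0.69 / 0.31 after rounding
```

Squaring amplifies the dominant assignment, and dividing by the cluster frequency discourages large clusters from absorbing every sample.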
      <p id="d2e2167">The clustering loss minimizes the Kullback–Leibler divergence between the two distributions:

              <disp-formula id="Ch1.E12" content-type="numbered"><label>12</label><mml:math id="M79" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">DEC</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mi>i</mml:mi></mml:munder><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mi>j</mml:mi></mml:munder><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mi>log⁡</mml:mi><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            thereby encouraging embeddings to move closer to their respective centroids. Each centroid acts as a gravitational attractor in the latent space, continuously pulling nearby embeddings toward a compact configuration and enhancing separation between clusters.</p>
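A minimal NumPy version of this clustering loss, written as a plug-in sketch rather than the authors' code, shows the two defining properties of the KL objective (non-negativity, and zero only when the soft assignments already match the target):

```python
import numpy as np

def dec_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over samples i and clusters j (Eq. 12);
    eps guards against log(0) for numerically zero assignments."""
    return float((p * np.log((p + eps) / (q + eps))).sum())

q = np.array([[0.6, 0.4],
              [0.9, 0.1]])
# sharpened target from the current assignments
p = q ** 2 / q.sum(axis=0)
p /= p.sum(axis=1, keepdims=True)

loss = dec_loss(p, q)   # strictly positive: q has not yet reached p
```

Gradient descent on this quantity with respect to the embeddings (and centroids) is what pulls each sample toward its dominant centroid.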
      <p id="d2e2225">In practice, DEC training proceeds in two stages. First, the encoder is pretrained solely to reconstruct the input, yielding a stable and physically meaningful representation. Then, the resulting embeddings are clustered using <inline-formula><mml:math id="M80" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means to initialize the centroids <inline-formula><mml:math id="M81" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">μ</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. In the second stage, the DEC objective is introduced and jointly optimized with the initial reconstruction task, gradually organizing the latent space into discrete, interpretable regions that correspond to turbine operating regimes.</p>
</sec>
<sec id="Ch1.S2.SS2.SSS7">
  <label>2.2.7</label><title>Combined objective and training schedule</title>
      <p id="d2e2255">The encoder–decoder is trained using a composite objective that balances reconstruction fidelity, turbine invariance, and cluster compactness:

              <disp-formula id="Ch1.E13" content-type="numbered"><label>13</label><mml:math id="M82" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>total</mml:mtext></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>rec</mml:mtext></mml:msub><mml:mo>+</mml:mo><mml:mi mathvariant="italic">λ</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>domain</mml:mtext></mml:msub><mml:mo>+</mml:mo><mml:mi mathvariant="italic">β</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>DEC</mml:mtext></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M83" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>rec</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is the mean-squared reconstruction error (Eq. <xref ref-type="disp-formula" rid="Ch1.E6"/>), <inline-formula><mml:math id="M84" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>domain</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is the domain-adversarial cross-entropy (Eq. <xref ref-type="disp-formula" rid="Ch1.E9"/>), and <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>DEC</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> is the deep embedded clustering regularizer (Eq. <xref ref-type="disp-formula" rid="Ch1.E12"/>). The weights <inline-formula><mml:math id="M86" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M87" display="inline"><mml:mi mathvariant="italic">β</mml:mi></mml:math></inline-formula> are epoch-dependent and are chosen heuristically based on the observed training dynamics rather than from a principled optimum.</p>
      <p id="d2e2353">Training follows a staged schedule. First, the auto-encoder is trained using <inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>rec</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> alone until epoch <inline-formula><mml:math id="M89" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">start</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dann</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, allowing the encoder–decoder pair to learn a stable and physically meaningful reconstruction manifold.</p>
      <p id="d2e2383">Second, the domain-adversarial objective is activated, and the weighting coefficient <inline-formula><mml:math id="M90" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula> is increased linearly from <inline-formula><mml:math id="M91" display="inline"><mml:mn mathvariant="normal">0</mml:mn></mml:math></inline-formula> to <inline-formula><mml:math id="M92" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> over the next <inline-formula><mml:math id="M93" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">duration</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dann</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> epochs. The gradient reversal mechanism forces the encoder to suppress turbine-specific information; in principle, the domain classification loss <inline-formula><mml:math id="M94" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>domain</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> should rise toward the random-guess baseline as turbine identity becomes unrecoverable from the embeddings.</p>
      <p id="d2e2438">Third, after centroid initialization by <inline-formula><mml:math id="M95" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means (with <inline-formula><mml:math id="M96" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">clusters</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula>), the clustering objective is introduced at epoch <inline-formula><mml:math id="M97" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">start</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dec</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M98" display="inline"><mml:mi mathvariant="italic">β</mml:mi></mml:math></inline-formula> is increased linearly from <inline-formula><mml:math id="M99" display="inline"><mml:mn mathvariant="normal">0</mml:mn></mml:math></inline-formula> to <inline-formula><mml:math id="M100" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> over <inline-formula><mml:math id="M101" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">duration</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dec</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> epochs. The activation of <inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mtext>DEC</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula> reshapes the latent space to promote compact and separable regimes, which can temporarily increase reconstruction error as the latent geometry reorganizes; the decoder subsequently adapts, and reconstruction error recovers.</p>
      <p id="d2e2533">This staged optimization avoids gradient interference between objectives that impose conflicting constraints on the latent space. Reconstruction first establishes a physically grounded representation. Domain-adversarial regularization then removes turbine-specific bias without collapsing this structure. Clustering is applied last to discretize an already stable embedding. In this order, each objective refines an existing representation rather than competing to define it, which improves training stability and preserves interpretability. The resulting training dynamics are discussed in Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/>.</p>
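The staged schedule can be summarized in a small helper; the parameter names mirror the symbols in the text, and the concrete epoch and weight values below are placeholders, since the paper leaves the exact settings to the observed training dynamics:

```python
def loss_weights(epoch, t_start_dann, t_dur_dann,
                 t_start_dec, t_dur_dec, lam_max, beta_max):
    """Epoch-dependent weights for L_total = L_rec + lam*L_domain + beta*L_DEC.
    Each weight is 0 before its start epoch, then ramps linearly to its
    maximum over the stated duration and stays there."""
    lam = min(max(epoch - t_start_dann, 0) / t_dur_dann, 1.0) * lam_max
    beta = min(max(epoch - t_start_dec, 0) / t_dur_dec, 1.0) * beta_max
    return lam, beta

# illustrative settings: DANN from epoch 10 over 20 epochs, DEC from epoch 50
lam, beta = loss_weights(epoch=100, t_start_dann=10, t_dur_dann=20,
                         t_start_dec=50, t_dur_dec=30,
                         lam_max=0.5, beta_max=0.1)
```

Early epochs thus optimize reconstruction alone, the adversarial term phases in next, and the clustering term is the last to reach full strength, matching the three-stage description above.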
</sec>
</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Operational regime identification from embeddings</title>
      <p id="d2e2548">After training, each embedding is associated with a set of soft-assignment probabilities <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> reflecting its similarity to the learned centroids <inline-formula><mml:math id="M104" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> (Eq. <xref ref-type="disp-formula" rid="Ch1.E10"/>). The most probable centroid is interpreted as the current operational regime.</p>
      <p id="d2e2578">As a result of the combined objective, the latent space remains compact, turbine-invariant, and discretized into regimes that are directly interpretable in terms of turbine operation (e.g., idling, sub-rated, rated, or curtailed states).</p>
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Temporal aggregation and damage-equivalent moment (DEM) inference</title>
      <p id="d2e2589">Although the encoder and clustering components operate on short, quasi-stationary spectral segments, fatigue-related quantities such as the 10 min damage-equivalent moment (DEM) depend on how operating conditions evolve over time. To capture these temporal dependencies, the sequence of latent embeddings produced by the encoder <inline-formula><mml:math id="M105" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> is processed by a recurrent model that integrates information across successive windows. In practice, a two-layer long short-term memory (LSTM) network aggregates the embeddings within each 10 min interval and outputs a compact hidden representation summarizing the latent trajectory of the turbine’s dynamic state. A <italic>linear regression head</italic> then maps this representation to the corresponding DEM value, trained under a mean-squared-error objective using reference strain gauge measurements from the fleet leader turbines. During this stage, the encoder parameters are frozen so that the recurrent model learns to interpret the latent dynamics rather than to modify their structure.</p>
      <p id="d2e2610">Here, the linear regression head refers to a single fully connected layer that takes as input the LSTM output representation (context vector). The LSTM and this linear layer are trained end-to-end for DEM prediction, while the encoder remains fixed.</p>
      <p id="d2e2613">This design introduces a clear hierarchy: the auto-encoder acts as a spatial compressor that distills high-dimensional vibration spectra into a compact, turbine-invariant representation; the recurrent module integrates these representations temporally; and the regression head translates the aggregated latent dynamics into a physically meaningful fatigue indicator. Conceptually, this mirrors the structure of <italic>world models</italic> proposed by  <xref ref-type="bibr" rid="bib1.bibx23" id="text.45"/>, in which a variational auto-encoder encodes raw observations, a recurrent model captures temporal evolution in latent space, and a lightweight head operates upon that representation. In a similar spirit, the present framework constructs a latent “world view” of turbine dynamics: one that encapsulates both the instantaneous and evolving behavior of the structure – thereby enabling fatigue estimation directly from vibration-derived embeddings without recourse to SCADA data.</p>
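The encoder–LSTM–regressor hierarchy can be sketched with a hand-rolled NumPy LSTM and random weights standing in for trained parameters. The dimensions are toy values (the text uses latent dimension <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn>6</mml:mn></mml:mrow></mml:math></inline-formula> and hidden size 64; the hidden size is reduced here for brevity), and every name is illustrative:

```python
import numpy as np

def lstm_layer(X, Wx, Wh, b):
    """Minimal LSTM forward pass; X: (T, d_in), gate order i, f, g, o."""
    T, h = X.shape[0], Wh.shape[0]
    hs, c, ht = np.zeros((T, h)), np.zeros(h), np.zeros(h)
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    for t in range(T):
        z = X[t] @ Wx + ht @ Wh + b          # all four gates, shape (4h,)
        i, f, g, o = np.split(z, 4)
        c = sig(f) * c + sig(i) * np.tanh(g)  # cell state update
        ht = sig(o) * np.tanh(c)              # hidden state
        hs[t] = ht
    return hs

rng = np.random.default_rng(1)
d, h, T = 6, 8, 20                       # latent dim, toy hidden size, windows
Z = rng.normal(size=(T, d))              # frozen-encoder embeddings (stand-in)
params = lambda din: (rng.normal(scale=0.1, size=(din, 4 * h)),
                      rng.normal(scale=0.1, size=(h, 4 * h)),
                      np.zeros(4 * h))
h1 = lstm_layer(Z, *params(d))           # first recurrent layer
h2 = lstm_layer(h1, *params(h))          # second recurrent layer
w_out, b_out = rng.normal(size=h), 0.0
dem_hat = float(h2[-1] @ w_out + b_out)  # linear head on the final context vector
```

Only `w_out`, `b_out`, and the LSTM weights would be trained against the strain gauge DEM targets; the embeddings `Z` come from the frozen encoder.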
</sec>
<sec id="Ch1.S2.SS5">
  <label>2.5</label><title>Implementation details: multi-branch MLP over spectra</title>
      <p id="d2e2630">Acceleration data are stored in 1 h files, each containing three directional components, hereafter referred to as channels. Corresponding SCADA and fatigue-related data are maintained in a database with a temporal resolution of 10 min. For model training, the acceleration signals are segmented and transformed into spectrograms. Each 1 min spectrogram window is represented as a tensor <inline-formula><mml:math id="M106" display="inline"><mml:mrow><mml:mi mathvariant="bold">x</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>B</mml:mi><mml:mo>×</mml:mo><mml:mi>C</mml:mi><mml:mo>×</mml:mo><mml:mi>F</mml:mi><mml:mo>×</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, with batch size <inline-formula><mml:math id="M107" display="inline"><mml:mi>B</mml:mi></mml:math></inline-formula>, channels <inline-formula><mml:math id="M108" display="inline"><mml:mrow><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:math></inline-formula> (fore–aft, side–side, vertical), frequency bins <inline-formula><mml:math id="M109" display="inline"><mml:mrow><mml:mi>F</mml:mi><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo>≈</mml:mo><mml:mspace linebreak="nobreak" width="-0.125em"/><mml:mn mathvariant="normal">200</mml:mn></mml:mrow></mml:math></inline-formula> covering 0–3 Hz, and <inline-formula><mml:math id="M110" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> time frames within the minute. The network comprises per-channel encoders, a latent fusion block, and per-channel decoders, with an LSTM head used only for DEM estimation.</p>
      <p id="d2e2701"><list list-type="custom">
            <list-item><label> </label>

      <p id="d2e2706"><italic>Per-channel encoders.</italic> For each channel <inline-formula><mml:math id="M111" display="inline"><mml:mrow><mml:mi>c</mml:mi><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>,</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, the slice <inline-formula><mml:math id="M112" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">x</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>c</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msup><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>B</mml:mi><mml:mo>×</mml:mo><mml:mi>F</mml:mi><mml:mo>×</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> is reshaped to <inline-formula><mml:math id="M113" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>B</mml:mi><mml:mi>T</mml:mi><mml:mo>,</mml:mo><mml:mi>F</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> and passed through a small multi-layer perceptron (MLP) <inline-formula><mml:math id="M114" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>:</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>F</mml:mi></mml:msup><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo>→</mml:mo><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> (three 128-unit layers with normalization and ReLU). Each timestamp is treated as an independent sample. 
We set <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">16</mml:mn></mml:mrow></mml:math></inline-formula>. The resulting per-frame latents <inline-formula><mml:math id="M116" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msubsup><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>c</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msubsup><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>T</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> are concatenated:

                  <disp-formula id="Ch1.Ex2"><mml:math id="M117" display="block"><mml:mrow><mml:msubsup><mml:mi mathvariant="bold">z</mml:mi><mml:mi>t</mml:mi><mml:mi mathvariant="normal">cat</mml:mi></mml:msubsup><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>=</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo mathsize="1.1em">[</mml:mo><mml:msubsup><mml:mi>z</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>;</mml:mo><mml:mi mathvariant="normal">…</mml:mi><mml:mo>;</mml:mo><mml:msubsup><mml:mi>z</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msubsup><mml:mo mathsize="1.1em">]</mml:mo><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>C</mml:mi><mml:msub><mml:mi>d</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
            </list-item>
            <list-item><label> </label>

      <p id="d2e2933"><italic>Fusion to shared embedding.</italic> A compact fusion MLP <inline-formula><mml:math id="M118" display="inline"><mml:mrow><mml:mi mathvariant="italic">ψ</mml:mi><mml:mo>:</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>C</mml:mi><mml:msub><mml:mi>d</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow></mml:msup><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo>→</mml:mo><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mn mathvariant="normal">128</mml:mn></mml:msup><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo>→</mml:mo><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>d</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> (linear–norm–ReLU–linear) maps <inline-formula><mml:math id="M119" display="inline"><mml:mrow><mml:msubsup><mml:mi>z</mml:mi><mml:mi>t</mml:mi><mml:mi mathvariant="normal">cat</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> to a shared latent <inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:msub><mml:mi>z</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>d</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. Stacking over time yields <inline-formula><mml:math id="M121" display="inline"><mml:mrow><mml:mi mathvariant="bold">Z</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>B</mml:mi><mml:mo>×</mml:mo><mml:mi>T</mml:mi><mml:mo>×</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, with <inline-formula><mml:math id="M122" display="inline"><mml:mrow><mml:mi>d</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:math></inline-formula> used throughout.</p>
            </list-item>
            <list-item><label> </label>

      <p id="d2e3047"><italic>Per-channel decoders.</italic> Each channel is reconstructed independently from the shared latent via <inline-formula><mml:math id="M123" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">δ</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>:</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>d</mml:mi></mml:msup><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo>→</mml:mo><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mn mathvariant="normal">128</mml:mn></mml:msup><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo>→</mml:mo><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mn mathvariant="normal">128</mml:mn></mml:msup><mml:mspace linebreak="nobreak" width="-0.125em"/><mml:mo>→</mml:mo><mml:mspace linebreak="nobreak" width="-0.125em"/><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mi>F</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>, producing <inline-formula><mml:math id="M124" display="inline"><mml:mrow><mml:mover accent="true"><mml:mi mathvariant="bold">x</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>B</mml:mi><mml:mo>×</mml:mo><mml:mi>C</mml:mi><mml:mo>×</mml:mo><mml:mi>F</mml:mi><mml:mo>×</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> after reshaping.</p>
            </list-item>
            <list-item><label> </label>

      <p id="d2e3132"><italic>DEM head (inference only).</italic> For fatigue estimation, the sequence <inline-formula><mml:math id="M125" display="inline"><mml:mi>Z</mml:mi></mml:math></inline-formula> (computed at a 30 s hop) is fed to a two-layer LSTM (hidden size <inline-formula><mml:math id="M126" display="inline"><mml:mrow><mml:mi>h</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">64</mml:mn></mml:mrow></mml:math></inline-formula>). The final context vector is mapped by a linear regressor to the 10 min DEM (Sect. <xref ref-type="sec" rid="Ch1.S3.SS6"/>). The encoder is kept fixed; only the LSTM regressor is trained for DEM.</p>
            </list-item>
          </list></p>
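The tensor shapes traced through this multi-branch architecture can be checked with a NumPy sketch. Single linear layers stand in for the three-layer MLPs described above, and the number of time frames <inline-formula><mml:math display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> is an arbitrary illustrative value; the remaining sizes follow the text:

```python
import numpy as np

rng = np.random.default_rng(2)
B, C, F, T, d_c, d = 4, 3, 200, 12, 16, 6    # batch, channels, freq bins, frames

relu = lambda a: np.maximum(a, 0.0)
dense = lambda din, dout: rng.normal(scale=0.05, size=(din, dout))

x = rng.normal(size=(B, C, F, T))            # spectrogram batch

# per-channel encoders phi_c: R^F -> R^{d_c}, each frame as a sample
W_enc = [dense(F, d_c) for _ in range(C)]
z_cat = np.concatenate(
    [relu(x[:, c].transpose(0, 2, 1).reshape(B * T, F) @ W_enc[c])
     for c in range(C)], axis=1)             # (B*T, C*d_c)

# fusion psi: R^{C d_c} -> R^128 -> R^d
W1, W2 = dense(C * d_c, 128), dense(128, d)
z = relu(z_cat @ W1) @ W2                    # shared latent, (B*T, d)

# per-channel decoders delta_c: R^d -> R^F, reshaped back to (B, C, F, T)
W_dec = [dense(d, F) for _ in range(C)]
x_hat = np.stack([(z @ W_dec[c]).reshape(B, T, F).transpose(0, 2, 1)
                  for c in range(C)], axis=1)
```

The round trip preserves the input layout, confirming that the bottleneck forces all three directional channels through the same six-dimensional latent per frame.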
</sec>
<sec id="Ch1.S2.SS6">
  <label>2.6</label><title>Optimization details</title>
      <p id="d2e3168">All network parameters were optimized using the Adam optimizer <xref ref-type="bibr" rid="bib1.bibx29" id="paren.46"/>, with an initial learning rate of <inline-formula><mml:math id="M127" display="inline"><mml:mrow><mml:mn mathvariant="normal">5</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">3</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. The learning rate was adapted using a <italic>ReduceLROnPlateau</italic> scheduler (reduction factor of <inline-formula><mml:math id="M128" display="inline"><mml:mn mathvariant="normal">0.2</mml:mn></mml:math></inline-formula>, patience of <inline-formula><mml:math id="M129" display="inline"><mml:mn mathvariant="normal">5</mml:mn></mml:math></inline-formula>, minimum learning rate of <inline-formula><mml:math id="M130" display="inline"><mml:mrow><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>), monitored on the basis of the validation reconstruction loss. A batch size of <inline-formula><mml:math id="M131" display="inline"><mml:mn mathvariant="normal">1024</mml:mn></mml:math></inline-formula> was used throughout. To ensure stable optimization, gradient clipping with a maximum <inline-formula><mml:math id="M132" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">ℓ</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula> norm of <inline-formula><mml:math id="M133" display="inline"><mml:mn mathvariant="normal">1.0</mml:mn></mml:math></inline-formula> and early stopping with a patience of <inline-formula><mml:math id="M134" display="inline"><mml:mn mathvariant="normal">50</mml:mn></mml:math></inline-formula> epochs were applied.</p>
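The plateau-based learning-rate policy can be mimicked in a few lines of plain Python; this is a simplified re-implementation of the behavior described above (comparable to PyTorch's <italic>ReduceLROnPlateau</italic>), not the authors' training code:

```python
class PlateauScheduler:
    """Multiply the learning rate by `factor` whenever the monitored
    validation loss has not improved for more than `patience` epochs,
    never dropping below `min_lr` (values from the text)."""
    def __init__(self, lr=5e-3, factor=0.2, patience=5, min_lr=1e-5):
        self.lr, self.factor = lr, factor
        self.patience, self.min_lr = patience, min_lr
        self.best, self.bad_epochs = float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:                   # improvement: reset counter
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:    # plateau detected
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr

sched = PlateauScheduler()
lrs = [sched.step(1.0) for _ in range(7)]  # a flat validation loss
```

After one improving epoch and six stagnant ones, the rate drops from 5 × 10<sup>-3</sup> to 1 × 10<sup>-3</sup>, and further plateaus would continue shrinking it toward the 10<sup>-5</sup> floor.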
      <p id="d2e3256">For the final model, the gradient reversal scale was set to <inline-formula><mml:math id="M135" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula>, and DEC centroids were initialized using <inline-formula><mml:math id="M136" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means with <inline-formula><mml:math id="M137" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">clusters</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula>.</p>
      <p id="d2e3293">All models were trained for up to <inline-formula><mml:math id="M138" display="inline"><mml:mn mathvariant="normal">1000</mml:mn></mml:math></inline-formula> epochs, although convergence was typically achieved earlier due to early stopping. The resulting loss evolution is discussed in Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/>.</p>
</sec>
<sec id="Ch1.S2.SS7">
  <label>2.7</label><title>Evaluation methodology</title>
      <p id="d2e3313">For training, a random subset of 1000 h per turbine was selected from the year 2023, corresponding to approximately 6 weeks of data per turbine. Each turbine was assigned an anonymized identifier, and only 11 out of the 44 turbines were used for model training, corresponding to one-quarter of the fleet. This split was adopted to limit overfitting and to explicitly assess generalization to unseen turbines. Model testing for operational-state inference was conducted on data from the first 2 weeks of 2024, which constitute a temporally disjoint hold-out dataset used exclusively for evaluation and visualization. For the fatigue-related task, data from June to September 2024 were used for testing, as this period contains a high number of start–stop events and a wide range of operational conditions. Since high-frequency SCADA labels are unavailable, low-frequency SCADA signals (mean power, rotor speed, pitch, wind speed), assumed to be constant within each 10 min interval, are used as a reference for evaluation. Under this assumption, a lower-bound estimate of how well the embeddings capture operational information is obtained.</p>
      <p id="d2e3316">After training, two key aspects are examined: (i) whether the learned embeddings eliminate turbine-specific fingerprints and achieve invariance across turbines, and (ii) whether the embeddings remain informative about the underlying operational state.</p>
<sec id="Ch1.S2.SS7.SSS1">
  <label>2.7.1</label><title>Turbine invariance</title>
      <p id="d2e3326">A key objective is to verify that the embeddings are not dominated by turbine-specific fingerprints. A straightforward option is to train a classifier to predict turbine identity from the embeddings, but the outcome of this test depends on the chosen classifier. To avoid this dependency, we adopt an information-theoretic approach and quantify the <italic>mutual information</italic> (MI) between turbine identity <inline-formula><mml:math id="M139" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> and the embedding <inline-formula><mml:math id="M140" display="inline"><mml:mi mathvariant="bold">Z</mml:mi></mml:math></inline-formula>:

              <disp-formula id="Ch1.E14" content-type="numbered"><label>14</label><mml:math id="M141" display="block"><mml:mrow><mml:mi mathvariant="normal">MI</mml:mi><mml:mo>(</mml:mo><mml:mi>T</mml:mi><mml:mo>;</mml:mo><mml:mi mathvariant="bold">Z</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mo movablelimits="false">∫</mml:mo><mml:mspace linebreak="nobreak" width="-0.125em"/><mml:mspace width="-0.125em" linebreak="nobreak"/><mml:mo movablelimits="false">∫</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>)</mml:mo><mml:mi>log⁡</mml:mi><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mfrac></mml:mstyle><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">d</mml:mi><mml:mi>t</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mi mathvariant="normal">d</mml:mi><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="bold-italic">z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the joint distribution of <inline-formula><mml:math id="M143" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M144" display="inline"><mml:mi mathvariant="bold">Z</mml:mi></mml:math></inline-formula>.</p>
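For a discrete identity variable and a discretized embedding, this quantity can be estimated with a simple histogram plug-in estimator. The paper does not specify its estimator, so the sketch below is purely illustrative of the definition in Eq. (14) and of the 1 bit ceiling for binary pairwise comparisons:

```python
import numpy as np

def mutual_info_bits(t, z_bins):
    """Plug-in MI estimate (in bits) between a discrete label t and a
    discretized embedding z_bins, from the empirical joint histogram."""
    ts = np.unique(t, return_inverse=True)[1]
    zs = np.unique(z_bins, return_inverse=True)[1]
    joint = np.zeros((ts.max() + 1, zs.max() + 1))
    np.add.at(joint, (ts, zs), 1.0)
    p = joint / joint.sum()                        # joint p(t, z)
    pt, pz = p.sum(1, keepdims=True), p.sum(0, keepdims=True)
    nz = p > 0                                     # skip empty cells
    return float((p[nz] * np.log2(p[nz] / (pt @ pz)[nz])).sum())

rng = np.random.default_rng(3)
t = rng.integers(0, 2, size=20000)          # binary "turbine pair" label
z_indep = rng.integers(0, 8, size=20000)    # embedding bin independent of t
z_ident = t.copy()                          # embedding that encodes identity
mi_low = mutual_info_bits(t, z_indep)       # near 0 bits: invariant embedding
mi_high = mutual_info_bits(t, z_ident)      # near 1 bit: the pairwise ceiling
```

A turbine-invariant embedding should drive such estimates toward zero, whereas an embedding dominated by turbine fingerprints approaches the entropy of the identity variable.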
      <p id="d2e3465">Since <inline-formula><mml:math id="M145" display="inline"><mml:mi>T</mml:mi></mml:math></inline-formula> has 44 classes, the global MI quantifies – in bits – the total information contained in the embeddings about turbine identity, with an upper bound of <inline-formula><mml:math id="M146" display="inline"><mml:mrow><mml:msub><mml:mi>log⁡</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mn mathvariant="normal">44</mml:mn><mml:mo>)</mml:mo><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">5.46</mml:mn></mml:mrow></mml:math></inline-formula> bits. This bound corresponds to a uniform distribution over turbines, which we approximate by randomly sampling 10 000 embeddings per turbine. While this single scalar captures overall dependence, it does not reveal how <italic>individual turbines</italic> relate to one another. To examine this structure, we compute pairwise MI. For each turbine pair <inline-formula><mml:math id="M147" display="inline"><mml:mrow><mml:mo>(</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula>, the dataset is restricted to samples from turbines <inline-formula><mml:math id="M148" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M149" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>, the identity variable is recoded as binary <inline-formula><mml:math id="M150" display="inline"><mml:mrow><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>∈</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula>, and we estimate

              <disp-formula id="Ch1.Ex3"><mml:math id="M151" display="block"><mml:mrow><mml:mi mathvariant="normal">MI</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>;</mml:mo><mml:mi mathvariant="bold">Z</mml:mi><mml:mo>)</mml:mo><mml:mspace linebreak="nobreak" width="0.25em"/><mml:mo>≤</mml:mo><mml:mspace width="0.25em" linebreak="nobreak"/><mml:mn mathvariant="normal">1</mml:mn><mml:mspace linebreak="nobreak" width="1em"/><mml:mtext>bit</mml:mtext><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>

            Pairwise MI measures how distinguishable the embeddings of two turbines are: <list list-type="bullet"><list-item>
      <p id="d2e3594"><inline-formula><mml:math id="M152" display="inline"><mml:mrow><mml:mi mathvariant="normal">MI</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>;</mml:mo><mml:mi mathvariant="bold">Z</mml:mi><mml:mo>)</mml:mo><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula> indicates nearly indistinguishable embeddings, suggesting similar dynamics;</p></list-item><list-item>
      <p id="d2e3625">values approaching 1 bit indicate strong separability, suggesting systematic differences.</p></list-item></list> This pairwise MI provides an upper bound on how reliably the two turbines can be distinguished from their embeddings. By arranging all values into a symmetric matrix <inline-formula><mml:math id="M153" display="inline"><mml:mrow><mml:mi mathvariant="bold">D</mml:mi><mml:mo>∈</mml:mo><mml:msup><mml:mi mathvariant="double-struck">R</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>×</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> with entries <inline-formula><mml:math id="M154" display="inline"><mml:mrow><mml:mo>[</mml:mo><mml:mi mathvariant="bold">D</mml:mi><mml:msub><mml:mo>]</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi mathvariant="normal">MI</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>; <inline-formula><mml:math id="M155" display="inline"><mml:mrow><mml:mi mathvariant="bold">Z</mml:mi><mml:mo>|</mml:mo><mml:mtext>turbines </mml:mtext><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> (units: bits), a turbine similarity map is obtained. 
Here, <inline-formula><mml:math id="M156" display="inline"><mml:mrow><mml:mi mathvariant="normal">MI</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>T</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>;</mml:mo><mml:mi mathvariant="bold">Z</mml:mi><mml:mo>|</mml:mo><mml:mtext>turbines </mml:mtext><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes the mutual information computed using only samples from turbines <inline-formula><mml:math id="M157" display="inline"><mml:mi>i</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M158" display="inline"><mml:mi>j</mml:mi></mml:math></inline-formula>. This map can be interpreted in two ways:</p>
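      <p>Assembling the matrix <monospace>D</monospace> can be sketched as follows. The estimator here is an illustrative plug-in over KMeans-discretized embeddings (the paper uses Miller–Madow estimators), and the function name is ours; each entry uses only the samples of the two turbines, so the binary identity variable bounds it by 1 bit.</p>

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

def pairwise_mi_matrix(turbine_ids, embeddings, n_bins=32, seed=0):
    """Symmetric matrix with [D]_ij = MI(T_ij; Z | turbines i, j) in bits."""
    turbine_ids = np.asarray(turbine_ids)
    turbines = np.unique(turbine_ids)
    n = len(turbines)
    D = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            # restrict the dataset to samples of turbines a and b
            mask = np.isin(turbine_ids, [turbines[a], turbines[b]])
            cells = KMeans(n_clusters=n_bins, n_init=4,
                           random_state=seed).fit_predict(embeddings[mask])
            mi = mutual_info_score(turbine_ids[mask], cells) / np.log(2)
            D[a, b] = D[b, a] = mi
    return D
```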
      <p id="d2e3750"><list list-type="order">
              <list-item>

      <p id="d2e3755"><italic>Fleet-wide dynamic clustering.</italic> Without adversarial training (no DANN), the map highlights clusters of turbines with similar dynamics, visible as blocks of consistently low MI values within subgroups. This is useful for grouping turbines that operate under comparable dynamic conditions.</p>
              </list-item>
              <list-item>

      <p id="d2e3763"><italic>Global invariance check.</italic> With adversarial training (DANN), turbine-specific fingerprints are suppressed: matrix entries shift toward lower MI values, indicating reduced separability by turbine identity. Therefore, the same post hoc analysis can be applied to all of the turbines.</p>
              </list-item>
            </list>Thus, pairwise MI not only indicates how effectively DANN suppresses turbine-specific signatures but also uncovers a data-driven similarity structure across the fleet, which is valuable for population-based SHM and cross-turbine comparisons.</p>
</sec>
<sec id="Ch1.S2.SS7.SSS2">
  <label>2.7.2</label><title>Operational informativeness</title>
      <p id="d2e3779">The second question concerns whether operational information is preserved in the embeddings. Several evaluation strategies can be considered: (i) correlations with SCADA signals, (ii) training regressors to predict SCADA from embeddings and reporting <inline-formula><mml:math id="M159" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, or (iii) the use of an information-theoretic measure. For consistency, the latter approach is adopted, and the normalized mutual information (NMI) between embeddings and each SCADA variable <inline-formula><mml:math id="M160" display="inline"><mml:mi>S</mml:mi></mml:math></inline-formula> is computed:

              <disp-formula id="Ch1.E15" content-type="numbered"><label>15</label><mml:math id="M161" display="block"><mml:mrow><mml:mi mathvariant="normal">NMI</mml:mi><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>;</mml:mo><mml:mi mathvariant="bold-italic">Z</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mi mathvariant="normal">MI</mml:mi><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>;</mml:mo><mml:mi mathvariant="bold-italic">Z</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msqrt><mml:mrow><mml:mi>H</mml:mi><mml:mo>(</mml:mo><mml:mi>S</mml:mi><mml:mo>)</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mi>H</mml:mi><mml:mo>(</mml:mo><mml:mi mathvariant="bold-italic">Z</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:msqrt></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

            where <inline-formula><mml:math id="M162" display="inline"><mml:mrow><mml:mi>H</mml:mi><mml:mo>(</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denotes Shannon entropy. Normalization ensures comparability across continuous variables by scaling MI relative to the entropies of <inline-formula><mml:math id="M163" display="inline"><mml:mi>S</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M164" display="inline"><mml:mi mathvariant="bold">Z</mml:mi></mml:math></inline-formula>. As with MI, NMI is estimated using Miller–Madow entropy estimators as implemented in <xref ref-type="bibr" rid="bib1.bibx11" id="text.47"/>.</p>
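      <p>A minimal sketch of Eq. (15) follows, again with an illustrative plug-in estimator rather than the paper's Miller–Madow one: the SCADA variable is binned by quantiles, the embeddings by KMeans, and entropies and MI are computed from empirical cell counts (in bits). Bin counts and the function name are our assumptions.</p>

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

def nmi(scada, embeddings, n_bins=32, seed=0):
    """NMI(S; Z) = MI(S; Z) / sqrt(H(S) H(Z)) on discretized data."""
    edges = np.quantile(scada, np.linspace(0, 1, n_bins + 1)[1:-1])
    s_bins = np.digitize(scada, edges)                 # quantile bins for S
    z_bins = KMeans(n_clusters=n_bins, n_init=4,
                    random_state=seed).fit_predict(embeddings)

    def entropy_bits(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()

    mi = mutual_info_score(s_bins, z_bins) / np.log(2)  # nats -> bits
    return mi / np.sqrt(entropy_bits(s_bins) * entropy_bits(z_bins))
```

      <p>The normalization makes values comparable across SCADA channels with very different entropies, which is why the table below can list power, pitch, wind speed, and rotor speed on one scale.</p>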
      <p id="d2e3888">In practice, 10 000 embeddings are randomly sampled per turbine from the training set to compute MI and NMI. The resulting metrics are used to jointly quantify (i) turbine invariance and (ii) operational informativeness, thereby providing a robust, model-free assessment of the learned representations. Because labeled annotations of transient events are not available, direct evaluation of event detection performance is not feasible; instead, representation quality is assessed indirectly via alignment with SCADA variables.</p>
</sec>
<sec id="Ch1.S2.SS7.SSS3">
  <label>2.7.3</label><title>Qualitative visualization</title>
      <p id="d2e3900">Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction <xref ref-type="bibr" rid="bib1.bibx37" id="paren.48"/> was applied to project six-dimensional embeddings into 2D space for visualization. The projections were colored according to SCADA variables and turbine identity so that both operational structure and cross-turbine consistency could be assessed.</p>
</sec>
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Results and discussion</title>
      <p id="d2e3916">The learned embeddings are evaluated along four dimensions: (i) preservation of operational information with concurrent suppression of turbine-specific signatures, (ii) generalization to unseen turbines achieved through domain-adversarial training, (iii) discretization of the latent space into interpretable regimes consistent with classical operational states, and (iv) prediction of fatigue from the learned embeddings as a replacement for classical SCADA-based models. The training dynamics and loss evolution are analyzed first in Sect. <xref ref-type="sec" rid="Ch1.S3.SS1"/>, followed by the assessment of turbine invariance and operational informativeness in Sect. <xref ref-type="sec" rid="Ch1.S3.SS2"/>. Regime discretization is examined in Sect. <xref ref-type="sec" rid="Ch1.S3.SS5"/>, and the presence of fatigue-related information in the latent space is evaluated in Sect. <xref ref-type="sec" rid="Ch1.S3.SS6"/>.</p>
<sec id="Ch1.S3.SS1">
  <label>3.1</label><title>Training dynamics and loss evolution</title>
      <p id="d2e3934">Figure <xref ref-type="fig" rid="F3"/> summarizes the evolution of the loss components during training under the staged optimization strategy described in Sect. <xref ref-type="sec" rid="Ch1.S2.SS2"/>. During the warm-up phase (epochs 0–30), only the reconstruction loss <inline-formula><mml:math id="M165" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">rec</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is optimized. The rapid decrease and subsequent stabilization of this loss indicate that the auto-encoder learns a consistent reconstruction manifold before additional objectives are introduced.</p>
      <p id="d2e3952">At <inline-formula><mml:math id="M166" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">start</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dann</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">30</mml:mn></mml:mrow></mml:math></inline-formula>, the domain-adversarial objective is activated. Its weighting coefficient <inline-formula><mml:math id="M167" display="inline"><mml:mi mathvariant="italic">λ</mml:mi></mml:math></inline-formula> is increased linearly over <inline-formula><mml:math id="M168" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">duration</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dann</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula> epochs until it reaches <inline-formula><mml:math id="M169" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>max⁡</mml:mo></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.8</mml:mn></mml:mrow></mml:math></inline-formula>. Immediately after introduction, the domain classification loss <inline-formula><mml:math id="M170" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">domain</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> drops sharply. This transient behavior reflects the ability of the newly trained domain classifier <inline-formula><mml:math id="M171" display="inline"><mml:mrow><mml:msub><mml:mi>f</mml:mi><mml:mi mathvariant="normal">dom</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> to exploit turbine-specific information still present in the latent representation. As training progresses and the gradient reversal mechanism becomes effective, the encoder increasingly suppresses turbine identity, causing the domain classification loss to rise.</p>
      <p id="d2e4040">With 11 training turbines, random guessing corresponds to a cross-entropy of <inline-formula><mml:math id="M172" display="inline"><mml:mrow><mml:mi>log⁡</mml:mi><mml:mo>(</mml:mo><mml:mn mathvariant="normal">11</mml:mn><mml:mo>)</mml:mo><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">2.4</mml:mn></mml:mrow></mml:math></inline-formula>. As shown in Fig. <xref ref-type="fig" rid="F3"/>, the domain classification loss stabilizes at <inline-formula><mml:math id="M173" display="inline"><mml:mn mathvariant="normal">2.1</mml:mn></mml:math></inline-formula>. The observed plateau therefore indicates that turbine identity becomes increasingly difficult to infer from the latent space, although weak residual turbine-specific structure remains. This behavior is consistent with the objective of domain-adversarial training.</p>
      <p id="d2e4070">The clustering objective is introduced at <inline-formula><mml:math id="M174" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">start</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dec</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">60</mml:mn></mml:mrow></mml:math></inline-formula>. Its weighting coefficient <inline-formula><mml:math id="M175" display="inline"><mml:mi mathvariant="italic">β</mml:mi></mml:math></inline-formula> is increased linearly over <inline-formula><mml:math id="M176" display="inline"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mrow><mml:mi mathvariant="normal">duration</mml:mi><mml:mo>,</mml:mo><mml:mi mathvariant="normal">dec</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula> epochs until it reaches <inline-formula><mml:math id="M177" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mo>max⁡</mml:mo></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:math></inline-formula>. Upon activation, the clustering loss <inline-formula><mml:math id="M178" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">DEC</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> initially takes on high values, reflecting the absence of well-formed clusters. As the latent space is reshaped to promote compact and separable regimes, a transient increase in reconstruction error is observed, caused by a temporary mismatch between the decoder and the reorganized latent geometry. As optimization continues, the decoder adapts, and the reconstruction loss decreases again.</p>
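      <p>The staged schedule described above can be sketched as a pair of linear warm-up ramps, using the activation epochs, durations, and maximum weights stated in the text (the function name and signature are ours):</p>

```python
def loss_weights(epoch,
                 dann_start=30, dann_duration=100, lambda_max=0.8,
                 dec_start=60, dec_duration=100, beta_max=10.0):
    """Linear warm-up schedules for the DANN and DEC loss weights.

    Reconstruction runs alone until `dann_start`; each auxiliary
    weight then ramps linearly from 0 to its maximum.
    """
    def ramp(t, t0, duration):
        return min(max((t - t0) / duration, 0.0), 1.0)

    lam = lambda_max * ramp(epoch, dann_start, dann_duration)
    beta = beta_max * ramp(epoch, dec_start, dec_duration)
    return lam, beta
```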
      <p id="d2e4147">The values of <inline-formula><mml:math id="M179" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">λ</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M180" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mo>max⁡</mml:mo></mml:msub></mml:mrow></mml:math></inline-formula> were selected heuristically. Moderate variations around these values did not qualitatively affect the results. The guiding principle was to scale the loss terms, at their maximum weights, to comparable magnitudes once the reconstruction loss had stabilized. Excessively large values were found to be detrimental. In particular, setting <inline-formula><mml:math id="M181" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">β</mml:mi><mml:mo>max⁡</mml:mo></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">50</mml:mn></mml:mrow></mml:math></inline-formula> led to a significant degradation of reconstruction quality, indicating excessive distortion of the latent space.</p>

      <fig id="F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e4189">Training and validation evolution of the loss components: reconstruction loss <inline-formula><mml:math id="M182" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">rec</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, domain classification loss <inline-formula><mml:math id="M183" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">domain</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, and clustering loss <inline-formula><mml:math id="M184" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="script">L</mml:mi><mml:mi mathvariant="normal">DEC</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f03.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS2">
  <label>3.2</label><title>Assessment of turbine invariance and operational informativeness</title>
      <p id="d2e4239">This part focuses on the first two dimensions of evaluation. Turbine invariance is quantified by means of pairwise mutual information (MI) between turbine identity and the latent embeddings, while operational informativeness is evaluated through normalized mutual information (NMI) between embeddings and key SCADA variables – namely power, rotor speed, pitch angle, and wind speed.</p>
</sec>
<sec id="Ch1.S3.SS3">
  <label>3.3</label><title>Turbine invariance via pairwise MI</title>
      <p id="d2e4250">Turbine invariance was assessed by comparing two models: a plain auto-encoder without adversarial training (<inline-formula><mml:math id="M185" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>) and the same auto-encoder with a domain-adversarial component applied to the latent space (<inline-formula><mml:math id="M186" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula>). The corresponding pairwise MI matrices, <inline-formula><mml:math id="M187" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">D</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M188" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">D</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, are presented in Figs. <xref ref-type="fig" rid="F4"/> and <xref ref-type="fig" rid="F5"/>.</p>
      <p id="d2e4314">In the absence of DANN, elevated MI values were observed for many turbine pairs (Fig. <xref ref-type="fig" rid="F4"/>), indicating that turbine-specific fingerprints were retained in the embeddings alongside operational content. Subgroups of turbines were seen to be more similar to each other than to the remainder of the fleet, consistent with residual structural or site variability encoded in the latent space.</p>
      <p id="d2e4319">With DANN, pairwise MI values were reduced across the matrix <inline-formula><mml:math id="M189" display="inline"><mml:mrow><mml:msup><mml:mi mathvariant="bold">D</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> (Fig. <xref ref-type="fig" rid="F5"/>), showing that turbine identity was suppressed while operational features were preserved. Two turbines (ID nos. 28 and 39) remained more separable than the rest, which is interpreted as genuinely distinct dynamics rather than a training artifact. No checkerboard pattern indicative of leakage from the odd–even train–test split was observed. A small increase in reconstruction error was induced by the adversarial term, but downstream use was not compromised.</p>
      <p id="d2e4340">The GRL scale <inline-formula><mml:math id="M190" display="inline"><mml:mi mathvariant="italic">γ</mml:mi></mml:math></inline-formula> was selected by scanning <inline-formula><mml:math id="M191" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:mn mathvariant="normal">0.01</mml:mn><mml:mo>,</mml:mo><mml:mspace width="0.125em" linebreak="nobreak"/><mml:mn mathvariant="normal">0.2</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">0.4</mml:mn><mml:mo>,</mml:mo><mml:mspace linebreak="nobreak" width="0.125em"/><mml:mn mathvariant="normal">0.6</mml:mn><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:math></inline-formula> and monitoring the mean of the pairwise MI matrix <inline-formula><mml:math id="M192" display="inline"><mml:mi mathvariant="bold">D</mml:mi></mml:math></inline-formula>. The mean MI was reduced from <inline-formula><mml:math id="M193" display="inline"><mml:mn mathvariant="normal">0.65</mml:mn></mml:math></inline-formula> to <inline-formula><mml:math id="M194" display="inline"><mml:mn mathvariant="normal">0.36</mml:mn></mml:math></inline-formula> and then to <inline-formula><mml:math id="M195" display="inline"><mml:mn mathvariant="normal">0.15</mml:mn></mml:math></inline-formula> and <inline-formula><mml:math id="M196" display="inline"><mml:mn mathvariant="normal">0.12</mml:mn></mml:math></inline-formula>, with an elbow around <inline-formula><mml:math id="M197" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula>. Larger values did not yield meaningful gains and were found to risk latent collapse, and so <inline-formula><mml:math id="M198" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula> was adopted in the final model.</p>
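      <p>The scan above can be summarized by the mean off-diagonal entry of <monospace>D</monospace> per candidate scale; one simple heuristic for the elbow (our illustrative choice, since the text only reports "an elbow around 0.4") is the point of maximum discrete curvature of that curve, which reproduces the selected value on the reported numbers:</p>

```python
import numpy as np

def mean_offdiag(D):
    """Mean of the off-diagonal entries of a pairwise MI matrix (bits)."""
    n = D.shape[0]
    return (D.sum() - np.trace(D)) / (n * (n - 1))

def elbow_gamma(gammas, mean_mis):
    """Pick the scale where the mean-MI decrease flattens most sharply."""
    d2 = np.diff(mean_mis, n=2)        # discrete curvature of the curve
    return gammas[int(np.argmax(d2)) + 1]
```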

      <fig id="F4"><label>Figure 4</label><caption><p id="d2e4441">Pairwise MI between turbine ID and embeddings before adversarial training. Higher values indicate stronger turbine-specific signatures.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f04.png"/>

        </fig>

      <fig id="F5"><label>Figure 5</label><caption><p id="d2e4452">Pairwise MI after adversarial training (<inline-formula><mml:math id="M199" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.4</mml:mn></mml:mrow></mml:math></inline-formula>). Lower values indicate improved turbine invariance, while structured residuals highlight turbines with similar dynamics.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f05.png"/>

        </fig>

      <p id="d2e4473">Starting from the precomputed pairwise MI matrix <inline-formula><mml:math id="M200" display="inline"><mml:mi mathvariant="bold">D</mml:mi></mml:math></inline-formula> (Fig. <xref ref-type="fig" rid="F4"/>), which was interpreted as a symmetric dissimilarity measure in bits (larger values corresponding to lower similarity), agglomerative hierarchical clustering was performed, resulting in the identification of five clusters (Appendix <xref ref-type="sec" rid="App1.Ch1.S1"/>). A detailed explanation of hierarchical clustering can be found in <xref ref-type="bibr" rid="bib1.bibx14" id="text.49"/>.</p>
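      <p>A minimal sketch of this step with SciPy follows; the linkage criterion is our assumption (average linkage), as the text does not specify one.</p>

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_turbines(D, n_clusters=5):
    """Agglomerative clustering of turbines from the pairwise MI matrix D,
    read as a symmetric dissimilarity in bits (larger = less similar)."""
    D = np.asarray(D, dtype=float).copy()
    np.fill_diagonal(D, 0.0)                   # self-dissimilarity is zero
    condensed = squareform(D, checks=False)    # condensed distance vector
    Z = linkage(condensed, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```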
      <p id="d2e4490">In Fig. <xref ref-type="fig" rid="F6"/>, the geographic layout of the wind farm is shown, with colors indicating the clusters obtained; numbers correspond to anonymized turbine identifiers. The map was examined to verify that the clustering was not a by-product of wake geometry or simple row positioning (front versus back turbines). No systematic alignment or consistent relation with water depth was observed. The clusters nevertheless appeared to be structured rather than random yet could not be explained by straightforward spatial factors. It is therefore inferred that the grouping most likely reflects a combination of site-specific conditions, control strategies, or structural variability not captured in the available metadata.</p>
      <p id="d2e4495">Two turbines (anonymized ID nos. 28 and 39) were assigned to single-member clusters and also remained the most separable after adversarial training, which suggests that their distinct behavior arises from genuine dynamic differences rather than artifacts of the clustering procedure.</p>

      <fig id="F6"><label>Figure 6</label><caption><p id="d2e4501">Wind farm layout. Numbers denote anonymized turbine IDs; colors indicate turbine clusters based on the similarity derived from the pairwise MI in Fig. <xref ref-type="fig" rid="F4"/>.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f06.png"/>

        </fig>

</sec>
<sec id="Ch1.S3.SS4">
  <label>3.4</label><title>Operational informativeness via NMI</title>
      <p id="d2e4521">A central objective of this study is to determine whether operational information typically derived from SCADA can instead be recovered directly from high-frequency acceleration. To evaluate this, normalized mutual information (NMI) values between embeddings and SCADA variables were computed on unseen turbines and are reported in Table <xref ref-type="table" rid="T1"/>. NMI was used because it captures both linear and nonlinear dependencies and provides a normalized measure that is comparable across variables, making it well suited for assessing how much operational content is retained in the embeddings.</p>
      <p id="d2e4526">Across all variables, higher mean NMI values were obtained after adversarial training, with improvements ranging from <inline-formula><mml:math id="M201" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mn mathvariant="normal">0.016</mml:mn></mml:mrow></mml:math></inline-formula> for power to <inline-formula><mml:math id="M202" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mn mathvariant="normal">0.027</mml:mn></mml:mrow></mml:math></inline-formula> for wind speed. Values in the range of 0.75–0.92 indicate that a substantial fraction of the variability in SCADA signals can be captured by the learned embeddings despite the fact that SCADA data were not used during training. This demonstrates that high-frequency acceleration contains operationally relevant information that can be effectively extracted through the proposed representation-learning framework.</p>
      <p id="d2e4549">As an external validation, random forest regressors (<inline-formula><mml:math id="M203" display="inline"><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi mathvariant="normal">est</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></inline-formula>) were trained to predict SCADA variables from the learned embeddings using a 50 % train–test split. The models achieved mean <inline-formula><mml:math id="M204" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> scores of 0.923 for power, 0.882 for pitch, 0.937 for wind speed, and 0.925 for rotor speed across the 44 turbines, indicating that the embeddings retain strong operationally relevant information.</p>
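      <p>This validation step can be sketched as follows, using the forest size and split stated in the text (the function name and the fixed random seed are ours):</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

def scada_r2(embeddings, scada, seed=0):
    """R^2 of a random forest (n_estimators=100) predicting one SCADA
    variable from the embeddings, with a 50 % train-test split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        embeddings, scada, test_size=0.5, random_state=seed)
    rf = RandomForestRegressor(n_estimators=100,
                               random_state=seed).fit(X_tr, y_tr)
    return r2_score(y_te, rf.predict(X_te))
```

      <p>In the study this score is computed per turbine and per SCADA channel and then averaged across the 44 turbines.</p>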
      <p id="d2e4578">Because SCADA variables are available only as 10 min averages, any intra-interval variability captured by the vibration-based embeddings cannot be directly validated. The reported correspondence metrics therefore quantify alignment with a temporally aggregated proxy of the operational state and should be interpreted as conservative lower bounds with respect to the unobserved instantaneous dynamics rather than as an upper limit on achievable predictive performance.</p>

<table-wrap id="T1"><label>Table 1</label><caption><p id="d2e4585">Mean normalized mutual information (NMI) between embeddings and SCADA variables, computed on unseen turbines before and after adversarial training. <inline-formula><mml:math id="M205" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi mathvariant="italic">%</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> denotes the relative change.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="center"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">SCADA variable</oasis:entry>
         <oasis:entry colname="col2">No DANN</oasis:entry>
         <oasis:entry colname="col3">DANN</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M206" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="normal">Δ</mml:mi><mml:mi mathvariant="italic">%</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula></oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">mean_power</oasis:entry>
         <oasis:entry colname="col2">0.798</oasis:entry>
         <oasis:entry colname="col3">0.814</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M207" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>2.0 %</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">mean_pitch</oasis:entry>
         <oasis:entry colname="col2">0.753</oasis:entry>
         <oasis:entry colname="col3">0.771</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M208" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>2.4 %</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">mean_windspeed</oasis:entry>
         <oasis:entry colname="col2">0.919</oasis:entry>
         <oasis:entry colname="col3">0.946</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M209" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>2.9 %</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">mean_rpm</oasis:entry>
         <oasis:entry colname="col2">0.748</oasis:entry>
         <oasis:entry colname="col3">0.765</oasis:entry>
         <oasis:entry colname="col4"><inline-formula><mml:math id="M210" display="inline"><mml:mo>+</mml:mo></mml:math></inline-formula>2.3 %</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S3.SS5">
  <label>3.5</label><title>Operational state inference from embeddings</title>
      <p id="d2e4735">The objective in this section is to determine whether the learned latent space can be used to identify distinct operational states of the turbine. As shown previously, the embeddings capture SCADA-like information with high accuracy; here, the focus is on whether these representations can be organized into discrete and interpretable regimes.</p>
      <p id="d2e4738">When no clustering constraint is applied (i.e., DEC is disabled and <inline-formula><mml:math id="M211" display="inline"><mml:mrow><mml:mi mathvariant="italic">β</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>), the latent space is shaped only by reconstruction and domain-adversarial objectives. In this setting, there is no explicit geometric incentive for the encoder to form multiple compact, well-separated operational regimes. The embedding therefore organizes primarily according to the largest, most separable dynamical differences in the data. Empirically, this yields three dominant groups, as illustrated in Fig. <xref ref-type="fig" rid="F9"/>: two small clusters corresponding to non-producing conditions (standstill and parked) and one large cluster that aggregates the full continuum of producing operation. The separation between the two non-producing clusters is consistent with a control-driven distinction (mainly pitch angle differences under low or zero rotor speed), which produces distinct low-frequency spectral signatures. In contrast, within the producing regime, sub-rated, rated, and curtailed behavior form a smooth progression in spectral space (driven by gradual changes in rotor speed, aerodynamic loading, and control action) and therefore remain embedded as a single connected manifold rather than splitting into discrete clusters. This behavior is visible in Fig. <xref ref-type="fig" rid="F9"/>, where producing states form a single connected manifold; the splitting is mainly driven by the pitch angle.</p>
      <p id="d2e4757">By introducing DEC, the latent structure is explicitly encouraged to become clusterable. DEC adds a set of learnable centroids <inline-formula><mml:math id="M212" display="inline"><mml:mrow><mml:mo mathvariant="italic">{</mml:mo><mml:msub><mml:mi mathvariant="bold-italic">μ</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:msubsup><mml:mo mathvariant="italic">}</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>K</mml:mi></mml:msubsup></mml:mrow></mml:math></inline-formula> and optimizes the encoder such that embeddings are pulled toward these centroids via a KL divergence objective between soft assignments and a sharpened target distribution. This explicitly trades a purely continuous representation for one that partitions the operating manifold into <inline-formula><mml:math id="M213" display="inline"><mml:mi>K</mml:mi></mml:math></inline-formula> compact regions. With DEC enabled and <inline-formula><mml:math id="M214" display="inline"><mml:mrow><mml:mi>K</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">5</mml:mn></mml:mrow></mml:math></inline-formula> (a user-defined choice), the previously broad operating manifold is refined into multiple regimes that distinguish different levels of power production and control action. This five-cluster configuration aligns with the canonical division of turbine behavior used in SCADA-based classification while being inferred directly from vibrations and at a higher temporal resolution.</p>
      <p id="d2e4804">Additionally, DEC integrates clustering into the training objective: centroids are part of the model and are refined jointly with the encoder during optimization. In that sense, regime discovery is learned end-to-end rather than imposed only as a purely post hoc clustering step on the final embeddings (although, as is standard in DEC, centroids are initialized from a preliminary clustering such as <inline-formula><mml:math id="M215" display="inline"><mml:mi>k</mml:mi></mml:math></inline-formula>-means before being refined during training).</p>
      <p id="d2e4815">Figure <xref ref-type="fig" rid="F7"/> shows the resulting latent-space partition for the first five turbines during the initial 2 weeks of 2024, serving as the calibration dataset (Sect. <xref ref-type="sec" rid="Ch1.S2.SS7"/>). The identified clusters align clearly with rotor speed thresholds, as shown in Fig. <xref ref-type="fig" rid="F8"/>.</p>

      <fig id="F7"><label>Figure 7</label><caption><p id="d2e4826">UMAP projection of the latent space colored by discovered clusters. The partitioning yields coherent operational regimes.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f07.png"/>

        </fig>

      <fig id="F8"><label>Figure 8</label><caption><p id="d2e4837">UMAP projection colored by normalized mean RPM. The smooth gradient indicates that rotor speed is preserved in the embedding.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f08.png"/>

        </fig>

      <fig id="F9"><label>Figure 9</label><caption><p id="d2e4848">UMAP projection of the latent embeddings learned without deep embedded clustering (DEC), colored by normalized mean blade pitch. The representation separates parked and standstill conditions from operating states, but the operating regime remains a largely continuous manifold without clear sub-structure. This illustrates that, in the absence of an explicit clustering objective, the latent space is not naturally partitioned into distinct operational regimes, motivating the use of DEC to enforce clusterable and interpretable embeddings.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f09.png"/>

        </fig>

      <p id="d2e4858">Overall, the discovered clusters correspond to well-known operating behaviors: parked or idling, standstill, sub-rated, and rated generation. Collapsing the latent geometry into discrete states provides an interpretable layer on top of the embeddings, enabling event monitoring tasks such as start–stop counting at sub-10 min resolution. In scenarios without SCADA, clusters can be interpreted by visually inspecting representative samples; once identified, they can be relabeled with meaningful operational states.</p>
      <p id="d2e4861">To further validate the clustering, the regimes are projected onto SCADA references. In the power curve (Fig. <xref ref-type="fig" rid="F10"/>), the regimes separate into five operating zones. Clusters 3 and 4 overlap at low power, but their distinction becomes clear on the pitch versus wind speed plot (Fig. <xref ref-type="fig" rid="F11"/>), where cluster 4 corresponds to high pitch (curtailed or stopped) and cluster 3 corresponds to lower pitch (idling). A small overlap is also observed between the rated and ramp-up regions, reflecting their similarity in terms of RPM (rotations per minute) and the resulting spectra. These comparisons should be regarded as a lower-bound validation since SCADA signals are available only as 10 min averages, whereas embeddings are computed at a 30 s hop length. The assumption of constant SCADA over 10 min introduces unavoidable mismatches.</p>

      <fig id="F10"><label>Figure 10</label><caption><p id="d2e4870">Normalized power versus wind speed, colored by latent-space clusters. Regimes align with canonical power curve regions.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f10.png"/>

        </fig>

      <fig id="F11"><label>Figure 11</label><caption><p id="d2e4881">Normalized pitch versus wind speed, colored by clusters. High-pitch curtailed or stopped regimes separate from low-pitch operating regimes.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f11.png"/>

        </fig>

      <p id="d2e4890">Finally, the method enables monitoring of high-frequency operational events. By applying the model continuously, it is possible to identify and count start–stop transitions within each hour, capturing short events that are lost in coarse 10 min SCADA averages. Figure <xref ref-type="fig" rid="F12"/> illustrates such a case: six stop–start events are detected, with transitions from low-rate production to standstill. These rapid fluctuations have direct implications for fatigue life, underlining the value of high-resolution, acceleration-based regime inference.</p>

      <fig id="F12"><label>Figure 12</label><caption><p id="d2e4898">Example of high-frequency event detection from embeddings. Six stop–start transitions are resolved within 1 h, highlighting dynamic loading conditions that would be obscured in 10 min SCADA.</p></caption>
          <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f12.png"/>

        </fig>
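As a minimal sketch of this event-counting step, the function below counts producing-to-non-producing transitions in a sequence of per-window cluster labels. The cluster ids, the mapping to a "producing" set, and the debouncing threshold are illustrative assumptions, not details of the paper's pipeline.

```python
def count_stop_events(labels, producing, min_dwell=2):
    """Count producing -> non-producing transitions in a regime sequence.

    labels: per-window cluster ids (e.g. one per 30 s hop);
    producing: set of cluster ids interpreted as power production;
    min_dwell: windows a state must persist to count (debouncing).
    """
    # collapse to a boolean producing / non-producing sequence
    state = [lab in producing for lab in labels]
    # run-length encode, then drop runs shorter than min_dwell
    runs = []
    for s in state:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    runs = [r for r in runs if r[1] >= min_dwell]
    # count falling edges (producing followed by non-producing)
    return sum(1 for a, b in zip(runs, runs[1:]) if a[0] and not b[0])
```

With a 30 s hop, one hour yields 120 labels, so events lasting only a few minutes remain resolvable where a 10 min average would smear them out.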

</sec>
<sec id="Ch1.S3.SS6">
  <label>3.6</label><title>Damage estimation from embeddings</title>
      <p id="d2e4915">Operational-state information is fundamental for wind turbine fatigue assessment. Damage-equivalent moment (DEM) estimation is traditionally performed using SCADA-based models trained on 10 min averages and calibrated on a limited number of strain-instrumented reference turbines. In this work, DEM estimation is used as an auxiliary validation task to assess whether the learned acceleration-derived embeddings preserve load-relevant information.</p>
<sec id="Ch1.S3.SS6.SSS1">
  <label>3.6.1</label><title>Baseline definition</title>
      <p id="d2e4925">In this study, the proposed method is compared against a predefined reference baseline. This baseline corresponds to the artificial neural-network-based fatigue estimation framework introduced in <xref ref-type="bibr" rid="bib1.bibx19" id="text.50"/>. It combines standard 10 min SCADA variables with acceleration-based features derived from simple statistical descriptors of vibration signals (e.g., RMS, variance, and standard deviation). The baseline reflects the current state of practice for farm-wide DEM estimation and is actively used in operational deployments</p>
      <p id="d2e4931">Importantly, the comparison is asymmetric by design. The baseline model is trained on a substantially longer dataset, namely a full year of data (and 2 full years in the internal implementation). In contrast, the proposed approach is trained on only 1000 h per turbine and does not use SCADA information at any stage, although it takes full leverage of the higher sampling rate. The baseline, as such, constitutes a best-case reference rather than an information parity comparator.</p>
      <p id="d2e4934">Therefore, the current work does not attempt to compare a SCADA-only approach against an acceleration-only approach but rather positions our approach in relation to the current state of the art (which uses SCADA and acceleration statistics). In a previous study (albeit undertaken for a different wind farm with different foundations), we have demonstrated how the addition of 10 min acceleration statistics to SCADA-only models improves fatigue prediction accuracy by several percentage points <xref ref-type="bibr" rid="bib1.bibx17" id="paren.51"/>. It is therefore against this baseline that we benchmark our approach.</p>
</sec>
<sec id="Ch1.S3.SS6.SSS2">
  <label>3.6.2</label><title>Training and evaluation protocol</title>
      <p id="d2e4948">Acceleration data are stored in 1 h files, whereas DEM values are available at 10 min resolution. Each acceleration file is segmented into six non-overlapping 10 min intervals, each associated with the corresponding DEM value computed over the same time window.</p>
      <p id="d2e4951">The encoder is kept fixed, and only an LSTM-based regression head is trained to map sequences of embeddings to DEM values. Training and evaluation follow a train-on-four/test-on-one cross-validation strategy across the five strain-instrumented fleet leader turbines; in the reported setup, the model is trained on FL1–FL4 and evaluated on FL5, which is entirely unseen during training. Evaluation is performed exclusively over the full summer period (June–September 2024), which includes a wide range of operational conditions and numerous stop–start events.</p>
</sec>
<sec id="Ch1.S3.SS6.SSS3">
  <label>3.6.3</label><title>Results on the downstream fatigue estimation task</title>
      <p id="d2e4962">Across all fleet leader turbines, the proposed acceleration-only approach achieves predictive performance that is comparable to or exceeds that of the SCADA-based baseline. Differences in <inline-formula><mml:math id="M216" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> typically remain within a narrow range of 0.01–0.02, while mean squared error is consistently similar or lower.</p>
      <p id="d2e4976">Given the strong advantage of the baseline in terms of training data volume and sensor availability, these results indicate that the learned embeddings preserve sufficient load-related information to support fatigue estimation without reliance on SCADA variables or manually engineered features.</p>
      <p id="d2e4979">Finally, fleet-wide applicability is supported by the turbine-invariant representations learned through domain-adversarial training. Pairwise mutual information analysis between embeddings and turbine identity (Fig. <xref ref-type="fig" rid="F5"/>) shows that turbine-specific information is largely suppressed for most units. Turbines 28 and 39 exhibit residual turbine-specific behavior and are therefore excluded from fleet-level fatigue estimation.</p>

<table-wrap id="T2"><label>Table 2</label><caption><p id="d2e4988">DEM prediction based on all strain-instrumented turbines: comparison between the SCADA-based legacy baseline and the proposed acceleration-only embedding approach. Each turbine was unseen during training (train-on-four/test-on-one). A higher <inline-formula><mml:math id="M217" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> is better, and a lower MSE is better.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="left"/>
     <oasis:colspec colnum="3" colname="col3" align="center"/>
     <oasis:colspec colnum="4" colname="col4" align="center"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Turbine</oasis:entry>
         <oasis:entry colname="col2">Model</oasis:entry>
         <oasis:entry colname="col3"><inline-formula><mml:math id="M218" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> (<inline-formula><mml:math id="M219" display="inline"><mml:mo lspace="0mm">↑</mml:mo></mml:math></inline-formula>)</oasis:entry>
         <oasis:entry colname="col4">MSE (10<sup>10</sup>, <inline-formula><mml:math id="M221" display="inline"><mml:mo>↓</mml:mo></mml:math></inline-formula>)</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">FL1</oasis:entry>
         <oasis:entry colname="col2">SCADA baseline</oasis:entry>
         <oasis:entry colname="col3">0.95</oasis:entry>
         <oasis:entry colname="col4">2.1</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Proposed approach</oasis:entry>
         <oasis:entry colname="col3">0.97</oasis:entry>
         <oasis:entry colname="col4">1.5</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">FL2</oasis:entry>
         <oasis:entry colname="col2">SCADA baseline</oasis:entry>
         <oasis:entry colname="col3">0.95</oasis:entry>
         <oasis:entry colname="col4">1.6</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Proposed approach</oasis:entry>
         <oasis:entry colname="col3">0.97</oasis:entry>
         <oasis:entry colname="col4">1.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">FL3</oasis:entry>
         <oasis:entry colname="col2">SCADA baseline</oasis:entry>
         <oasis:entry colname="col3">0.94</oasis:entry>
         <oasis:entry colname="col4">2.9</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Proposed approach</oasis:entry>
         <oasis:entry colname="col3">0.95</oasis:entry>
         <oasis:entry colname="col4">2.8</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">FL4</oasis:entry>
         <oasis:entry colname="col2">SCADA baseline</oasis:entry>
         <oasis:entry colname="col3">0.96</oasis:entry>
         <oasis:entry colname="col4">2.5</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Proposed approach</oasis:entry>
         <oasis:entry colname="col3">0.96</oasis:entry>
         <oasis:entry colname="col4">2.4</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">FL5</oasis:entry>
         <oasis:entry colname="col2">SCADA baseline</oasis:entry>
         <oasis:entry colname="col3">0.95</oasis:entry>
         <oasis:entry colname="col4">1.7</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2">Proposed approach</oasis:entry>
         <oasis:entry colname="col3">0.96</oasis:entry>
         <oasis:entry colname="col4">1.5</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
</sec>
</sec>
<sec id="Ch1.S4" sec-type="conclusions">
  <label>4</label><title>Conclusions</title>
      <p id="d2e5226">This study has demonstrated that high-frequency nacelle acceleration can serve as a reliable foundation for inferring wind turbine operational state when SCADA is unavailable, incomplete, or too coarse. By learning compact, turbine-invariant embeddings of short-time spectrograms, the proposed framework captured operational dynamics at sub-10 min resolution and aligned closely with supervisory variables despite never being trained on SCADA. Domain-adversarial training effectively reduced turbine-specific bias, enabling consistent cross-turbine structure and supporting deployment across a mainly homogeneous fleet without per-turbine training.</p>
      <p id="d2e5229">Discrete operational regimes derived from the embeddings provided an interpretable bridge to classical power curve analysis, allowing events such as starts, stops, and curtailments to be resolved at finer temporal scales than is possible with standard SCADA. In an auxiliary illustration, sequences of embeddings were further shown to predict damage-equivalent moments (DEMs) with competitive accuracy relative to a SCADA-based baseline, demonstrating that acceleration-derived representations can preserve load-relevant information needed for fatigue-related applications.</p>
      <p id="d2e5232">While fatigue estimation was not the primary focus of this work, these results indicate that the operational embeddings retain physically meaningful variability beyond regime identification. A dedicated investigation of the interaction between domain-adversarial regularization and fatigue prediction – quantifying the trade-off between turbine invariance and preservation of load- and site-specific effects such as soil structure interaction – remains an important direction for future research. Such a study would require a broader set of strain-instrumented turbines and is therefore left for future work.</p>
      <p id="d2e5235">Together, these findings establish acceleration-based operational embeddings as a practical and scalable complement to SCADA for structural health monitoring and performance analysis. While the present validation was performed on a single offshore farm, the results suggest a broader potential: cross-farm transfer, integration of physics-informed constraints, and tighter coupling of embeddings to load proxies are promising directions for future research. By leveraging ubiquitous accelerometers and modern representation learning, SCADA-free monitoring becomes a viable path toward richer, higher-resolution insight into turbine dynamics, unlocking new opportunities for condition assessment, fatigue extrapolation, and predictive maintenance across large wind fleets.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title>Hierarchical clustering from the precomputed MI dissimilarity <bold>D</bold></title>
      <p id="d2e5254">Clustering was performed directly on the precomputed turbine <inline-formula><mml:math id="M222" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> turbine matrix <bold>D</bold>, which encodes pairwise dissimilarity derived from mutual information (MI) between turbine identity and embeddings. Larger entries in <bold>D</bold> indicate lower similarity (stronger turbine-specific signatures), the matrix is symmetric with a zero diagonal, and units are bits.</p>
      <p id="d2e5270">Agglomerative hierarchical clustering with <italic>average linkage</italic> (UPGMA) was applied to <bold>D</bold>. The number of clusters was determined by the <italic>largest merge jump</italic> rule: the tree was cut at the midway between the two consecutive merges exhibiting the largest increase in linkage distance, yielding <inline-formula><mml:math id="M223" display="inline"><mml:mn mathvariant="normal">5</mml:mn></mml:math></inline-formula> clusters. Leaf labels were anonymized using the same mapping as in the main text. The linkage distance on the vertical axis shares the units of <bold>D</bold> (bits). The resulting partition is the one used to color the geographic layout map in Fig. <xref ref-type="fig" rid="F6"/>. The dendrogram below corresponds to embeddings trained without adversarial regularization (<inline-formula><mml:math id="M224" display="inline"><mml:mrow><mml:mi mathvariant="italic">γ</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0</mml:mn></mml:mrow></mml:math></inline-formula>).</p>

      <fig id="FA1"><label>Figure A1</label><caption><p id="d2e5309">Dendrogram from pairwise mutual-information dissimilarity <inline-formula><mml:math id="M225" display="inline"><mml:mi mathvariant="bold">D</mml:mi></mml:math></inline-formula> between turbines based on acceleration-derived embeddings (no DANN). Each merge height reflects the dissimilarity in bits; higher values indicate more distinct turbine dynamics. The five clusters obtained correspond to groups of turbines with similar vibration behavior as represented by the auto-encoder.</p></caption>
        
        <graphic xlink:href="https://wes.copernicus.org/articles/11/1363/2026/wes-11-1363-2026-f13.png"/>

      </fig>


</app>
  </app-group><notes notes-type="specialsection"><title>Methods</title>
    

      <p id="d2e5335">The authors used ChatGPT (version GPT-5) during the preparation of this work to improve the language. These tools were used to streamline the writing process but not to generate or interpret scientific content. All AI-assisted content was reviewed, edited, and verified by the authors as needed.</p>
  </notes><notes notes-type="codeavailability"><title>Code availability</title>

      <p id="d2e5341">The code used in this study is partially available. The implementation corresponding to the operational inference component of the proposed framework is publicly available at <uri>https://github.com/YacineBelHadj/operational_state_from_autoencoder</uri> (last access: 6 April 2026) (archived version: <ext-link xlink:href="https://doi.org/10.5281/zenodo.19439516" ext-link-type="DOI">10.5281/zenodo.19439516</ext-link>; <xref ref-type="bibr" rid="bib1.bibx3" id="altparen.52"/>). The implementation corresponding to the fatigue estimation component is publicly available at <uri>https://github.com/YacineBelHadj/dem_from_acceleration/releases/tag/v1.0.1</uri> (last access: 6 April 2026) (archived version: <ext-link xlink:href="https://doi.org/10.5281/zenodo.19440161" ext-link-type="DOI">10.5281/zenodo.19440161</ext-link>; <xref ref-type="bibr" rid="bib1.bibx4" id="altparen.53"/>). For confidentiality and security reasons, certain configuration files and credentials (e.g. API keys and environment-specific settings) have been removed. As a result, the repositories are not directly runnable in their published form. The complete internal pipeline used in this study, including project-specific data handling and execution infrastructure, is not publicly available.</p>
  </notes><notes notes-type="dataavailability"><title>Data availability</title>

      <p id="d2e5366">The data used in this study cannot be made publicly available. The acceleration data, SCADA data, and strain gauge measurements were provided by an industrial partner under a confidentiality agreement and are therefore not available for public release.</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e5372">Y. Bel-Hadj led the conceptualization, methodology design, software implementation, data curation, formal analysis, and original draft preparation. F. de Nolasco Santos contributed to the conceptualization, methodology design, and critical revision of the paper. W. Weijtjens contributed to the conceptualization, validation, and supervision. C. Devriendt provided resources, supervision, and project administration. All of the authors contributed to the paper review and editing.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e5378">The contact author has declared that none of the authors has any competing interests.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e5384">Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.</p>
  </notes><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e5393">The present research work is part of the WILLOW project, funded by the European Union with GA No. 1011122184.</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e5399">This paper was edited by Nikolay Dimitrov and reviewed by two anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Ajakan et al.(2014)Ajakan, Germain, Larochelle, Laviolette, and Marchand</label><mixed-citation>Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., and Marchand, M.: Domain-adversarial neural networks, arXiv preprint arXiv:1412.4446, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1412.4446" ext-link-type="DOI">10.48550/arXiv.1412.4446</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Avendano-Valencia et al.(2020)Avendano-Valencia, Chatzi, and Tcherniak</label><mixed-citation>Avendano-Valencia, L. D., Chatzi, E. N., and Tcherniak, D.: Gaussian process models for mitigation of operational variability in the structural health monitoring of wind turbines, Mech. Syst. Signal Pr., 142, 106686, <ext-link xlink:href="https://doi.org/10.1016/j.ymssp.2020.106686" ext-link-type="DOI">10.1016/j.ymssp.2020.106686</ext-link>,  2020.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Bel-Hadj(2026a)</label><mixed-citation>Bel-Hadj, Y.: YacineBelHadj/operational_state_from_autoencoder: operational_state_from_autoencoder_WES (v1.0.1), Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.19439517" ext-link-type="DOI">10.5281/zenodo.19439517</ext-link>, 2026a.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Bel-Hadj(2026b)</label><mixed-citation>Bel-Hadj, Y.: YacineBelHadj/dem_from_acceleration: DEM_from_acceleration (v1.0.1), Zenodo [code], <ext-link xlink:href="https://doi.org/10.5281/zenodo.19440161" ext-link-type="DOI">10.5281/zenodo.19440161</ext-link>, 2026b.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Bel-Hadj and Weijtjens(2022)</label><mixed-citation>Bel-Hadj, Y. and Weijtjens, W.: Anomaly detection in vibration signals for structural health monitoring of an offshore wind turbine, in: European Workshop on Structural Health Monitoring, pp. 348–358, Springer, <ext-link xlink:href="https://doi.org/10.1007/978-3-031-07322-9_36" ext-link-type="DOI">10.1007/978-3-031-07322-9_36</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Bel-Hadj et al.(2022)Bel-Hadj, Weijtjens, and de Nolasco Santos</label><mixed-citation>Bel-Hadj, Y., Weijtjens, W., and de Nolasco Santos, F.: Anomaly detection and representation learning in an instrumented railway bridge, in: ESANN, <ext-link xlink:href="https://doi.org/10.14428/esann/2022.ES2022-29" ext-link-type="DOI">10.14428/esann/2022.ES2022-29</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Bel-Hadj et al.(2025)Bel-Hadj, Weijtjens, and Devriendt</label><mixed-citation>Bel-Hadj, Y., Weijtjens, W., and Devriendt, C.: Structural health monitoring in a population of similar structures with self-supervised learning: a two-stage approach for enhanced damage detection and model tuning, Struct. Health Monit., p. 14759217251324194, <ext-link xlink:href="https://doi.org/10.1177/14759217251324194" ext-link-type="DOI">10.1177/14759217251324194</ext-link>,  2025.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Bengio et al.(2013)Bengio, Courville, and Vincent</label><mixed-citation>Bengio, Y., Courville, A., and Vincent, P.: Representation learning: A review and new perspectives, IEEE T. Pattern Anal., 35, 1798–1828, <ext-link xlink:href="https://doi.org/10.1109/TPAMI.2013.50" ext-link-type="DOI">10.1109/TPAMI.2013.50</ext-link>,  2013.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Bette et al.(2023)Bette, Wiedemann, Wächter, Freund, Peinke, and Guhr</label><mixed-citation>Bette, H. M., Wiedemann, C., Wächter, M., Freund, J., Peinke, J., and Guhr, T.: Dynamics of wind turbine operational states, arXiv preprint arXiv:2310.06098, <ext-link xlink:href="https://doi.org/10.48550/arXiv.2310.06098" ext-link-type="DOI">10.48550/arXiv.2310.06098</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Bull et al.(2020)Bull, Gardner, Gosliga, Dervilis, Papatheou, Maguire, Campos, Rogers, Cross, and Worden</label><mixed-citation>Bull, L. A., Gardner, P. A., Gosliga, J., Dervilis, N., Papatheou, E., Maguire, A. E., Campos, C., Rogers, T. J., Cross, E. J., and Worden, K.: Towards population-based structural health monitoring, Part I: Homogeneous populations and forms, in: Model Validation and Uncertainty Quantification, Volume 3: Proceedings of the 38th IMAC, A Conference and Exposition on Structural Dynamics 2020, pp. 287–302, Springer, <ext-link xlink:href="https://doi.org/10.1007/978-3-030-47638-0_32" ext-link-type="DOI">10.1007/978-3-030-47638-0_32</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Büth et al.(2025)Büth, Acharya, and Zanin</label><mixed-citation>Büth, C. M., Acharya, K., and Zanin, M.: infomeasure: a comprehensive Python package for information theory measures and estimators, Sci. Rep., 15, 29323, <ext-link xlink:href="https://doi.org/10.48550/arXiv.2505.14696" ext-link-type="DOI">10.48550/arXiv.2505.14696</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Byrne et al.(2019)Byrne, Burd, Zdravković, McAdam, Taborda, Houlsby, Jardine, Martin, Potts, and Gavin</label><mixed-citation>Byrne, B. W., Burd, H. J., Zdravković, L., McAdam, R. A., Taborda, D. M., Houlsby, G. T., Jardine, R. J., Martin, C. M., Potts, D. M., and Gavin, K. G.: PISA: new design methods for offshore wind turbine monopiles, Revue Française de Géotechnique, p. 3, <ext-link xlink:href="https://doi.org/10.1051/geotech/2019009" ext-link-type="DOI">10.1051/geotech/2019009</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Chu et al.(2019)Chu, Yuan, Xie, Pan, Wang, and Zhang</label><mixed-citation>Chu, J.-C., Yuan, L., Xie, F., Pan, L., Wang, X.-D., and Zhang, L.-Z.: Operational State Analysis of Wind Turbines Based on SCADA Data, in: 2nd International Conference on Electrical and Electronic Engineering (EEE 2019), pp. 169–173, Atlantis Press, <ext-link xlink:href="https://doi.org/10.2991/eee-19.2019.29" ext-link-type="DOI">10.2991/eee-19.2019.29</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Contreras and Murtagh(2015)</label><mixed-citation>Contreras, P. and Murtagh, F.: Hierarchical clustering, Handbook of cluster analysis, pp. 103–123, <ext-link xlink:href="https://doi.org/10.1201/b19706-11" ext-link-type="DOI">10.1201/b19706-11</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Cooley and Tukey(1965)</label><mixed-citation>Cooley, J. W. and Tukey, J. W.: An algorithm for the machine calculation of complex Fourier series, Math. Comput., 19, 297–301, 1965.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Daems et al.(2023)Daems, Peeters, Matthys, Verstraeten, and Helsen</label><mixed-citation>Daems, P.-J., Peeters, C., Matthys, J., Verstraeten, T., and Helsen, J.: Fleet-wide analytics on field data targeting condition and lifetime aspects of wind turbine drivetrains, Forsch. Ingenieurwes., 87, 285–295, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>d N Santos et al.(2022)d N Santos, Noppe, Weijtjens, and Devriendt</label><mixed-citation>d N Santos, F., Noppe, N., Weijtjens, W., and Devriendt, C.: Data-driven farm-wide fatigue estimation on jacket-foundation OWTs for multiple SHM setups, Wind Energ. Sci., 7, 299–321, <ext-link xlink:href="https://doi.org/10.5194/wes-7-299-2022" ext-link-type="DOI">10.5194/wes-7-299-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>de N Santos et al.(2023)de N Santos, D’Antuono, Robbelein, Noppe, Weijtjens, and Devriendt</label><mixed-citation>de N Santos, F., D’Antuono, P., Robbelein, K., Noppe, N., Weijtjens, W., and Devriendt, C.: Long-term fatigue estimation on offshore wind turbines interface loads through loss function physics-guided learning of neural networks, Renew. Energ., 205, 461–474, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>de N Santos et al.(2024)de N Santos, Noppe, Weijtjens, and Devriendt</label><mixed-citation>de N Santos, F., Noppe, N., Weijtjens, W., and Devriendt, C.: Farm-wide interface fatigue loads estimation: A data-driven approach based on accelerometers, Wind Energy, 27, 321–340, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>de Nolasco Santos et al.(2025)de Nolasco Santos, Bel-Hadj, Weijtjens, and Devriendt</label><mixed-citation>de Nolasco Santos, F., Bel-Hadj, Y., Weijtjens, W., and Devriendt, C.: Estimating Fatigue Through Latent Space Embedding of Acceleration in Offshore Wind Turbines, in: International Conference on Experimental Vibration Analysis for Civil Engineering Structures, pp. 943–951, Springer, <ext-link xlink:href="https://doi.org/10.1007/978-3-031-96106-9_96" ext-link-type="DOI">10.1007/978-3-031-96106-9_96</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Ganin and Lempitsky(2015)</label><mixed-citation>Ganin, Y. and Lempitsky, V.: Unsupervised domain adaptation by backpropagation, in: International conference on machine learning, pp. 1180–1189, PMLR, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1409.7495" ext-link-type="DOI">10.48550/arXiv.1409.7495</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Gardner et al.(2022)Gardner, Bull, Gosliga, Poole, Dervilis, and Worden</label><mixed-citation>Gardner, P., Bull, L. A., Gosliga, J., Poole, J., Dervilis, N., and Worden, K.: A population-based SHM methodology for heterogeneous structures: Transferring damage localisation knowledge between different aircraft wings, Mech. Syst. Signal Pr., 172, 108918, <ext-link xlink:href="https://doi.org/10.1016/j.ymssp.2022.108918" ext-link-type="DOI">10.1016/j.ymssp.2022.108918</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Ha and Schmidhuber(2018)</label><mixed-citation>Ha, D. and Schmidhuber, J.: Recurrent world models facilitate policy evolution, Adv. Neur. In., 31, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1809.01999" ext-link-type="DOI">10.48550/arXiv.1809.01999</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Hameed et al.(2009)Hameed, Hong, Cho, Ahn, and Song</label><mixed-citation>Hameed, Z., Hong, Y. S., Cho, Y. M., Ahn, S. H., and Song, C. K.: Condition monitoring and fault detection of wind turbines and related algorithms: A review, Adv. Mater. Res.-Switz., 13, 1–39, 2009.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Hinton and Salakhutdinov(2006)</label><mixed-citation>Hinton, G. E. and Salakhutdinov, R. R.: Reducing the dimensionality of data with neural networks, Science, 313, 504–507, <ext-link xlink:href="https://doi.org/10.1126/science.1127647" ext-link-type="DOI">10.1126/science.1127647</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Hlaing et al.(2024)Hlaing, Morato, Santos, Weijtjens, Devriendt, and Rigo</label><mixed-citation>Hlaing, N., Morato, P. G., Santos, F. d. N., Weijtjens, W., Devriendt, C., and Rigo, P.: Farm-wide virtual load monitoring for offshore wind structures via Bayesian neural networks, Struct. Health Monit., 23, 1641–1663, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>IEC(2016)</label><mixed-citation>IEC 61400-25-6: Communications for monitoring and control of wind power plants – Logical node classes and data classes for condition monitoring, <uri>https://webstore.iec.ch/publication/26226</uri> (last access: 6 April 2026), 2016.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>IEC(2019)</label><mixed-citation>IEC 61400-1: Wind energy generation systems – Part 1: Design requirements, <uri>https://webstore.iec.ch/publication/26423</uri> (last access: 6 April 2026), 2019.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Kingma and Ba(2014)</label><mixed-citation>Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1412.6980" ext-link-type="DOI">10.48550/arXiv.1412.6980</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>Korkos et al.(2022)Korkos, Linjama, Kleemola, and Lehtovaara</label><mixed-citation>Korkos, P., Linjama, M., Kleemola, J., and Lehtovaara, A.: Data annotation and feature extraction in fault detection in a wind turbine hydraulic pitch system, Renew. Energ., 185, 692–703, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>LeCun et al.(2015)LeCun, Bengio, and Hinton</label><mixed-citation>LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521, 436–444, <ext-link xlink:href="https://doi.org/10.1038/nature14539" ext-link-type="DOI">10.1038/nature14539</ext-link>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>Li et al.(2023)Li, Liu, and Xia</label><mixed-citation>Li, Z., Liu, Y., and Xia, Y.: Damage detection of bridges subjected to moving load based on domain-adversarial neural network considering measurement and model error, Eng. Struct., 293, 116601, <ext-link xlink:href="https://doi.org/10.1016/j.engstruct.2023.116601" ext-link-type="DOI">10.1016/j.engstruct.2023.116601</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Li et al.(2025)Li, Chen, Xu, and Huang</label><mixed-citation>Li, Z., Chen, Y., Xu, T., and Huang, H.: Cross-domain damage detection through partial conditional adversarial domain adaptation, Mech. Syst. Signal Pr., 225, 110118, <ext-link xlink:href="https://doi.org/10.1016/j.ymssp.2025.110118" ext-link-type="DOI">10.1016/j.ymssp.2025.110118</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Liu et al.(2021)Liu, Wang, Liu, Wang, Yao, and Abdelzaher</label><mixed-citation>Liu, D., Wang, T., Liu, S., Wang, R., Yao, S., and Abdelzaher, T.: Contrastive self-supervised representation learning for sensing signals from the time-frequency perspective, in: IEEE International Conference on Computer Communications (INFOCOM Workshops), pp. 1–6, IEEE, <ext-link xlink:href="https://doi.org/10.1109/ICCCN52240.2021.9522151" ext-link-type="DOI">10.1109/ICCCN52240.2021.9522151</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Mao et al.(2020)Mao, He, Li, and Yan</label><mixed-citation>Mao, W., He, J., Li, Y., and Yan, Y.: A new structured domain adversarial neural network for transfer fault diagnosis of rolling bearings under different working conditions, IEEE T. Instrum. Meas., 70, 1–13, <ext-link xlink:href="https://doi.org/10.1109/TIM.2020.3038596" ext-link-type="DOI">10.1109/TIM.2020.3038596</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Martakis et al.(2023)Martakis, Chatzi, Michalis, and Karapetrou</label><mixed-citation>Martakis, P., Chatzi, E., Michalis, I., and Karapetrou, S.: Fusing damage-sensitive features and domain adaptation towards robust damage classification in real buildings, Soil Dyn. Earthq. Eng., 166, 107739, <ext-link xlink:href="https://doi.org/10.3929/ethz-b-000593193" ext-link-type="DOI">10.3929/ethz-b-000593193</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>McInnes et al.(2018)McInnes, Healy, and Melville</label><mixed-citation>McInnes, L., Healy, J., and Melville, J.: UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction, arXiv preprint arXiv:1802.03426, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1802.03426" ext-link-type="DOI">10.48550/arXiv.1802.03426</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Ozturkoglu et al.(2024)Ozturkoglu, Ozcelik, and Günel</label><mixed-citation>Ozturkoglu, O., Ozcelik, O., and Günel, S.: Effects of Operational and Environmental Conditions on Estimated Dynamic Characteristics of a Large In-service Wind Turbine, J. Vib. Eng. Technol., 12, 803–824, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>Rahimi Taghanaki et al.(2023)</label><mixed-citation>Rahimi Taghanaki, F. et al.: Self-supervised human activity recognition with localized time-frequency contrastive representation learning, in: Proceedings of the 30th ACM International Conference on Multimedia, ACM, <ext-link xlink:href="https://doi.org/10.1145/3581783.3612063" ext-link-type="DOI">10.1145/3581783.3612063</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>Ranzato et al.(2012)Ranzato, Monga, Devin, Chen, Corrado, Dean, Le, and Ng</label><mixed-citation>Ranzato, M., Monga, R., Devin, M., Chen, K., Corrado, G., Dean, J., Le, Q. V., and Ng, A. Y.: Building high-level features using large scale unsupervised learning, in: Proceedings of the 29th International Conference on Machine Learning (ICML-12), pp. 81–88, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1112.6209" ext-link-type="DOI">10.48550/arXiv.1112.6209</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Singh et al.(2024)Singh, Dwight, and Viré</label><mixed-citation>Singh, D., Dwight, R., and Viré, A.: Probabilistic surrogate modeling of damage equivalent loads on onshore and offshore wind turbines using mixture density networks, Wind Energ. Sci., 9, 1885–1904, <ext-link xlink:href="https://doi.org/10.5194/wes-9-1885-2024" ext-link-type="DOI">10.5194/wes-9-1885-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Snover(2020)</label><mixed-citation>Snover, D.: Urban Seismic Noise Identified with Deep Embedded Clustering Using a Dense Array in Long Beach, CA, Master's thesis, University of California San Diego, <uri>https://noiselab.ucsd.edu/group/Thesis/DSnover_MastersThesis.pdf</uri> (last access: 6 April 2026), 2020.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Soares-Ramos et al.(2020)Soares-Ramos, de Oliveira-Assis, Sarrias-Mena, and Fernández-Ramírez</label><mixed-citation>Soares-Ramos, E. P., de Oliveira-Assis, L., Sarrias-Mena, R., and Fernández-Ramírez, L. M.: Current status and future trends of offshore wind power in Europe, Energy, 202, 117787, <ext-link xlink:href="https://doi.org/10.1016/j.energy.2020.117787" ext-link-type="DOI">10.1016/j.energy.2020.117787</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Tschannen et al.(2018)Tschannen, Bachem, and Lucic</label><mixed-citation>Tschannen, M., Bachem, O., and Lucic, M.: Recent advances in autoencoder-based representation learning, arXiv preprint arXiv:1812.05069, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1812.05069" ext-link-type="DOI">10.48550/arXiv.1812.05069</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>Vincent(2011)</label><mixed-citation>Vincent, P.: A connection between score matching and denoising autoencoders, Neural Comput., 23, 1661–1674, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Vincent et al.(2010)Vincent, Larochelle, Lajoie, Bengio, Manzagol, and Bottou</label><mixed-citation>Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.-A., and Bottou, L.: Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, J. Mach. Learn. Res., 11, <uri>https://www.jmlr.org/papers/volume11/vincent10a/vincent10a.pdf</uri> (last access: 6 April 2026), 2010.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Weijtjens et al.(2016)Weijtjens, Noppe, Verbelen, Iliopoulos, and Devriendt</label><mixed-citation>Weijtjens, W., Noppe, N., Verbelen, T., Iliopoulos, A., and Devriendt, C.: Offshore wind turbine foundation monitoring, extrapolating fatigue measurements from fleet leaders to the entire wind farm, in: Journal of Physics: Conference Series, vol. 753, p. 092018, IOP Publishing, <ext-link xlink:href="https://doi.org/10.1088/1742-6596/753/9/092018" ext-link-type="DOI">10.1088/1742-6596/753/9/092018</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Xie et al.(2016)Xie, Girshick, and Farhadi</label><mixed-citation>Xie, J., Girshick, R., and Farhadi, A.: Unsupervised deep embedding for clustering analysis, in: International conference on machine learning, pp. 478–487, PMLR, <ext-link xlink:href="https://doi.org/10.48550/arXiv.1511.06335" ext-link-type="DOI">10.48550/arXiv.1511.06335</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Zhao et al.(2020)Zhao, Pan, Huang, Miao, Jiang, and Wang</label><mixed-citation>Zhao, Y., Pan, J., Huang, Z., Miao, Y., Jiang, J., and Wang, Z.: Analysis of vibration monitoring data of an onshore wind turbine under different operational conditions, Eng. Struct., 205, 110071, <ext-link xlink:href="https://doi.org/10.1016/j.engstruct.2019.110071" ext-link-type="DOI">10.1016/j.engstruct.2019.110071</ext-link>, 2020.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Inferring wind turbine operational state and fatigue from high-frequency acceleration using self-supervised learning for SCADA (supervisory control and data acquisition)-free monitoring</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Ajakan et al.(2014)Ajakan, Germain, Larochelle, Laviolette, and
Marchand</label><mixed-citation>
      
Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., and Marchand, M.:
Domain-adversarial neural networks, arXiv preprint arXiv:1412.4446,
<a href="https://doi.org/10.48550/arXiv.1412.4446" target="_blank">https://doi.org/10.48550/arXiv.1412.4446</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Avendano-Valencia et al.(2020)Avendano-Valencia, Chatzi, and
Tcherniak</label><mixed-citation>
      
Avendano-Valencia, L. D., Chatzi, E. N., and Tcherniak, D.: Gaussian process
models for mitigation of operational variability in the structural health
monitoring of wind turbines, Mech. Syst. Signal Pr., 142,
106686, <a href="https://doi.org/10.1016/j.ymssp.2020.106686" target="_blank">https://doi.org/10.1016/j.ymssp.2020.106686</a>,  2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Bel-Hadj(2026a)</label><mixed-citation>
      
Bel-Hadj, Y.: YacineBelHadj/operational_state_from_autoencoder: operational_state_from_autoencoder_WES (v1.0.1), Zenodo [code], <a href="https://doi.org/10.5281/zenodo.19439517" target="_blank">https://doi.org/10.5281/zenodo.19439517</a>, 2026a.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Bel-Hadj(2026b)</label><mixed-citation>
      
Bel-Hadj, Y.: YacineBelHadj/dem_from_acceleration: DEM_from_acceleration (v1.0.1), Zenodo [code], <a href="https://doi.org/10.5281/zenodo.19440161" target="_blank">https://doi.org/10.5281/zenodo.19440161</a>, 2026b.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Bel-Hadj and Weijtjens(2022)</label><mixed-citation>
      
Bel-Hadj, Y. and Weijtjens, W.: Anomaly detection in vibration signals for
structural health monitoring of an offshore wind turbine, in: European
Workshop on Structural Health Monitoring, pp. 348–358, Springer, <a href="https://doi.org/10.1007/978-3-031-07322-9_36" target="_blank">https://doi.org/10.1007/978-3-031-07322-9_36</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Bel-Hadj et al.(2022)Bel-Hadj, Weijtjens, and
de Nolasco Santos</label><mixed-citation>
      
Bel-Hadj, Y., Weijtjens, W., and de Nolasco Santos, F.: Anomaly detection and
representation learning in an instrumented railway bridge, in: ESANN, <a href="https://doi.org/10.14428/esann/2022.ES2022-29" target="_blank">https://doi.org/10.14428/esann/2022.ES2022-29</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Bel-Hadj et al.(2025)Bel-Hadj, Weijtjens, and
Devriendt</label><mixed-citation>
      
Bel-Hadj, Y., Weijtjens, W., and Devriendt, C.: Structural health monitoring in
a population of similar structures with self-supervised learning: a two-stage
approach for enhanced damage detection and model tuning, Struct. Health
Monit., p. 14759217251324194, <a href="https://doi.org/10.1177/14759217251324194" target="_blank">https://doi.org/10.1177/14759217251324194</a>,  2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Bengio et al.(2013)Bengio, Courville, and
Vincent</label><mixed-citation>
      
Bengio, Y., Courville, A., and Vincent, P.: Representation learning: A review
and new perspectives, IEEE T. Pattern Anal., 35, 1798–1828, <a href="https://doi.org/10.1109/TPAMI.2013.50" target="_blank">https://doi.org/10.1109/TPAMI.2013.50</a>,  2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Bette et al.(2023)Bette, Wiedemann, Wächter, Freund, Peinke, and
Guhr</label><mixed-citation>
      
Bette, H. M., Wiedemann, C., Wächter, M., Freund, J., Peinke, J., and Guhr,
T.: Dynamics of wind turbine operational states, arXiv preprint
arXiv:2310.06098, <a href="https://doi.org/10.48550/arXiv.2310.06098" target="_blank">https://doi.org/10.48550/arXiv.2310.06098</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Bull et al.(2020)Bull, Gardner, Gosliga, Dervilis, Papatheou,
Maguire, Campos, Rogers, Cross, and Worden</label><mixed-citation>
      
Bull, L. A., Gardner, P. A., Gosliga, J., Dervilis, N., Papatheou, E., Maguire,
A. E., Campos, C., Rogers, T. J., Cross, E. J., and Worden, K.: Towards
population-based structural health monitoring, Part I: Homogeneous
populations and forms, in: Model Validation and Uncertainty Quantification,
Volume 3: Proceedings of the 38th IMAC, A Conference and Exposition on
Structural Dynamics 2020, pp. 287–302, Springer, <a href="https://doi.org/10.1007/978-3-030-47638-0_32" target="_blank">https://doi.org/10.1007/978-3-030-47638-0_32</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Büth et al.(2025)Büth, Acharya, and
Zanin</label><mixed-citation>
      
Büth, C. M., Acharya, K., and Zanin, M.: infomeasure: a comprehensive
Python package for information theory measures and estimators, Sci.
Rep., 15, 29323,
<a href="https://doi.org/10.48550/arXiv.2505.14696" target="_blank">https://doi.org/10.48550/arXiv.2505.14696</a>,  2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Byrne et al.(2019)Byrne, Burd, Zdravković, McAdam, Taborda,
Houlsby, Jardine, Martin, Potts, and Gavin</label><mixed-citation>
      
Byrne, B. W., Burd, H. J., Zdravković, L., McAdam, R. A., Taborda, D. M.,
Houlsby, G. T., Jardine, R. J., Martin, C. M., Potts, D. M., and Gavin,
K. G.: PISA: new design methods for offshore wind turbine monopiles, Revue
Française de Géotechnique, p. 3, <a href="https://doi.org/10.1051/geotech/2019009" target="_blank">https://doi.org/10.1051/geotech/2019009</a>,  2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Chu et al.(2019)Chu, Yuan, Xie, Pan, Wang, and
Zhang</label><mixed-citation>
      
Chu, J.-C., Yuan, L., Xie, F., Pan, L., Wang, X.-D., and Zhang, L.-Z.:
Operational State Analysis of Wind Turbines Based on SCADA Data, in: 2nd
International Conference on Electrical and Electronic Engineering (EEE 2019),
pp. 169–173, Atlantis Press, <a href="https://doi.org/10.2991/eee-19.2019.29" target="_blank">https://doi.org/10.2991/eee-19.2019.29</a>,
2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Contreras and Murtagh(2015)</label><mixed-citation>
      
Contreras, P. and Murtagh, F.: Hierarchical clustering, Handbook of cluster
analysis, pp. 103–123, <a href="https://doi.org/10.1201/b19706-11" target="_blank">https://doi.org/10.1201/b19706-11</a>,
2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Cooley and Tukey(1965)</label><mixed-citation>
      
Cooley, J. W. and Tukey, J. W.: An algorithm for the machine calculation of
complex Fourier series, Math. Comput., 19, 297–301, 1965.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>Daems et al.(2023)Daems, Peeters, Matthys, Verstraeten, and
Helsen</label><mixed-citation>
      
Daems, P.-J., Peeters, C., Matthys, J., Verstraeten, T., and Helsen, J.:
Fleet-wide analytics on field data targeting condition and lifetime aspects
of wind turbine drivetrains, Forsch. Ingenieurwes., 87, 285–295, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>d N Santos et al.(2022)d N Santos, Noppe, Weijtjens, and
Devriendt</label><mixed-citation>
      
d N Santos, F., Noppe, N., Weijtjens, W., and Devriendt, C.: Data-driven farm-wide fatigue estimation on jacket-foundation OWTs for multiple SHM setups, Wind Energ. Sci., 7, 299–321, <a href="https://doi.org/10.5194/wes-7-299-2022" target="_blank">https://doi.org/10.5194/wes-7-299-2022</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>de N Santos et al.(2023)de N Santos, D’Antuono, Robbelein, Noppe,
Weijtjens, and Devriendt</label><mixed-citation>
      
de N Santos, F., D’Antuono, P., Robbelein, K., Noppe, N., Weijtjens, W., and
Devriendt, C.: Long-term fatigue estimation on offshore wind turbines
interface loads through loss function physics-guided learning of neural
networks, Renew. Energ., 205, 461–474, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>de N Santos et al.(2024)de N Santos, Noppe, Weijtjens, and
Devriendt</label><mixed-citation>
      
de N Santos, F., Noppe, N., Weijtjens, W., and Devriendt, C.:
Farm-wide interface fatigue loads estimation: A data-driven approach based on
accelerometers, Wind Energy, 27, 321–340, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>de Nolasco Santos et al.(2025)de Nolasco Santos, Bel-Hadj, Weijtjens,
and Devriendt</label><mixed-citation>
      
de Nolasco Santos, F., Bel-Hadj, Y., Weijtjens, W., and Devriendt, C.:
Estimating Fatigue Through Latent Space Embedding of Acceleration in Offshore
Wind Turbines, in: International Conference on Experimental Vibration
Analysis for Civil Engineering Structures, pp. 943–951, Springer, <a href="https://doi.org/10.1007/978-3-031-96106-9_96" target="_blank">https://doi.org/10.1007/978-3-031-96106-9_96</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Ganin and Lempitsky(2015)</label><mixed-citation>
      
Ganin, Y. and Lempitsky, V.: Unsupervised domain adaptation by backpropagation,
in: International conference on machine learning, pp. 1180–1189, PMLR, <a href="https://doi.org/10.48550/arXiv.1409.7495" target="_blank">https://doi.org/10.48550/arXiv.1409.7495</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Gardner et al.(2022)Gardner, Bull, Gosliga, Poole, Dervilis, and
Worden</label><mixed-citation>
      
Gardner, P., Bull, L. A., Gosliga, J., Poole, J., Dervilis, N., and Worden, K.:
A population-based SHM methodology for heterogeneous structures: Transferring
damage localisation knowledge between different aircraft wings,
Mech. Syst. Signal Pr., 172, 108918, <a href="https://doi.org/10.1016/j.ymssp.2022.108918" target="_blank">https://doi.org/10.1016/j.ymssp.2022.108918</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Ha and Schmidhuber(2018)</label><mixed-citation>
      
Ha, D. and Schmidhuber, J.: Recurrent world models facilitate policy evolution,
Adv. Neur. In., 31, <a href="https://doi.org/10.48550/arXiv.1809.01999" target="_blank">https://doi.org/10.48550/arXiv.1809.01999</a>,  2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Hameed et al.(2009)Hameed, Hong, Cho, Ahn, and Song</label><mixed-citation>
      
Hameed, Z., Hong, Y. S., Cho, Y. M., Ahn, S. H., and Song, C. K.: Condition
monitoring and fault detection of wind turbines and related algorithms: A
review, Adv. Mater. Res.-Switz., 13, 1–39, 2009.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Hinton and Salakhutdinov(2006)</label><mixed-citation>
      
Hinton, G. E. and Salakhutdinov, R. R.: Reducing the dimensionality of data
with neural networks, Science, 313, 504–507, <a href="https://doi.org/10.1126/science.1127647" target="_blank">https://doi.org/10.1126/science.1127647</a>,
2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>Hlaing et al.(2024)Hlaing, Morato, Santos, Weijtjens, Devriendt, and
Rigo</label><mixed-citation>
      
Hlaing, N., Morato, P. G., Santos, F. d. N., Weijtjens, W., Devriendt, C., and
Rigo, P.: Farm-wide virtual load monitoring for offshore wind structures via
Bayesian neural networks, Struct. Health Monit., 23, 1641–1663, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>IEC(2016)</label><mixed-citation>
      
IEC 61400-25-6: Communications for monitoring and control of wind power plants
– Logical node classes and data classes for condition monitoring,
<a href="https://webstore.iec.ch/publication/26226" target="_blank"/>
(last access: 6 April 2026), 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>IEC(2019)</label><mixed-citation>
      
IEC 61400-1: Wind energy generation systems – Part 1: Design requirements,
<a href="https://webstore.iec.ch/publication/26423" target="_blank"/>
(last access: 6 April 2026), 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Kingma and Ba(2014)</label><mixed-citation>
      
Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv
preprint arXiv:1412.6980, <a href="https://doi.org/10.48550/arXiv.1412.6980" target="_blank">https://doi.org/10.48550/arXiv.1412.6980</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Korkos et al.(2022)Korkos, Linjama, Kleemola, and
Lehtovaara</label><mixed-citation>
      
Korkos, P., Linjama, M., Kleemola, J., and Lehtovaara, A.: Data annotation and
feature extraction in fault detection in a wind turbine hydraulic pitch
system, Renew. Energ., 185, 692–703, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>LeCun et al.(2015)LeCun, Bengio, and Hinton</label><mixed-citation>
      
LeCun, Y., Bengio, Y., and Hinton, G.: Deep learning, Nature, 521, 436–444,
<a href="https://doi.org/10.1038/nature14539" target="_blank">https://doi.org/10.1038/nature14539</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>Li et al.(2023)Li, Liu, and Xia</label><mixed-citation>
      
Li, Z., Liu, Y., and Xia, Y.: Damage detection of bridges subjected to moving
load based on domain-adversarial neural network considering measurement and
model error, Eng. Struct., 293, 116601,
<a href="https://doi.org/10.1016/j.engstruct.2023.116601" target="_blank">https://doi.org/10.1016/j.engstruct.2023.116601</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Li et al.(2025)Li, Chen, Xu, and Huang</label><mixed-citation>
      
Li, Z., Chen, Y., Xu, T., and Huang, H.: Cross-domain damage detection through
partial conditional adversarial domain adaptation, Mech. Syst.
Signal Pr., 225, 110118, <a href="https://doi.org/10.1016/j.ymssp.2025.110118" target="_blank">https://doi.org/10.1016/j.ymssp.2025.110118</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Liu et al.(2021)Liu, Wang, Liu, Wang, Yao, and
Abdelzaher</label><mixed-citation>
      
Liu, D., Wang, T., Liu, S., Wang, R., Yao, S., and Abdelzaher, T.: Contrastive
self-supervised representation learning for sensing signals from the
time-frequency perspective, in: IEEE International Conference on Computer
Communications (INFOCOM Workshops), pp. 1–6, IEEE,
<a href="https://doi.org/10.1109/ICCCN52240.2021.9522151" target="_blank">https://doi.org/10.1109/ICCCN52240.2021.9522151</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Mao et al.(2020)Mao, He, Li, and Yan</label><mixed-citation>
      
Mao, W., He, J., Li, Y., and Yan, Y.: A new structured domain adversarial
neural network for transfer fault diagnosis of rolling bearings under
different working conditions, IEEE T. Instrum.
Meas., 70, 1–13, <a href="https://doi.org/10.1109/TIM.2020.3038596" target="_blank">https://doi.org/10.1109/TIM.2020.3038596</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Martakis et al.(2023)Martakis, Chatzi, Michalis, and
Karapetrou</label><mixed-citation>
      
Martakis, P., Chatzi, E., Michalis, I., and Karapetrou, S.: Fusing
damage-sensitive features and domain adaptation towards robust damage
classification in real buildings, Soil Dyn. Earthq. Eng.,
166, 107739, <a href="https://doi.org/10.3929/ethz-b-000593193" target="_blank">https://doi.org/10.3929/ethz-b-000593193</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>McInnes et al.(2018)McInnes, Healy, and Melville</label><mixed-citation>
      
McInnes, L., Healy, J., and Melville, J.: UMAP: Uniform Manifold Approximation
and Projection for Dimension Reduction, arXiv preprint arXiv:1802.03426,
<a href="https://doi.org/10.48550/arXiv.1802.03426" target="_blank">https://doi.org/10.48550/arXiv.1802.03426</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Ozturkoglu et al.(2024)Ozturkoglu, Ozcelik, and
Günel</label><mixed-citation>
      
Ozturkoglu, O., Ozcelik, O., and Günel, S.: Effects of operational and
environmental conditions on estimated dynamic characteristics of a large
in-service wind turbine, J. Vib. Eng. Technol., 12, 803–824, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Rahimi Taghanaki et al.(2023)</label><mixed-citation>
      
Rahimi Taghanaki, F. et al.: Self-supervised human activity recognition with
localized time-frequency contrastive representation learning, in: Proceedings
of the 31st ACM International Conference on Multimedia, ACM,
<a href="https://doi.org/10.1145/3581783.3612063" target="_blank">https://doi.org/10.1145/3581783.3612063</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>Ranzato et al.(2012)Ranzato, Monga, Devin, Chen, Corrado, Dean, Le,
and Ng</label><mixed-citation>
      
Ranzato, M., Monga, R., Devin, M., Chen, K., Corrado, G., Dean, J., Le, Q. V.,
and Ng, A. Y.: Building high-level features using large scale unsupervised
learning, in: Proceedings of the 29th International Conference on Machine
Learning (ICML-12), pp. 81–88, <a href="https://doi.org/10.48550/arXiv.1112.6209" target="_blank">https://doi.org/10.48550/arXiv.1112.6209</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Singh et al.(2024)Singh, Dwight, and
Viré</label><mixed-citation>
      
Singh, D., Dwight, R., and Viré, A.: Probabilistic surrogate modeling of damage equivalent loads on onshore and offshore wind turbines using mixture density networks, Wind Energ. Sci., 9, 1885–1904, <a href="https://doi.org/10.5194/wes-9-1885-2024" target="_blank">https://doi.org/10.5194/wes-9-1885-2024</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Snover(2020)</label><mixed-citation>
      
Snover, D.: Urban Seismic Noise Identified with Deep Embedded Clustering Using
a Dense Array in Long Beach, CA, Master's thesis, University of California
San Diego,
<a href="https://noiselab.ucsd.edu/group/Thesis/DSnover_MastersThesis.pdf" target="_blank">https://noiselab.ucsd.edu/group/Thesis/DSnover_MastersThesis.pdf</a> (last access: 6 April 2026), 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Soares-Ramos et al.(2020)Soares-Ramos, de Oliveira-Assis,
Sarrias-Mena, and Fernández-Ramírez</label><mixed-citation>
      
Soares-Ramos, E. P., de Oliveira-Assis, L., Sarrias-Mena, R., and
Fernández-Ramírez, L. M.: Current status and future trends of
offshore wind power in Europe, Energy, 202, 117787, <a href="https://doi.org/10.1016/j.energy.2020.117787" target="_blank">https://doi.org/10.1016/j.energy.2020.117787</a>,  2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Tschannen et al.(2018)Tschannen, Bachem, and
Lucic</label><mixed-citation>
      
Tschannen, M., Bachem, O., and Lucic, M.: Recent advances in autoencoder-based
representation learning, arXiv preprint arXiv:1812.05069,
<a href="https://doi.org/10.48550/arXiv.1812.05069" target="_blank">https://doi.org/10.48550/arXiv.1812.05069</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Vincent(2011)</label><mixed-citation>
      
Vincent, P.: A connection between score matching and denoising autoencoders,
Neural Comput., 23, 1661–1674, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Vincent et al.(2010)Vincent, Larochelle, Lajoie, Bengio, and
Manzagol</label><mixed-citation>
      
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A.:
Stacked denoising autoencoders: Learning useful representations
in a deep network with a local denoising criterion,
J. Mach. Learn. Res., 11, 3371–3408, <a href="https://www.jmlr.org/papers/volume11/vincent10a/vincent10a.pdf" target="_blank">https://www.jmlr.org/papers/volume11/vincent10a/vincent10a.pdf</a> (last access: 6 April 2026), 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>Weijtens et al.(2016)Weijtens, Noppe, Verbelen, Iliopoulos, and
Devriendt</label><mixed-citation>
      
Weijtens, W., Noppe, N., Verbelen, T., Iliopoulos, A., and Devriendt, C.:
Offshore wind turbine foundation monitoring, extrapolating fatigue
measurements from fleet leaders to the entire wind farm, in: Journal of
Physics: Conference Series, vol. 753, p. 092018, IOP Publishing, <a href="https://doi.org/10.1088/1742-6596/753/9/092018" target="_blank">https://doi.org/10.1088/1742-6596/753/9/092018</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Xie et al.(2016)Xie, Girshick, and Farhadi</label><mixed-citation>
      
Xie, J., Girshick, R., and Farhadi, A.: Unsupervised deep embedding for
clustering analysis, in: International Conference on Machine Learning, pp.
478–487, PMLR, <a href="https://doi.org/10.48550/arXiv.1511.06335" target="_blank">https://doi.org/10.48550/arXiv.1511.06335</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>Zhao et al.(2020)Zhao, Pan, Huang, Miao, Jiang, and
Wang</label><mixed-citation>
      
Zhao, Y., Pan, J., Huang, Z., Miao, Y., Jiang, J., and Wang, Z.: Analysis of
vibration monitoring data of an onshore wind turbine under different
operational conditions, Eng. Struct., 205, 110071, <a href="https://doi.org/10.1016/j.engstruct.2019.110071" target="_blank">https://doi.org/10.1016/j.engstruct.2019.110071</a>, 2020.

    </mixed-citation></ref-html>--></article>
