the Creative Commons Attribution 4.0 License.
Remote Diagnostics for Power Converter Faults in Wind Turbines Based on Converter Control System Data
Abstract. Power converters are among the most frequently failing subsystems of onshore and offshore wind turbines. In order to minimize the resulting downtime and production losses, the time to repair should be as low as possible. In practice, however, it is not uncommon for several turbine visits to be necessary, as information about the failure mode and the spare parts required can often only be determined on site. This paper presents a data-driven, interpretable workflow for the remote diagnosis of power-converter–related turbine shutdowns using converter control system data from an offshore wind farm. The study uses converter-fault events and three data sources: high-resolution fast logs (4.5 kHz, −350 ms to +200 ms around a fault-induced trigger), 1-min operating data, and fault flags derived from event log data. From an initial 864 engineered features we remove low-variance and highly correlated features, apply a subsampled decision-tree inclusion-rate filter to retain 34 features, and estimate diagnostic impact via subsampled logistic regression. Results show that fast-log features and converter fault flags contain the most predictive information for classifying standstill severity after a fault-induced shutdown, while low-resolution operating data contribute little. Using four of the derived features yields the best cross-validated performance in a decision tree with an accuracy of 0.89 and an F1-score of 0.86. The proposed approach is practical for industry use and offers the potential to provide explainable decision support for improving first-time fix rate.
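To make the first two reduction steps named in the abstract concrete (dropping low-variance and highly correlated features), the following is a minimal sketch in plain Python. It is not the authors' implementation: the thresholds `var_min` and `corr_max`, the greedy keep-first strategy, and the demo feature names are assumptions chosen for illustration, and the later decision-tree and logistic-regression stages are not covered.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def prefilter(features, var_min=1e-8, corr_max=0.95):
    """Return the names of features that survive both filters.

    features: dict mapping feature name -> list of values, one per fault event.
    var_min and corr_max are assumed thresholds, not values from the paper.
    """
    kept = []
    for name, values in features.items():
        # Step 1: drop (near-)constant features.
        if pvariance(values) <= var_min:
            continue
        # Step 2: drop features strongly correlated with one already kept.
        if any(abs(pearson(values, features[k])) > corr_max for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical toy data, for illustration only.
demo = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with "a"
    "c": [5.0, 5.0, 5.0, 5.0],    # constant
    "d": [1.0, -1.0, 1.0, -1.0],
}
selected = prefilter(demo)
```

In this sketch the order of the input dictionary decides which member of a correlated pair is retained; a real pipeline might instead keep the feature with the stronger relation to the target.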
Status: final response (author comments only)
- RC1: 'Comment on wes-2025-186', Anonymous Referee #1, 25 Oct 2025
- CC1: 'Comment on wes-2025-186', Dingrui Li, 05 Dec 2025
This paper uses a data-driven approach for converter fault diagnostics. The results are practical and useful for industrial applications, but the academic innovation is limited. Here are my detailed comments:
- The proposed approach utilized the data from converter control, which may not be available in other scenarios. In real applications, it is common that only the converter vendors have access to the control data. How will the converter customers utilize the proposed approach?
- The authors utilized the Park Transformation to convert the data to dq coordinates, which can transform the three-phase AC balanced components to DC components. However, during fault conditions, the converter output may not be balanced; as a result, the Park Transformation may not lead to DC components. Will the unbalanced components affect the data analysis?
- The decision trees and regression are not new approaches. Can the authors highlight the innovation of the proposed approach?
- For the results, it would be better if the system configuration or parameters were introduced.
Disclaimer: this community comment is written by an individual and does not necessarily reflect the opinion of their employer.
Citation: https://doi.org/10.5194/wes-2025-186-CC1
- RC2: 'Comment on wes-2025-186', Anonymous Referee #2, 29 Dec 2025
This is an interesting paper, both in its methodology and in the data available for the data-driven prediction of fault types in a wind turbine. While reading it, I noted the following:
- I miss a system diagram showing where the data come from in the wind turbine; it would be helpful to see all the defined variables in such a figure.
- The paper concludes that only the fast-sampled data are useful, but it is not clear from the figures how this can be seen; a little more detail would help.
- An initial discussion of how generally the applied structure can be transferred to other turbines would be valuable.
Citation: https://doi.org/10.5194/wes-2025-186-RC2
- AC1: 'Comment on wes-2025-186', Timo Lichtenstein, 13 Feb 2026
We would like to thank all reviewers for their constructive and insightful comments. Their suggestions have helped us to improve the clarity and quality of the manuscript substantially. See the detailed answers to each reviewer below.
Answer to RC1:
Thank you very much for your constructive and detailed comments. We carefully considered all comments and respond to each point in detail below.
First, we fully agree that a sample size of about 100 fault events is limited for machine learning purposes and inherently raises concerns about overfitting and generalizability. For this reason, we deliberately applied a methodology that reduces the feature space and uses models with relatively few parameters (logistic regression and decision trees). The intention of this work is primarily to demonstrate that diagnostically relevant information is indeed present in the available data and that meaningful fault classification is possible in principle. More extensive validation and deeper analyses, including assessments on different wind farms, turbine types, or converter manufacturers, will only be feasible once data from additional wind farms, turbine types, and converter control systems become available.
Regarding the question of why the four most relevant features are particularly informative, a detailed technical interpretation of the selected features and their direct relation to specific physical failure mechanisms would be largely speculative at this stage. Our approach is intentionally designed as a purely data-driven method. The core idea is to exploit the available data to identify patterns that may not yet be understood or documented as distinct fault mechanisms, and in this way provide the user with additional decision support beyond the diagnostic capabilities of the converter itself. We also acknowledge that the number of features from different sources and their concise descriptions are not self-explanatory, and that a full physical explanation of every feature is not feasible within the scope of this paper. Nevertheless, the four most significant features can be briefly summarized as follows:
- FL: LSC current setpoint entropy (W3) measures the irregularity (information content) of the LSC current setpoint in time window W3; increased entropy indicates stronger fluctuations in the commanded current.
- FL: LSC actuating space vector length slope (W0) describes the trend (slope) of the PWM actuating signal amplitude in the first time window W0, where a non-zero slope reflects systematic changes in the converter’s actuation behavior.
- FF: High temperature LV filter is a fault flag indicating that the low-voltage filter temperature exceeds its threshold; it points to elevated temperatures in the filter components, which may result from either increased power losses or cooling issues.
- FF: Control system is a fault flag indicating an internal control system issue.
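The two fast-log feature types in the list above (an entropy and a slope over a time window) can be illustrated with a short sketch. The response does not state which entropy definition or fitting method was used, so the histogram-based Shannon entropy, the bin count, and the least-squares slope below are assumptions; only the 4.5 kHz sampling rate comes from the abstract.

```python
from math import log2

FS = 4500.0  # fast-log sampling rate in Hz (4.5 kHz, as stated in the abstract)


def shannon_entropy(samples, bins=8):
    """Histogram-based Shannon entropy in bits (assumed definition; bins is arbitrary)."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant signal carries no irregularity
    counts = [0] * bins
    for s in samples:
        # Map each sample to a bin; the maximum falls into the last bin.
        idx = min(int((s - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts if c)


def ls_slope(samples, dt=1.0 / FS):
    """Least-squares slope of the samples over time, in signal units per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a slope")
    t = [i * dt for i in range(n)]
    t_mean = sum(t) / n
    y_mean = sum(samples) / n
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, samples))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den


# Hypothetical window contents, for illustration only.
flat_window = [1.0] * 10
ramp_window = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

Under these assumptions, a flat window yields zero entropy and zero slope, while a window whose samples spread evenly over all bins yields the maximum entropy of `log2(bins)`.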
Concerning the use of “time until restart” as a proxy for fault severity, it is correct that, without service data as ground truth, an assessment of fault severity can be influenced by several external factors. However, it should be noted that, in consultation with the wind farm operator, we set the threshold to 1 hour. This period is, on the one hand, sufficiently long to exclude remotely resettable faults, while, on the other hand, being short enough for reactive maintenance measures not to be carried out yet. We therefore consider this a pragmatic compromise under the given data constraints.
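The labeling rule described here can be sketched as follows. The 1-hour threshold comes from the text; the label names, the function name, the handling of a standstill of exactly one hour, and the example timestamps are assumptions made for illustration.

```python
from datetime import datetime, timedelta

RESTART_THRESHOLD = timedelta(hours=1)  # threshold stated in the response


def standstill_label(shutdown, restart, threshold=RESTART_THRESHOLD):
    """Label a fault event by its time until restart.

    Returns "long" if the turbine stayed down longer than the threshold,
    otherwise "short". Treating exactly one hour as "short" is an assumption.
    """
    return "long" if restart - shutdown > threshold else "short"


# Hypothetical timestamps, for illustration only.
fault_time = datetime(2024, 1, 1, 12, 0)
quick_reset = standstill_label(fault_time, datetime(2024, 1, 1, 12, 20))
long_outage = standstill_label(fault_time, datetime(2024, 1, 1, 15, 0))
```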
With respect to how the model would operate in practice, this algorithm is not intended for continuous data monitoring. As the datasets used in this work are generated at the time of trigger events of the converter, the approach is designed as a remote diagnostic tool to support service operators in their prioritization and decision-making. In other words, the model is activated in connection with converter trigger events rather than continuously classifying normal operational data. At the same time, the results of this work can be seen as a precursor for future concepts involving continuous monitoring of the high-resolution data of the converter control.
Finally, regarding the comparison with simpler baseline methods and existing practice, we have carried out the same analysis using only event data and the available operating data, without the fast log information. In this case, we obtain R² values and F1 scores of around 0.6. These values indicate a limited diagnostic performance and highlight the need for the richer information content provided by the fast logs. We believe this comparison clearly illustrates the added value of the proposed multi-source, high-resolution data-driven approach over more conventional, flag-based diagnostics.
Answer to RC2:
Thank you very much for your constructive and insightful feedback on our manuscript. We carefully addressed all comments and revised the paper accordingly. Below, we respond to each point in detail.
In response to your request for a clearer overview of where the different data originate, we have added a system-level diagram (new Fig. 1) that illustrates the data flow. This figure shows from which parts of the system the signals are obtained (turbine control vs. converter control) and how the operating data (OD), fast logs (FL), and event data (OS, FF) enter and are processed within our proposed workflow.
Regarding your comment that the conclusion about the usefulness of the fast-sampled data is not clearly visible in the figures, we have clarified this in the manuscript. Features with higher absolute logistic regression coefficients (see Tab. 2 and Fig. 3 (old Fig. 2)) are considered to have a greater impact on model performance. These features are then added to the model in descending order of importance (see Fig. 4b (old Fig. 4b)). Among the first 16 selected features, only one originates from OD and one from OS; all remaining features are derived from FF and FL data. We now point this out more explicitly in the text to make the dominance of the fast-sampled sources easier to see.
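The ranking argument in this paragraph can be sketched in a few lines: order features by absolute coefficient, then count how many of the top-ranked ones come from each data source. The coefficient values and feature names below are hypothetical placeholders, not results from the paper, and the "SOURCE: name" naming scheme is assumed from the feature list given in the answer to RC1.

```python
from collections import Counter

# Hypothetical coefficients for illustration only; not values from the paper.
coefficients = {
    "FL: feature_a": 1.8,
    "FF: feature_b": -1.2,
    "OD: feature_c": 0.1,
    "FL: feature_d": -0.9,
    "OS: feature_e": 0.3,
}


def rank_by_abs_coefficient(coefs):
    """Feature names sorted by descending absolute coefficient."""
    return sorted(coefs, key=lambda name: abs(coefs[name]), reverse=True)


def source_counts(ranked, top_k):
    """Count data sources (the prefix before ':') among the top_k ranked features."""
    return Counter(name.split(":")[0] for name in ranked[:top_k])


ranking = rank_by_abs_coefficient(coefficients)
top3_sources = source_counts(ranking, 3)
```

Comparing absolute coefficients in this way is only meaningful if the features were scaled to comparable ranges before fitting, which this sketch takes for granted.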
Finally, in response to your suggestion to discuss the generality of the proposed structure, we have added a comment in the summary stating that the approach can, in principle, be applied to any turbine system where trigger data from the converter control are available.
Answer to CC1:
Thank you very much for your constructive comments and for recognizing the practical relevance of our work. We have carefully addressed your comments and revised the paper accordingly. Below, we respond to each point in detail.
Regarding the availability of converter control data, it is clear to us that the use of these data is not a typical use case in day-to-day technical operation. For the dataset shown in this study, however, the fault flags (FF) and high-resolution fast log data (FL) around the trigger events were readily available to the operator. In addition, 1‑minute operating data (OD) can nowadays typically be acquired, if desired. Nevertheless, the research paper is generally intended to showcase the possibilities of advanced diagnostics with access to these extended data sources.
Concerning the use of the Park transformation, we would like to point out that there was indeed an error in the manuscript. Instead of the Park transformation, the Clarke transformation has been used, resulting in a rotating space vector from which the magnitude and angular velocity were calculated; this was corrected for the resubmitted version. Furthermore, it is correct that a balanced voltage system leads to steady (DC) quantities. Nevertheless, after the transformation (either Park or Clarke), we still obtain a time series. Under fault conditions these signals can indeed become partly chaotic. It is precisely these deviations from steady-state behavior that allow the different fault severities to be distinguished. An explanation to this effect has been added to the manuscript.
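As an illustration of the processing described here, the sketch below applies a Clarke transformation to three-phase samples and derives the magnitude and angular velocity of the resulting space vector. The amplitude-invariant scaling, the function names, and the synthetic 50 Hz test signal are assumptions; the response does not specify the scaling convention or the signals involved. Only the 4.5 kHz sampling rate is taken from the abstract.

```python
import math

FS = 4500.0  # fast-log sampling rate in Hz (4.5 kHz, as stated in the abstract)


def clarke(a, b, c):
    """Amplitude-invariant Clarke transformation (assumed scaling convention)."""
    alpha = (2.0 * a - b - c) / 3.0
    beta = (b - c) / math.sqrt(3.0)
    return alpha, beta


def space_vector_features(xa, xb, xc, fs=FS):
    """Return (magnitudes, angular velocities in rad/s) of the space vector."""
    mags, angles = [], []
    for a, b, c in zip(xa, xb, xc):
        alpha, beta = clarke(a, b, c)
        mags.append(math.hypot(alpha, beta))
        angles.append(math.atan2(beta, alpha))
    omegas = []
    for prev, cur in zip(angles, angles[1:]):
        # Wrap the angle step into [-pi, pi) before scaling by the sample rate.
        step = (cur - prev + math.pi) % (2.0 * math.pi) - math.pi
        omegas.append(step * fs)
    return mags, omegas


# Synthetic balanced 50 Hz system with amplitude 10, for illustration only.
theta = [2.0 * math.pi * 50.0 * k / FS for k in range(90)]
xa = [10.0 * math.cos(t) for t in theta]
xb = [10.0 * math.cos(t - 2.0 * math.pi / 3.0) for t in theta]
xc = [10.0 * math.cos(t + 2.0 * math.pi / 3.0) for t in theta]
mags, omegas = space_vector_features(xa, xb, xc)
```

For the balanced test signal both outputs are constant (magnitude 10, angular velocity 2π·50 rad/s); imbalance or transients during a fault show up as deviations from these constant values in the two time series.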
With respect to the machine learning methods, we agree that decision trees and regression are not new approaches. In this work, however, we focus on a systematic investigation of the diagnostic possibilities when extended converter control data are available, which to our knowledge has not been studied in this form before. Therefore, the innovation lies in the data-driven analysis of converter control signals and in the findings derived from this analysis, rather than in developing new machine learning techniques.
Finally, regarding the request to introduce system configuration or parameters, the proposed methodology is designed to be system-agnostic. Consequently, the detailed configuration of the specific wind turbines and converters used is not essential for understanding or applying the approach and would mainly increase the complexity and length of the paper.
Citation: https://doi.org/10.5194/wes-2025-186-AC1
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 428 | 207 | 34 | 669 | 34 | 41 |
This paper proposes a data-driven workflow for remote diagnostics of power converter faults in wind turbines using multi-source data fusion and machine learning models. This is an interesting and industrially relevant area that has received limited attention in the literature. However, there are several concerns that the authors need to address before the paper is suitable for publication: