Non-stationarity in correlation matrices for wind turbine SCADA-data and implications for failure detection

Bette, Henrik M.; Jungblut, Edgar; Guhr, Thomas

doi:https://doi.org/10.5194/wes-2021-107

Preprints

01 Oct 2021

Status: this preprint was under review for the journal WES but the revision was not accepted.

Abstract. Modern utility-scale wind turbines are equipped with a Supervisory Control And Data Acquisition (SCADA) system gathering vast amounts of operational data that can be used for failure analysis and prediction to improve operation and maintenance of turbines. We analyse high-frequency SCADA-data from the Thanet offshore wind park in the UK and evaluate Pearson correlation matrices for a variety of observables with a moving time window. This makes it possible to assess non-stationarity in the mutual dependencies of different types of data. Drawing from our experience in other complex systems, such as financial markets and traffic, we show this by employing a hierarchical k-means clustering algorithm on the correlation matrices. The different clusters exhibit distinct typical correlation structures, which we refer to as states. Looking first at only one and later at multiple turbines, the main dependence of these states is shown to be on wind speed. Accordingly, we identify them as operational states arising from different settings in the turbine control system based on the available wind speed. We model the boundary wind speeds separating the states based on the clustering solution. This allows the usage of our methodology for failure analysis or prediction by sorting new data based on wind speed and comparing it to the respective operational state, thereby taking the non-stationarity into account.
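The pipeline summarized in the abstract (correlation matrices per time window, then clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses synthetic data, non-overlapping windows, and plain k-means with a deterministic farthest-first initialization swapped in for the hierarchical k-means variant named in the abstract. All function names, the window length, and the toy data model are assumptions made for illustration.

```python
import numpy as np


def epoch_correlations(data, window):
    """Pearson correlation matrix for each non-overlapping window of rows."""
    n_epochs = data.shape[0] // window
    return np.array([
        np.corrcoef(data[i * window:(i + 1) * window], rowvar=False)
        for i in range(n_epochs)
    ])


def farthest_first_centers(points, k):
    """Deterministic initialization: start at points[0], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    for _ in range(k - 1):
        dist = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[dist.argmax()])
    return np.array(centers)


def kmeans(points, k, n_iter=50):
    """Plain k-means (Lloyd's algorithm); returns labels and centers."""
    centers = farthest_first_centers(points, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic stand-in for SCADA observables (assumption, not real data):
# three observables driven by one common factor; in the second regime the
# third observable decouples from the other two.
rng = np.random.default_rng(0)
window, epochs_per_regime = 500, 20


def regime(n_rows, coupled):
    common = rng.normal(size=(n_rows, 1))
    data = common + 0.1 * rng.normal(size=(n_rows, 3))
    if not coupled:
        data[:, 2] = rng.normal(size=n_rows)
    return data


data = np.vstack([
    regime(window * epochs_per_regime, coupled=True),
    regime(window * epochs_per_regime, coupled=False),
])

corr = epoch_correlations(data, window)      # one matrix per epoch
iu = np.triu_indices(corr.shape[1], k=1)     # independent off-diagonal entries
features = corr[:, iu[0], iu[1]]             # one feature vector per epoch
labels, centers = kmeans(features, k=2)
print(labels)
```

Each cluster center, mapped back to matrix form, plays the role of a typical correlation structure (a "state") in this toy setting. With real data, the number of clusters and the window length would have to be chosen and validated rather than fixed in advance.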

Received: 15 Sep 2021 – Discussion started: 01 Oct 2021

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: closed

RC1:
'Comment on wes-2021-107', Anonymous Referee #1, 31 Oct 2021
In this paper, the authors have identified different normal operational states of wind turbines using SCADA data. They have used Pearson correlation matrix data and a k-means clustering algorithm to identify operational states without prior knowledge of the control system. It has been shown that the primary dependence of the recognized states is on wind speed. The model is well-structured, and the results sound promising. The writing quality of the paper is also good. The reviewer has just one comment that the authors should address:

The proposed method has been utilized for normal operational state recognition based on different wind speed intervals. The authors mention in several places throughout the paper, as well as in the title, that their methodology can be applied for failure detection. However, it is unclear how the probable failures can be detected using the proposed method based on wind speed. Please clarify this issue.
Citation: https://doi.org/10.5194/wes-2021-107-RC1
- AC1: 'Reply on RC1', Henrik M. Bette, 20 Jan 2022
  
  We are very grateful for the concise summary and the referee’s valuable comment.
  We recognize that the point brought up by the referee can lead to misunderstandings. Our method cannot be used directly for failure analysis. It allows the automated distinction between different operational states without prior knowledge about the control mechanisms of the turbine. We are convinced that this distinction will aid other failure detection methods, especially those directly dependent on the correlation structure, by separating the space of what is normal into three smaller subspaces, from which deviations could be detected. This is, however, just an implication and not shown in the current paper.
  
  Citation: https://doi.org/10.5194/wes-2021-107-AC1
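The reply above describes splitting "normal" behaviour into three smaller subspaces from which deviations could be detected, and stresses that this benefit is implied rather than shown. A minimal sketch of what such a check could look like follows: an epoch is assigned to a state by its mean wind speed and its correlation matrix is compared with a reference matrix for that state. The boundary values, the reference matrix, the Frobenius distance, and all names are hypothetical choices for illustration and do not come from the paper.

```python
import numpy as np

# Hypothetical boundary wind speeds in m/s separating three states.
# Placeholder values for illustration, not results from the paper.
BOUNDARIES = (6.0, 12.0)


def state_of(mean_wind_speed, boundaries=BOUNDARIES):
    """Index (0, 1 or 2) of the state an epoch's mean wind speed falls into."""
    return int(np.searchsorted(boundaries, mean_wind_speed, side="right"))


def correlation_distance(epoch, reference):
    """Frobenius norm between an epoch's correlation matrix and a reference."""
    corr = np.corrcoef(epoch, rowvar=False)
    return float(np.linalg.norm(corr - reference))


rng = np.random.default_rng(1)
n_rows, n_obs = 500, 3

# Assumed reference for state 1: all observables strongly coupled.
# rho is the exact correlation of the toy model used below.
rho = 1.0 / 1.01
references = {1: np.full((n_obs, n_obs), rho)}
np.fill_diagonal(references[1], 1.0)

# Toy epochs: one consistent with the reference, one where the third
# observable has decoupled from the others.
common = rng.normal(size=(n_rows, 1))
normal_epoch = common + 0.1 * rng.normal(size=(n_rows, n_obs))
odd_epoch = normal_epoch.copy()
odd_epoch[:, 2] = rng.normal(size=n_rows)

state = state_of(9.0)  # mean wind speed of the epoch, assumed known
d_normal = correlation_distance(normal_epoch, references[state])
d_odd = correlation_distance(odd_epoch, references[state])
print(state, d_normal, d_odd)
```

Comparing against a state-specific reference, rather than one reference for all wind speeds, is the design choice the reply argues for. Whether it improves detection in practice is left open by the authors, and any alarm threshold on the distance would need to be calibrated on real data.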
RC2:
'Comment on wes-2021-107', Anonymous Referee #2, 23 Dec 2021
General comments

The authors introduce an automated methodology to classify SCADA data into different operational regimes by defining wind speed boundaries based on the observed correlations between variables. While the proposed methodology seems interesting, the benefit of using this method seems to be considered as a given. The paper also contains some statements that should be further investigated and clarified. These are discussed as specific comments.

Specific comments

Line 100 – The investigated set of variables is strange if the main goal of the paper is to identify the different operational regimes of the wind turbine. For example, a set of directly coupled variables (rotor and generator speed/line current and active power) are chosen, whereas the pitch angle of the blades is not. The latter is the most significant control variable of the turbine above nominal wind speeds, yet is not considered in the analysis.

Line 263 – “In cluster 2 this changes and the observables RotorRPM and GeneratorRPM decouple from the others.” This should be investigated, as a gearbox with a fixed transmission ratio should be the only component between the two tachometers measuring the rotational speeds. It therefore does not make sense from a mechanical point of view that these two variables can decouple from one another.

Line 287 – “the response time of the turbine controller is finite.” While this is true and while this can play a role during some events, the fact that 30-minute non-overlapping windows are used surely plays a much larger role than this, especially since only windows without alarms or downtime are analyzed.

Figure 5 – The fact that no pitch angles are considered for the classification of the operational regimes seems to result in a significant number of misclassified points, given that no post-processing (section 5) is done. Is this resolved by extending the set of considered variables?

Line 333 – The fact that the estimated rated wind speed seems to be 11.8% higher than the one warranted by the manufacturer should be justified. This deviation is quite significant, and it does raise the question of whether these two wind speeds are directly comparable, or whether the method intrinsically overestimates the threshold for classification. While the evolution of states over time is shown in multiple figures in the paper, these figures do not inherently convey any information without knowing the wind speed at a given time. In fact, the different control regimes of a wind turbine can normally be identified rather easily by making three scatter plots, i.e. the active power, the rotor speed, and the pitch angle as a function of the wind speed. It would be good to see that the classification of the data represents what an analyst would manually identify on these plots.

Figure 10/Line 350 – The direct use of this graph should be clarified when wanting to go towards a (general) classification framework. How can abnormal operating conditions (e.g. deration, curtailments) be taken into account? Do the authors aim to only classify based on wind speed, or should additional post-processing still occur to filter abnormal conditions? How are wind speeds dealt with where there is overlap between two control regimes?

Line 367 – “Our method enables the definition of multiple operational states for wind turbines, whereby an optimization for failure analysis and prediction could be undertaken.”

This seems like an overstatement. The same control regimes are described in the IEC standard, and are thus not ‘defined’ within this paper, but by the control design of the manufacturer.

Line 369 – The fact that this pre-processing step might help the performance of “all techniques used for failure analysis” seems to not be a given. It would be good to briefly illustrate with an example that this is indeed the case, as the fact that different operational regimes exist for wind turbines is well known. The so-called boundary speeds v1, v2, and vnom are quite easily identifiable by an analyst (see question 5). This raises the question if their identification was indeed a bottleneck for the proposed analysis strategy, or if the benefit is simply not considerable. In any case, automated classification of control regimes is still interesting, but the paper clearly aims at this specific use case, which should therefore be illustrated.

Regarding the "implications for failure detection" in the title of this paper: I agree that if the classification could be shown to have an added value for the failure detection, this would be a novelty, but currently there is nothing in the paper that actually decisively shows that this classification improves failure detection. Without any proof that it actually helps the failure detection step, this sounds more like speculation than an implication. Can you provide proof, e.g. an example or reference, where the proposed methodology is used to augment the failure detection step?
Citation: https://doi.org/10.5194/wes-2021-107-RC2
- AC2:
  'Reply on RC2', Henrik M. Bette, 20 Jan 2022
  We thank the referee for their detailed comments. We are sure they helped increase the quality of our manuscript. In general, the proposed methodology in itself can be used to identify operational states without prior knowledge of the control system. It is true that the benefit for failure detection is not proven in the current paper, but only implied. We show that the different correlation structures can be detected automatically by clustering. This means that they are clearly distinct. We are convinced that this implies benefits for failure detection by separating the space of what is normal into three smaller subspaces, from which deviations could be detected, especially if the method used to analyze failure directly depends on the correlation structure. A test and study of this benefit will be part of future research. We aim with this paper to encourage others to do this as well. We think that it is a good idea to state this more clearly in the text of the paper and to change “implications” to “possible implications” in the title.
  
  We also extended the paragraph discussing implied benefits for failure detection in the introduction. We included citations from Tveten (2019) showing the strong influence of changes in correlation on PCA and Zimroz et al. (2014) showing that accounting for non-stationary operating conditions improves failure detection results.
  We are especially grateful for the specific comments, which we will address separately in the same order they were made by the referee.
  As correctly pointed out, the pitch angle is an interesting control variable. In our data set, however, it contains many missing values. This reduces the number of usable epochs from 749 to 171. As the turbine type looked at in this study aims to keep either rotation or power (or both) constant, it is not necessary to look at the pitch angle to distinguish these regimes. Be that as it may, we understand that the pitch angle is a very prominent variable when looking at wind turbines, and its absence, rightfully pointed out by the referee, could confuse many readers. We have therefore decided to apply a method to fill the missing values in the pitch angle time series and include the basic clustering analysis with pitch angle in the manuscript as well. This can be found in a newly added section 4.2.
  
  The use of directly coupled observables is thought to be beneficial for understanding the entire correlation structure. As in many complex systems, groups of correlated observables occur. Our analysis shows that the states found differ in the inter-correlations between groups, not in the intra-correlations inside the groups. Further studies could investigate a minimal set of observables needed for this distinction (starting from our observables and removing groups, the set of “Active Power”, “Rotor RPM” and “Wind Speed” could be considered) as well as larger sets of observables. Of course, this could also change the number of clusters, which needs to be considered.
  
  When using the proposed method as a pre-processing step for failure detection, it will have to be fitted on the observables needed for that specific failure detection method.
  
  The two observables do not decouple from each other, but rather “from the others”. This means that these two stay closely correlated as is mechanically reasonable, but are no longer correlated to the other observables. We refined this wording in the revised manuscript.
  
  We still think that finite control time may in some cases play a role, but we agree that the 30-minute window is the dominant cause here. We will emphasize this point in the revised manuscript.
  
  As pointed out in the answer to specific comment one, the use of pitch angles is problematic with our data. Some misclassified points are always to be expected when using stochastic clustering methods, as can also be seen in the newly added section 4.2, where pitch angle is included in the analysis. In our data set we did not find a better set of observables to consider; however, we share the referee’s hope that with larger data sets (time- and observable-wise) the method could be improved further in the future.
  
  The evolution of states over time is shown once to make clear that – in contrast to some other complex systems such as finance – the dependence does not seem to be on time and no states die out or emerge.
  
  While we agree with the referee that scatter plots would be interesting for comparison, we cannot show them due to our cooperation agreement with the data provider. In response to the referee’s comment we specifically asked whether we could show them in this case, but we are not allowed to. We have, however, included the explanation for the difference in values in the revised manuscript:
  
  The power curve (active power as a function of wind speed) as given by the manufacturer is one line. Accordingly, there is exactly one value ṽnom which marks the starting point for nominal power production. In reality, especially when looking at high-frequency data, there will always be an area around this line which is realized. The value ṽnom lies in the middle of this smeared-out power curve. At this wind speed nominal power output can be reached but is not yet constant. With even higher wind speeds, it becomes less and less likely that the actual power produced lies beneath the nominal value. Only when this probability nearly vanishes does a change in correlation structure become detectable by our method. It is therefore reasonable that our value vnom lies higher than ṽnom. While our value is therefore well suited to distinguish correlation states, it cannot be directly compared with the nominal wind speed given by the manufacturer of a turbine. We have confirmed this by looking at scatter plots of our data but cannot show them in this paper due to data confidentiality.
  
  We thank the referee again for pointing out that it was missing.
  
  We are very grateful for this comment. The referee is certainly correct that a few sentences about this in the paper are in order. We will add them in the revised manuscript. The general idea is to classify based on wind speed, as this was shown to be the dominant factor dividing the three states. The overlap between two states could be used to calculate a certainty of the analysis made. If one is looking to minimize false alarms, for example, one could also apply any method to both states and then consider the one indicating less failure. Abnormal operating conditions like curtailment need to be considered and handled in additional post-processing when applying this in practice. These things must be considered and tested when actually using our method as a pre-processing step for any analysis. As we have said, we will add this in the revised manuscript.
  
  We agree that they are not “defined” by our method, but are, of course, introduced by the manufacturer. We will change this wording to “enables the distinction of multiple operational states” to more clearly point out that we can find these states defined by the manufacturer without prior knowledge.
  
  We are unaware of publications that use automatic separation into operational/correlation states before the application of the actual failure analysis process. This is also because high-frequency data is not always available. The distinction into operational states is only sensible if the interval that can be looked at is not larger than the usual time it takes for wind conditions to change. The improvement for “all techniques” is just conjecture, which led to the usage of the word “might”. We want to raise awareness of the non-stationarity in the correlation structure and the possibility to use this for pre-processing into separated normal states. We want to encourage researchers and analysts to try this out and will do so ourselves in future work. An example to underline our conjecture is a neural network: We do not really know what it does internally, but we do know that it needs to account as best it can for all mechanisms in the system. With our proposed pre-processing the neural network does not need to learn to distinguish different control regimes, as this is already done. We conjecture that this could increase accuracy or reduce training time, or the size of the training data set needed.
  
  For methods directly dependent on the correlations, such as PCA, the implication is clearer: As the principal components are the eigenvectors of the correlation matrix, a changing correlation matrix will also change the principal components and thereby the results of an analysis depending on them. The referee is correct that for now this is still only an implication. Detailed analysis of the possible benefit will be part of future work.
  
  We thank the referee for recognizing that automated detection is interesting in any case.
  
  We have not proven or quantified the aid our method could provide for failure analysis in this paper and aim to do this in the future. No reference using the proposed methodology can be given yet as it is new. We hope to encourage researchers to consider this pre-processing in the future and analyze its benefit.
  
  We think that the word “implications” already describes this, in contrast to e.g. “consequences”, but agree that it needs to be pointed out more prominently. We will do so in the text of the manuscript where necessary and propose to change the title to “Non-stationarity in correlation matrices for wind turbine SCADA-data and possible implications for failure detection”.
  
  We hope that our answers are satisfactory and want to thank the referee once more for many helpful suggestions.
  
  Citation: https://doi.org/10.5194/wes-2021-107-AC2
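The PCA argument in the reply above (principal components are eigenvectors of the correlation matrix, so a changed correlation structure changes the components) can be made concrete with a small numerical sketch. The two 3×3 matrices below are invented for illustration: one with all observables coupled, and one where a correlated pair decouples from the third observable, loosely mirroring the behaviour described for RotorRPM and GeneratorRPM. They are not values from the paper.

```python
import numpy as np


def leading_component(corr):
    """Largest eigenvalue and its eigenvector of a symmetric matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # ascending eigenvalues
    return eigenvalues[-1], eigenvectors[:, -1]


# Hypothetical state A: all three observables strongly correlated.
corr_coupled = np.array([
    [1.0, 0.9, 0.9],
    [0.9, 1.0, 0.9],
    [0.9, 0.9, 1.0],
])

# Hypothetical state B: the first two observables stay correlated with
# each other but are no longer correlated with the third.
corr_decoupled = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

val_a, vec_a = leading_component(corr_coupled)
val_b, vec_b = leading_component(corr_decoupled)

# Absolute overlap of the two first principal directions (1.0 = identical).
overlap = abs(vec_a @ vec_b)
print(val_a, val_b, overlap)
```

In this toy case the leading eigenvalue drops from 2.8 to 1.9 and the overlap between the two first principal directions is 2/√6 ≈ 0.82. A PCA fitted on one state would therefore project data from the other state onto a mismatched direction. How much this matters for real turbine data is the open question the authors defer to future work.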

Status: closed

RC1:
'Comment on wes-2021-107', Anonymous Referee #1, 31 Oct 2021
In this paper, the authors have identified different normal operational states of wind turbines using SCADA data. They have used Pearson correlation matrix data and k-means clustering algorithm for identifying operational states without prior knowledge of the control system. It has been shown that the primary dependence of recognized states is on wind speed. The model is well-structured, and the results sound promising. The writing quality of the paper is also good. The reviewer just has one comment that would like the authors to explain:

The proposed method has been utilized for normal operational state recognition based on different wind speed intervals. The authors mention in several places throughout the paper, as well as in the title, that their methodology can be applied for failure detection. However, it is unclear how the probable failures can be detected using the proposed method based on wind speed. Please clarify this issue.
Citation: https://doi.org/10.5194/wes-2021-107-RC1
- AC1: 'Reply on RC1', Henrik M. Bette, 20 Jan 2022
  
  We are very grateful for the concise summary and the referees’ valuable comment.
  We recognize that the point brought up by the referee can lead to misunderstandings. Our method cannot be used directly for failure analysis. It allows the automated distinction between different operational states without prior knowledge about the control mechanisms of the turbine. We are convinced that this distinction will aid other failure detection methods, especially those directly dependent on the correlation structure, by separating the space of what is normal into three smaller subspaces, from which deviations could be detected. This is, however, just an implication and not shown in the current paper.
  
  Citation: https://doi.org/10.5194/wes-2021-107-AC1
RC2:
'Comment on wes-2021-107', Anonymous Referee #2, 23 Dec 2021
General comments

The authors introduce an automated methodology to classify SCADA data in different operational regimes by defining wind speeds boundaries based on the observed correlations between variables. While the proposed methodology seems interesting, the benefit of using this method seems to be considered as a given. The paper also contains some statements that should be further investigated and clarified. These are discussed as specific comments.

Specific comments

Line 100 – The investigated set of variables is strange if the main goal of the paper is to identify the different operational regimes of the wind turbine. For example, a set of directly coupled variables (rotor and generator speed/line current and active power) are chosen, whereas the pitch angle of the blades is not. The latter is the most significant control variable of the turbine above nominal wind speeds, yet is not considered in the analysis.

Line 263 – “In cluster 2 this changes and the observables RotorRPM and GeneratorRPM decouple from the others.” This should be investigated, as a gearbox with a fixed transmission ratio should be the only component between the two tachometers measuring the rotational speeds. It therefore does not make sense from a mechanical point of view that these two variables can decouple from one another.

Line 287 – “the response time of the turbine controller is finite.” While this is true and while this can play a role during some events, the fact that 30 minute non-overlapping windows are used, surely plays a much larger role than this. Especially since windows are analyzed where no alarms or downtime was present.

Figure 5 - The fact that no pitch angles are considered for the classification of the operational regimes, seems to result in a significant amount of misclassified points given no post-processing (section 5 is done). Is this resolved by extending the set of considered variables?

Line 333 – The fact that the estimated rated wind speed seems to be 11.8% higher than the warranted one by the manufacturer, should be justified. This deviation is quite significant, and it does raise the question if these two wind speeds are directly comparable, or if the method intrinsically overestimates the threshold for classification. While the evolution of states over time is shown in multiple figures in the paper, these figures do not inherently convey any information without knowing the wind speed at a given time. In fact, the different control regimes of a wind turbine can normally rather easily be identified by making a three scatter plots, i.e. the active power, the rotor speed, and the pitch angle as a function of the wind speed. It would be good to see that the classification of the data represents what an analyst would manually identify on these plots.

Figure 10/Line 350 – The direct use of this graph should be clarified when wanting to go towards a (general) classification framework. How can abnormal operating conditions (e.g. deration, curtailments) be taken into account? Do the authors aim to only classify based on wind speed, or should additional post-processing still occur to filter abnormal conditions? How are wind speeds dealt with where there is overlap between two control regimes?

Line 367 - Our method enables the definition of multiple operational states for wind turbines, whereby an optimization for failure analysis and prediction could be undertaken.

This seems like an overstatement. The same control regimes are described in the IEC standard, and are thus not ‘defined’ within this paper, but by the control design of the manufacturer.

Line 369 – The fact that this pre-processing step might help the performance of “all techniques used for failure analysis” seems to not be a given. It would be good to briefly illustrate with an example that this is indeed the case, as the fact that different operational regimes exist for wind turbines is well known. The so-called boundary speeds v1, v2, and vnom are quite easily identifiable by an analyst (see question 5). This raises the question if their identification was indeed a bottleneck for the proposed analysis strategy, or if the benefit is simply not considerable. In any case, automated classification of control regimes is still interesting, but the paper clearly aims at this specific use case, which should therefore be illustrated.

Regarding the "implications for failure detection" in the title of this paper: I agree that if the classification could be shown to have an added value for the failure detection, that this would be a novelty, but currently there is nothing in the paper that actually decisively shows that this classification improves failure detection. Without any proof that it actually helps the failure detection step, this sounds more like speculation than an implication. Can you provide proof, eg an example or reference, where the proposed methodology is used to augment the failure detection step?
Citation: https://doi.org/10.5194/wes-2021-107-RC2
- AC2:
  'Reply on RC2', Henrik M. Bette, 20 Jan 2022
  We thank the referee for his/her detailed comments. We are sure they helped increase the quality of our manuscript. In general, the proposed methodology in itself can be used to identify operational states without prior knowledge of the control system. It is true that the benefit for failure detection is not proven in the current paper, but only implied. We show that the different correlation structures can be detected automatically by clustering. This means that they are quite distinct and different. We are convinced that this implies benefits for failure detection by separating the space of what is normal into three smaller subspaces, from which deviations could be detected, especially if the method used to analyze failure directly depends on the correlation structure. A test and study of this benefit will be part of future research. We aim with this paper to encourage others to do this as well. We think that it is a good idea to more clearly state this in the text of the paper and change “implications” to “possible implications” in the title.
  
  We also extended the paragraph discussing implied benefits for failure detection in the introduction. We included citations from Tveten (2019) showing the strong influence of changes in correlation on PCA and Zimroz et al. (2014) showing that accounting for non-stationary operating conditions improves failure detection results.
  We are especially grateful for the specific comments, which we will address separately in the same order they were made by the referee.
  As correctly pointed out, the pitch angle is an interesting control variable. In our data set, however, it contains many missing values. This reduces the number of usable epochs from 749 to 171. As the turbine type looked at in this study aims to either keep rotation and/or power constant, it is not necessary to look at the pitch angle to distinguish these regimes. Be that as it may, we understand that the pitch angle is a very prominent variable when looking at wind turbines and this point rightfully brought up by the referee could irritate many readers. We have therefore decided to apply a method to fill the missing values in the pitch angle time series and include the basic clustering analysis with pitch angle in the manuscript as well. This can be found in a newly added section 4.2.
  
  The use of directly coupled observables is thought to be beneficial for understanding the entire correlation structure. As in many complex systems groups of correlated observables occur. Our analysis shows that the states found are different in the inter-correlation between groups not in the intra-correlations inside the groups. Further studies could investigate a minimal set of observables needed for this distinction (starting from our observables and removing groups the set of “Active Power”, “Rotor RPM” and “Wind Speed” could be considered.) as well as larger sets of observables. Of course, this could also change the number of clusters, which need to be considered.
  
  When using the proposed method as a pre-processing for failure detection, it will have to be fitted on the observables needed for that specific failure detection method.
  
  The two observables do not decouple from each other, but rather “from the others”. This means that these two stay closely correlated as is mechanically reasonable, but are no longer correlated to the other observables. We refined this wording in the revised manuscript.
  
  We still think that finite control time may in some cases play a role, but we agree that the 30 minute window is the dominant cause here. We will emphasize this point in the revised manuscript.
  
  As pointed out in the answer to specific comment one, the use of pitch angles is problematic with our data. Some misclassified points are to be expected always when using stochastic clustering methods as can also be seen in the newly added section 4.2 where pitch angle is included in the analysis. In our data set we did not find a better set of observables to consider, however we share the referees hope that with larger data sets (time- and observable-wise) the method could be improved further in the future.
  
  The evolution of states over time is shown once to clearly show that – in contrast to some other complex systems such as finance – the dependence does not seem to be time and no states die out or emerge.
  
  While we agree with the the referee that scatter plots would be interesting for comparison, we cannot show them due to our cooperation agreement with the data provider. In response to the referees' comment we specifically asked, if we could show them in this case, but we are not allowed to. We have, however, included the explanation for the difference in values in the revised manuscript:
  
  The power curve (active power in dependency of wind speed) as given by the manufacturer is one line. Accordingly, there is exactly one value ˜vnom which marks the starting point for nominal power production. In reality, especially when looking at high-frequency data, there will always be an area around this line which is realized. The value ˜vnom lies in the middle of this smeared out power curve. At this wind speed nominal power output can be reached but is not yet constant. With even higher wind speeds, it becomes less and less likely that the actual power produced lies beneath the nominal value. Only when this probability nearly vanishes, a change in correlation structure is detectable by our method. It is therefore reasonable that our value vnom lies higher than ˜vnom. While our value is therefore well suited to distinguish correlation states, it cannot be directly compared with the nominal wind speed given by the manufacturer of a turbine. We have confirmed this by looking at scatter plots of our data but cannot show them in this paper due to data confidentiality.
  
  We thank the referee again for pointing out that it was missing.
  
  We are very grateful for this comment. The referee is certainly correct that a few sentences about this in the paper are in order, and we will add them in the revised manuscript. The general idea is to classify based on wind speed, as this was shown to be the dominant factor dividing the three states. The overlap between two states could be used to calculate a certainty of the analysis made. If one is looking to minimize false alarms, for example, one could also apply any method to both states and then consider the one indicating less failure. Abnormal operating conditions such as curtailment need to be considered and handled in additional post-processing when applying this in practice. These points must be considered and tested when actually using our method as a pre-processing step for any analysis.
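
  The classification idea described above can be sketched as follows. This is only an illustrative sketch, not the procedure from the manuscript: the class `StateBoundaries`, the functions `candidate_states` and `conservative_alarm`, the boundary values, the overlap width, and the anomaly scores are all hypothetical and chosen purely for demonstration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StateBoundaries:
    """Hypothetical boundary wind speeds (m/s) separating three states."""
    low_mid: float   # boundary between state 1 and state 2
    mid_high: float  # boundary between state 2 and state 3
    overlap: float   # half-width of the ambiguous band around each boundary


def candidate_states(wind_speed: float, b: StateBoundaries) -> List[int]:
    """Return every state compatible with the given wind speed.

    Inside an overlap band two neighbouring states are returned,
    which signals reduced certainty of the classification.
    """
    states = []
    if wind_speed < b.low_mid + b.overlap:
        states.append(1)
    if b.low_mid - b.overlap <= wind_speed < b.mid_high + b.overlap:
        states.append(2)
    if wind_speed >= b.mid_high - b.overlap:
        states.append(3)
    return states


def conservative_alarm(
    wind_speed: float,
    b: StateBoundaries,
    anomaly_score: Callable[[int], float],
    threshold: float,
) -> bool:
    """Raise an alarm only if every candidate state indicates failure.

    `anomaly_score(state)` stands in for any failure-analysis method
    evaluated against the reference of the given state (assumption).
    """
    scores = [anomaly_score(s) for s in candidate_states(wind_speed, b)]
    return min(scores) > threshold


# Illustrative values only; they are not taken from the paper.
bounds = StateBoundaries(low_mid=6.0, mid_high=12.0, overlap=0.5)
scores = {1: 0.9, 2: 0.2, 3: 0.7}
print(candidate_states(6.2, bounds))
print(conservative_alarm(6.2, bounds, scores.__getitem__, threshold=0.5))
```

  Taking the minimum score over the candidate states corresponds to "considering the one indicating less failure"; curtailment and other abnormal operating conditions would still need separate handling.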
  
  We agree that they are not “defined” by our method, but are, of course, introduced by the manufacturer. We will change this wording to “enables the distinction of multiple operational states” to more clearly point out that we can find these states defined by the manufacturer without prior knowledge.
  
  We are unaware of publications that use automatic separation into operational/correlation states before the application of the actual failure analysis process. This is also because high-frequency data is not always available. The distinction into operational states is only sensible if the interval that can be looked at is not larger than the usual time it takes for wind conditions to change. The improvement for “all techniques” is just conjecture, which led to the usage of the word “might”. We want to raise awareness of the non-stationarity in the correlation structure and the possibility to use this for pre-processing into separated normal states. We want to encourage researchers and analysts to try this out and will do so ourselves in future work. An example to underline our conjecture is a neural network: we do not really know what it does internally, but we do know that it needs to account as best it can for all mechanisms in the system. With our proposed pre-processing the neural network does not need to learn to distinguish different control regimes, as this is already done. We conjecture that this could increase accuracy or reduce the training time or the size of the training data set needed.
  
  For methods directly dependent on the correlations, such as PCA, the implication is clearer: as the principal components are the eigenvectors of the correlation matrix, a changing correlation matrix will also change the principal components and thereby the results of any analysis depending on them. The referee is correct that, for now, this is still only an implication. Detailed analysis of the possible benefit will be part of future work.
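
  The PCA argument can be illustrated with a small numerical sketch. The two 3×3 correlation matrices below are invented for illustration and do not come from the SCADA data; the helper `principal_components` is likewise a hypothetical name. The sketch only shows that two different correlation structures yield different leading eigenvectors.

```python
import numpy as np


def principal_components(corr: np.ndarray):
    """Eigen-decompose a symmetric correlation matrix.

    Returns eigenvalues in descending order and the matching
    eigenvectors as columns (the principal components).
    """
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# Two made-up correlation matrices standing in for two operational states.
corr_a = np.array([[1.0, 0.8, 0.1],
                   [0.8, 1.0, 0.1],
                   [0.1, 0.1, 1.0]])
corr_b = np.array([[1.0, 0.1, 0.8],
                   [0.1, 1.0, 0.1],
                   [0.8, 0.1, 1.0]])

_, vecs_a = principal_components(corr_a)
_, vecs_b = principal_components(corr_b)

# Absolute cosine between the leading components; 1.0 would mean identical
# directions (the sign of an eigenvector is arbitrary, hence abs()).
overlap = abs(vecs_a[:, 0] @ vecs_b[:, 0])
print(f"overlap of leading components: {overlap:.3f}")
```

  An overlap clearly below one means that data projected onto the leading component of one state would be described poorly by the leading component of the other, which is why a PCA reference computed across mixed states may be misleading.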
  
  We thank the referee for recognizing that automated detection is interesting in any case.
  
  We have not proven or quantified the aid our method could provide for failure analysis in this paper and aim to do this in the future. No reference using the proposed methodology can be given yet as it is new. We hope to encourage researchers to consider this pre-processing in the future and analyze its benefit.
  
  We think that the word “implications” already describes this, in contrast to e.g. “consequences”, but agree that it needs to be pointed out more prominently. We will do so in the text of the manuscript where necessary and propose to change the title to “Non-stationarity in correlation matrices for wind turbine SCADA-data and possible implications for failure detection”.
  
  We hope that our answers are satisfactory and want to thank the referee once more for many helpful suggestions.
  
  Citation: https://doi.org/10.5194/wes-2021-107-AC2

Viewed

Total article views: 1,090 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
752	292	46	1,090	44	64

Views and downloads (calculated since 01 Oct 2021)

Month	HTML	PDF	XML	Total
Oct 2021	95	24	4	123
Nov 2021	56	11	3	70
Dec 2021	60	8	0	68
Jan 2022	69	15	7	91
Feb 2022	38	11	0	49
Mar 2022	47	9	0	56
Apr 2022	28	9	0	37
May 2022	26	7	1	34
Jun 2022	11	3	1	15
Jul 2022	7	4	0	11
Aug 2022	16	5	0	21
Sep 2022	7	4	0	11
Oct 2022	6	3	1	10
Nov 2022	14	7	1	22
Dec 2022	10	3	0	13
Jan 2023	14	18	1	33
Feb 2023	7	3	0	10
Mar 2023	10	4	2	16
Apr 2023	7	14	0	21
May 2023	3	5	1	9
Jun 2023	6	11	0	17
Jul 2023	4	3	2	9
Aug 2023	14	9	1	24
Sep 2023	13	6	1	20
Oct 2023	4	5	0	9
Nov 2023	5	0	5
Dec 2023	8	8	0	16
Jan 2024	9	2	0	11
Feb 2024	8	9	1	18
Mar 2024	10	12	0	22
Apr 2024	11	1	2	14
May 2024	12	1	1	14
Jun 2024	5	7	1	13
Jul 2024	7	3	4	14
Aug 2024	13	3	1	17
Sep 2024	4	2	0	6
Oct 2024	5	4	0	9
Nov 2024	3	3	1	7
Dec 2024	7	1	0	8
Jan 2025	7	2	3	12
Feb 2025	9	3	2	14
Mar 2025	11	3	0	14
Apr 2025	18	10	1	29
May 2025	10	6	2	18
Jun 2025	16	11	1	28
Jul 2025	2	0	2

Viewed (geographical distribution)

Total article views: 1,077 (including HTML, PDF, and XML) Thereof 1,077 with geography defined and 0 with unknown origin.

Cited

Latest update: 05 Jul 2025

Short summary

We analyse the non-stationarity in Pearson correlation matrices for high-frequency wind turbine data. Applying a hierarchical k-means clustering to a time series of matrices, we distinguish different states, which exhibit distinct correlation structures. These arise from the turbine control system reacting to the current wind speed. We model boundary wind speeds between the different states. Our method enables accounting for the non-stationarity when predicting or analysing turbine failures.

Cited

2 citations as recorded by crossref.