Identification of wind turbine main shaft torsional loads from SCADA measurements using an inverse problem approach

To assess the structural health and remaining useful life of a wind turbine within wind farms one would require site-specific dynamic quantities such as structural response and modal parameters. In this regard, a novel inverse problembased methodology is proposed here to identify the dynamic quantities of the drive train main shaft, i.e., torsional displacement and coupled stiffness. As a model-based approach, an inverse problem of a mathematical model concerning the coupled shaft torsional dynamics with SCADA measurements as input is solved. It involves Tikhonov regularisation to smoothen the 5 measurement noise and irregularities on the shaft torsional displacement obtained from measured rotor and generator speed. Subsequently, the regularised torsional displacement along with necessary SCADA measurements is used as an input for the mathematical model and a model-based system identification method called the collage method is employed to estimate the coupled torsional stiffness. It is also demonstrated that the estimated shaft torsional displacement and coupled stiffness can be used to identify the site-specific main shaft torsional loads. It is shown that the torsional loads estimated by the proposed 10 methodology is in good agreement with the aeroelastic simulations of the Vestas V52 wind turbine. Upon successful verification, the proposed methodology is applied to the V52 turbine SCADA measurements to identify the site-specific main shaft torsional loads and damage equivalent load. Since the proposed methodology does not require a design basis or additional measurement sensors, it can be directly applied to wind turbines within a wind farm irrespective of their age.

methodology is in good agreement with the aeroelastic simulations of the Vestas V52 wind turbine. Upon successful verification, the proposed methodology is applied to the V52 turbine SCADA measurements to identify the site-specific main shaft torsional loads and damage equivalent load. Since the proposed methodology does not require a design basis or additional measurement sensors, it can be directly applied to wind turbines within a wind farm irrespective of their age.

15
Monitoring of wind turbines within wind farms is increasingly becoming very important due to the need to detect specific turbines that show anomalous behavior, plan inspections or preventive maintenance, and to compute the remaining useful life of specific structures. However, addition of new instrumentation to existing turbines, such as installation of strain gauges, accelerometers can be costly and also require repetitive calibration and synchronization of their measurement signals with the turbine computer. Most wind farm operators also do not possess the aeroelastic design parameters of their wind turbines and 20 hence cannot simulate the mechanical loads acting on their wind turbine. Monitoring of turbine primary structures through existing Supervisory Control and Data Acquisition (SCADA) system based measurements can allow cost effectiveness and provide valuable information to the wind farm operator. Usually such monitoring through SCADA only provides information on power performance and not regarding turbine structural integrity. Here we develop mathematical models for the main shaft of a wind turbine that can determine both the coupled torsional stiffness of the main shaft and its torsional displacement in continuous time-domain, using existing time series measurements of the rotor speed and generator speed. This is a novel methodology that can potentially benefit wind farm owners, since both the property of the structure in terms of its stiffness, and the structural response can be determined without requiring additional sensors or information from the wind turbine manufacturer.
There are many studies available in the literature on system identification of wind turbines (Koukoura et al., 2015;Pahn et al., 30 2017;Norén-Cosgriff and Kaynia, 2021). All of the studies employed output-based operational modal analysis (OMA) (Wang et al., 2016) technique on the measured structural response to estimate the modal parameters of the structure. OMA methods can be broadly classified into two categories: (i) Time domain-based methods, and (ii) Frequency domain-based methods (Zhang et al., 2010). Time-domain based OMA methods are based on the calculation of auto and cross-correlation functions of the response time histories. Since they possess similar properties as frequency response functions, modal parameters can 35 be extracted from those correlation functions. Several algorithms are available to perform this task and few such methods are natural excitation technique (NExT) (James et al., 1995), random decrement technique (RDT) (Ibrahim, 1977), auto-regression moving average vector model (ARMAV) (Andersen, 1997) and stochastic subspace identification (SSI) method (Van Overschee and De Moor, 1993). Frequency-domain based OMA methods are peak picking method, enhanced frequency domain decomposition method(FDD) (Brincker et al., 2001) and frequency and spatial domain decomposition method (FSDD) (Wang et al., 40 2005). A recent comprehensive review on various time domain and frequency domain-based OMA methods can be found in Zahid et al. (2020). However, to the best of the authors' knowledge, there is no such study available on estimating the structural response of a wind turbine component from SCADA measurements. Thus, a novel model-based approach is proposed here for the wind turbine drive train main shaft to estimate both the structural response and the modal parameters simultaneously.
A mathematical model concerning the shaft torsional dynamics will be utilized to obtain the system response from SCADA 45 measurements. In this context, the concerned mathematical model comprised of differential equations will be solved for the shaft torsional velocity. Subsequently, the main shaft torsional displacement is obtained by numerically integrating the shaft torsional velocity. However, these time integration schemes are based on time-marching algorithms, the lack of initial conditions make the displacement reconstruction an ill-posed problem (Hong et al., 2008). Further, these time marching algorithms are sensitive to measurement noise and they even get amplified during the time marching procedure which results in inadmissible 50 errors in the reconstructed displacement. Also, this inaccurate displacement leads to drastic errors in the system modal parameter estimation. Hence, one needs to go for regularisation techniques to smoothen the reconstructed displacement. Hansen (2005) discussed the nature of various ill-posed problems and presented a number of solution methodologies. Though there are many regularisation techniques available, Tikhonov, Truncated singular value decomposition (SVD), and nuclear norm are a few of the popular techniques (Aarden, 2017). Among all these regularisation techniques, Tikhonov regularisation (Tikhonov,55 1963) has been widely used in many engineering applications (Ronasi et al., 2011;Hào and Quyen, 2012;Bangji et al., 2017;Nieminen et al., 2011) and also it has been studied extensively in the field of inverse problems (Hansen, 2005). Hence, the same has been employed in the present work. Tikhonov regularization minimizes the error using the least-square criterion and by means of a numerical damping, it also minimizes the effect of the measurement noise. Together with regularised shaft torsional displacement, the same mathematical model is utilized to obtain the shaft stiffness.
For this purpose, a model-based system identification technique called the collage method is used in the present study (Kunze and Vrscay, 1999;Groetsch, 1993). The model-based collage method have been successfully applied for system identification in various differential equations based problems such as boundary value problems (Kunze et al., 2009), reaction-diffusion problems (Deng et al., 2008) and elliptic problems (Kunze and La Torre, 2016). The collage method is used to convert the inverse problem of system identification into a minimization problem of a function of several variables (for example unknown 65 system parameters) and then the corresponding minimization problem is solved using a suitable algorithm. The minimization procedure is referred to as Collage coding and it is a greedy algorithm that seeks to construct an approximate solution to a target solution in one go. Hence, unlike other inverse problem methods, one need not solve the forward problem in an iterative manner (Deng and Liao, 2009).
One of the key benefits of estimating the shaft torsional stiffness and displacement is that it can be used to identify the site-70 specific shaft torsional load and the remaining useful life (RUL) (Ziegler et al., 2018) of the main shaft. Also, the main shaft torsional load can significantly affect the fatigue performance of other drive train components such as gearbox and planetary bearings (Dong et al., 2012;Gallego-Calderon and Natarajan, 2015;Ding et al., 2018). Hence the same site-specific torsional load can also be used to quantify the RUL of gearbox and other drive train components as well. Also, several older turbines possess only SCADA measurements by default and they lack measurement sensors, the proposed methodology can be used for 75 estimation of the site-specific loads of them.
The rest of the paper is organised as follows: the problem formulation consisting of the Tikhonov regularisation and the collage method is given in section 2; section 3 presents the verification of the proposed formulation; application of the proposed formulation on measurements are presented in section 4.

80
As mentioned in the previous section, the main objective is to identify the shaft torsional displacement and coupled stiffness from SCADA measurements. This is achieved by solving the shaft torsional dynamical equations using a suitable inverse problem algorithm and the estimated shaft torsional displacement θ and torsional stiffness K will be utilized for the shaft torsional load estimation. For this purpose, a two-mass model (refer Fig. 1) (Boukhezzar et al., 2007;Berglind et al., 2015) which governs the main shaft torsional dynamics subjected to the rotor and generator torques T r and T g , respectively, is 85 considered, and the mathematical model is given by Eqs. (1-3). It is assumed that the gearbox is perfectly stiff while transferring deformations on the main shaft and the gear ratio, N is only considered as a parameter to proportionally adjust the force and torsional displacement between the high-speed and main shafts. The main shaft is modelled by an inertia free viscously damped torsional spring. Further, the edgewise flexibility of the blade and the torsional stiffness of the main shaft is assumed to be Here, J r represents the inertia of the rotor, J g represents the collective inertias of the high-speed shaft, the gearbox, and the generator, ω r and ω g are the rotor and generator speeds, respectively, C is the shaft damping coefficient and θ is the shaft 95 torsional displacement.
For normal operation of the wind turbines, the shaft torsional dynamics is mainly influenced by the low frequency modes such as blade edgewise and shaft free-free modes and the high frequency gearbox dynamics do not play a significant role in it.
Hence, the two-mass model is sufficient enough to model the shaft torsional dynamics for the wind turbine normal operations as it includes both the low frequency modes. Also, given the system parameters and rotor and generator torques, the two-mass 100 model is capable of predicting the shaft torsional displacement (θ) as close as that of the full-fledged aeroelastic simulation as shown in Fig. 2. Here, the aeroelastic simulation is performed in the DTU in-house tool called HAWC2 (Larsen and Hansen, 2007) and the results are obtained for the Vestas V52 (Vestas) wind turbine at a mean wind speed of 8 m/s.
In forward problem approach, given the modal parameters (J r , J g , C, K) and external torques (T r and T g ), Eqs (1-3) are solved for ω r , ω g and θ. But given only SCADA measurements, one has to solve Eqs. (1-3) inversely for θ and modal param-105 eters. In general, the available SCADA measurements are ω r , ω g , P, β, U . Here, P, β, U are, respectively, the generator power, blade pitch angle and wind velocity.

Collage method
Given ω r and ω g , it is straightforward to use Eq.
(3) to obtainθ and then θ. The next step is to estimate the modal parameters that are required for the load calculation. The collage method (Kunze and Vrscay, 1999;Groetsch, 1993) is used for the same in the present study since it is a model-based approach. For a given initial value problem (IVP), that admits a target solution x(t), the associated Picard integral operator T is given by, The assumptions regarding the parameter estimation problem using the collage method are listed as follow (Deng and Liao,115 2009): 1. x(t) ∈ [t 0 , t n ] is a bounded solution; where, t 0 and t n are positive constants satisfying t 0 < t n .
3. The exact solution x(t) of the system (4) exists uniquely.
It is important to note that the fixed pointū(t) of this Picard operator is the unique solution of the given IVP (Kunze 120 et al., 2004). Accordingly, the collage distance becomes, (x − T x) and then the optimal solution is the one which minimizes the squared collage distance i.e., L 2 collage distance. Also, unlike the conventional inverse problem which minimises the approximate error d(x −x), the collage method minimizes the collage distance d(x, T x) which is an useful change as one cannot findx for a general T (Kunze et al., 2004). Further, the optimality of the collage distance minimization is ensured as shown by Kunze et al. (2004). Minimising the L 2 collage distance using least square method yields a stationarity condition, parameters as the generator torque (T g ) can be readily obtained from SCADA data as, T g = P/ω g . Accordingly, the Picard 130 integral for Eq. (2) becomes, where, t is the time variable and ω g0 is the generator speed at time, t = 0. Modal parameters will be obtained by minimizing Eq. (6) using the least square method.

Application of the Collage method 135
To the best of the authors' knowledge, this collage method has not been used in the context of wind turbine system identification and hence it is important to check the applicability of this method for the same. This is done by estimating the modal parameters using the collage method for the following two wind turbines and verifying with its design values, (i) DTU 10-MW (Bak et al.) and ( Table. 1. Due to confidentiality policy, only the percentage error is given for the Vestas V52 turbine. As seen in the table, the estimated values match well with 145 the design values. If the torsional displacement (θ) is known, then the determination of torsional stiffness (K) from Eq. (6) is readily feasible as explained. However in practice, the shaft torsional displacement is unknown, and therefore the collage equations may not be directly used to determine the shaft stiffness.

Regularisation
As explained earlier, with ω r and ω g ,θ is obtained by using Eq.
(3) and then by using time integration schemes onθ, the 150 shaft torsional displacement (θ) is obtained. However, these time integration schemes are based on time-marching algorithms, they require initial condition on displacement which are usually unavailable or inaccurate in real situations. The effect of the lack of initial conditions on the reconstructed displacement obtained using the time integration scheme is shown in Fig. 3.
As seen in the figure, the numerical error is multiplicatively increased with time which results in a drift in the reconstructed displacement. As mentioned in the introduction, to minimize the numerical error due to the lack of initial conditions and to Implementation of Tikhonov regularization on the velocity to obtain the displacement is not readily available in the literature, the same is presented here for the sake of completeness. By definition, the velocityθ is expressed as, where,θ(t) is the velocity obtained from Eq. (3) which can be considered as a measured velocity. As explained earlier, the lack of initial conditions in addition to the measurement noise leads to erroneous displacement. In order to minimise the error, following minimisation problem has to be solved, Here,θ is the calculated velocity. By means of the trapezoidal rule, Eq. (8) is discretised as follows (Hong et al., 2008), where ∆t is the time interval of the discretization and L a is the diagonal weighing matrix of order (n + 1) as, Further, the calculated velocityθ is discretised by the central difference rule and written in matrix form as,

170
where, the central difference matrix L c of size (n + 1) × (n + 3) and the displacement vector θ of size (n + 3) are given as, Here, the time steps denoted by −1 and (n + 1) are fictitious nodes. Substitution of Eq. (11) into Eq. (9) leads to, where, L = L a L c . This minimisation problem is regularised for solution boundedness with a parameter λ, and given as, The above minimisation problem is known as the Tikhonov regularisation and λ is referred to as the regularisation parameter.
Minimising Eq. (14) as, yields the following quadratic equation in θ, where, I is the identity matrix of order (n + 3). Since ten-minute SCADA measurements with a sampling frequency of 50 Hz are considered for θ(t) estimation, n becomes 30000.
The choice of regularisation parameter (λ) plays a crucial role in getting an optimal fit for the solution. Based on the knowledge about measurement errors, Hansen (2005) proposed two classes for the estimation of λ:

185
methods based on knowledge of measurement errors methods that do not require details about measurement errors.
In the present scenario, the information regarding the measurement error is unknown, hence class two is used for the current study. In class two, there are three widely used methods ( of λ (Gao et al., 2016). Further, for larger problems, the quasi optimality method is computationally expensive than the Lcurve method. Owing to this fact, the L-curve method is used here for estimating λ. In L-curve method, the optimal λ is the one which gives the maximum curvature in the L-curve between norm of the regularized solution α(λ) = θ reg 2 and norm of the residual β(λ) = 1 2 Lθ − L aθ ∆t 2 and the curvature of the L-curve is given by (Nieminen et al., 2011), Substituting the optimal λ obtained by finding the maximum curvature of Eq.

200
Upon obtaining θ using the Tikhonov regularisation, Eq. (2) is then solved inversely with regularized θ using the collage method for the estimation of K and then the torsional load is obtained as M z = Kθ. This entire methodology is shown as a flow chart in Fig. 5.

Verification of the method on V52 turbine simulations
To verify the proposed methodology for the torsional load estimation described in section 2, aeroelastic simulations are per-205 formed for Vestas V52 turbine corresponding to the design load case (DLC 1.2) (Hansen et al., 2015) in HAWC2. The DLC 1.2 consists of 18 simulations (three yaw directions and six turbulent wind seeds ) for each mean wind speed and the mean wind speed is ranging from 4 m/s to 26 m/s in the interval of 2 m/s, which results in total 216 simulations. From these 216 simulations, the inputs (ω r , ω g , and T g ) for the proposed methods are obtained and by following the procedure depicted in Fig. 5, the torsional loads are obtained and the same is compared with the simulation results. From the simulated ω r and  ω g , the torsional velocityθ is obtained using Eq. (3) and then the corresponding θ for each simulation is obtained by using the Tikhonov regularisation. By assuming that the same level of noise presents in all the simulations of a particular mean wind speed, the optimal λ is estimated only once per mean wind speed. The estimated regularisation parameter (λ) for each mean wind speed is presented in Fig. 6. As seen in the figure, a higher value of λ at higher wind speeds indicates that more numerical damping is needed to suppress the numerical noise. Tikhonov regularisation. The regularized θ dyn for two representative mean wind speeds are compared with the dynamic component of simulated torsional displacement in Fig. 7. By removing the static displacement from the simulated displacement, the resulting dynamic component can be compared with the regularised θ dyn as shown in Fig. 7. As seen in the figure, the regu-220 larised θ dyn matches well with the simulated dynamic displacement for most of the time except for the peak amplitudes. After obtaining θ dyn , Eq.
(2) is solved inversely using the collage method for K for all the 216 simulations. Owing to uncertainty associated with θ dyn estimation along with the collage method, three skewed (outliers) distributions for K can be determined at different mean wind speeds. For the distributions with outliers, mode is the better estimate of the central tendency as it is least biased by the outliers (Hedges and Shah, 2003). Ideally, there exists only one torsional K value for the wind turbine main 225 shaft and hence by taking the mean of modes of the resulting pdfs some stability can be gained for the resultant estimate of K.
The relative error between the estimated K value obtained by taking the mean of modes and the design value of the turbine is 12.06 % and the values are not presented here due to confidentiality policy. At this point, it is important to remember that the estimated K value is a combination of the main shaft and blade edgewise stiffness.
In order to obtain the site-specific torsional load, it is important to estimate the static displacement (θ stat ), since the torsional 230 load is given by M z = Kθ = K(θ stat + θ dyn ). This is achieved by considering the static equilibrium of Eq. (1) as follows, where, η gen is the generator efficiency, which is 94.4 % for the V52 turbine and from Eq. (18), θ stat is obtained as, Hence, by using θ stat and θ dyn , the total torsional load is obtained as, At this stage, the estimated torsional loads can be compared with the simulated torsional loads. The comparison of the torsional loads for two representative mean wind speeds are shown in Fig. (8). In the figure, confidential values such as load Upon estimating the torsional load, the torsional damage equivalent load (DEL) at each mean wind speed is calculated using the following equation: where, T sim is the duration of the load case, N sim is the number of simulations at each mean wind speed, k n is the total number 245 of load cycles in a given time series, N i,k are the number of cycles at load amplitude S i,k (σ) determined with rain flow counting and m is the Wöhler exponent. The zero mean load amplitude is obtained as (Veldkamp, 2006), where, S m is the mean load and M is the mean stress sensitivity. The turbine shaft is made up of cast iron and Veldkamp (2006) has reported that for such material, the mean stress correction factor is M = 0.19. For the 1 Hz torsional DEL of a ten-minute time series, N ref becomes 600 cycles and m = 6 is used for the present computation. Using Eq. (21), the computed 250 torsional DEL for DLC 1.2 is compared with the simulated DEL and is shown in Fig. (9). For most of the means wind speeds, the estimated DEL is in good agreement with the simulated DEL and the absolute error between these two at each mean wind speed ranges from 4% to 12 %. Higher error at higher mean wind speeds is due to the fact that the peak amplitudes in θ dyn are  Table 2. Normal operation filter conditions. not captured well using the Tikhonov regularisation. This could be due to fact that the pitch control will be active beyond the rated wind speed and hence the instantaneous changes in the pitch angle affects the rotor speed. 255 4 Application on V52 turbine measurements Now, using the verified methodology, the torsional loads for the drive train main shaft are estimated from the SCADA measurements. For this purpose, SCADA measurements of the Vestas V52-850 kW research turbine installed at the DTU Risø site will be utilized. In particular, measurements taken for the period of January 2019 consisting of 4459 ten minute recorded cases are used for this study. Since the interest is on normal operations of the turbine, the measurement data is filtered based on the 260 conditions given in Table 2 that results in 627 cases.
It is important to note that based on the site sector assessment, the V52 turbine is in wake free condition between 280 • and 320 • wind directions. Further, speed sensors of the rotor measures the rotational speed with a low sample rate which results in a piecewise constant signal. This type of signal has to be converted as a differentiable signal which is achieved by increasing the sample rate followed by a spline interpolation.  Now, the filtered SCADA measurements are utilized to obtain the torsional loads using the proposed methodology. Using the measured rotor speed and generator speed, θ dyn is calculated using the Tikhonov regularisation method. The obtained θ dyn for two representative mean wind speeds are shown in Fig. 10.
Similar to the V52 simulations, the regularisation parameter is obtained for each wind speed measurements using the L-curve criterion and the obtained values not presented here for the sake of brevity. Subsequently, by applying the collage method on 270 Eq.(2) for all the 627 cases, the K values are estimated for each load case and then the resultant K value is obtained by taking the mean of modes of the resulting pdf as described in the previous section. The relative error between the estimated and design K values is 3.6 %. With the estimated K value, θ stat is calculated for each load case using Eq. (19) and then the torsional load is obtained from Eq. (20). The calculated torsional loads for two representative wind speeds are shown in Figs. (11a, 11b).
After estimating the torsional loads for all the 627 cases, the identified loads are grouped according to the mean wind speeds 275 range from 6 m/s to 22 m/s which are subdivided into 9 wind speed bins of 2 m/s width each. Subsequently, the 1 Hz torsional DEL is calculated for each mean wind speed using Eq. (21) and the same is shown in Fig. (12). It is important to note that the DEL given in Fig. 9 (Simulation) being the design load and the DEL given in Fig. 12 being the site-specific load, the difference between these two will give an estimate about the remaining torsional fatigue life of the main shaft under normal operating conditions. Decisions regarding RUL of the drivetrain may be taken considering the estimated main shaft torsional 280 design tools to predict the loading within the gearbox.

Summary
A novel inverse problem-based approach is developed for estimating the shaft torsional displacement and stiffness by using the SCADA measurements. A mathematical model describing the coupled shaft torsional dynamics is used for this purpose.

285
Numerical errors and the effect of measurement noise on the displacement reconstruction are minimized through the Tikhonov regularization technique. Subsequently, the collage method is used to estimate the main shaft coupled torsional stiffness. The estimated main shaft quantities are then used to identify the main shaft site-specific torsional load. The proposed formulation is successfully verified for the main shaft torsional loads with the aeroelastic simulation of the Vestas V52 turbine. Upon verification, the methodology is extended to identify the site-specific main shaft torsional loads of the same turbine by using 290 SCADA measurements. For this purpose, the measurement data from the DTU Risø site is utilized and the measurement data is filtered and calibrated for the turbine normal operation. Using the identified torsional loads, the torsional DEL is obtained.
Depending on the prior information about the stiffness value, one can either use the entire proposed methodology or follow the torsional displacement estimation part of the proposed methodology for the torsional load identification. Since the site-specific SCADA measurements are used in the analysis, the obtained loads can give a better estimate of the remaining fatigue life which 295 can help in the life extension decision-making process. Besides monitoring the estimated loads can help in inspection planning and scheduling maintenance activities. As the proposed methodology does not require any design basis, it can be used for any older turbines for the estimation of the main shaft torsional RUL.