This paper describes a method to improve and correct an engineering wind farm flow model by using operational data. Wind farm models are an approximation of reality and therefore often lack accuracy and suffer from unmodeled physical effects. It is shown here that, by surgically inserting error terms in the model equations and learning the associated parameters from operational data, the performance of a baseline model can be improved significantly. Compared to a purely data-driven approach, the resulting model encapsulates prior knowledge beyond that contained in the training data set, which has a number of advantages. To ensure wide applicability of the method (including to existing assets), learning is driven here purely by standard operational (SCADA) data. The proposed method is first demonstrated using a cluster of three scaled wind turbines operated in a boundary layer wind tunnel. Given that inflow, wakes, and operational conditions can be precisely measured in the repeatable and controllable environment of the wind tunnel, this first application serves to show that the correct error terms can indeed be identified. Next, the method is applied to a real wind farm situated in complex terrain. Here again, learning from operational data is shown to improve the prediction capabilities of the baseline model.

Knowledge of the flow at the rotor disk of each wind turbine in a wind power plant enables several applications, including wind farm control, the provision of grid services, predictive maintenance, the estimation of life consumption, the feed-in to digital twins, and power forecasting.

This paper describes a new method to improve a wind farm flow model directly
from standard operational data. The main idea pursued here is to use an
existing wind farm flow model to provide a baseline predictive capability;
however, as all models contain approximations and may lack the description of
some physical phenomena, the baseline model is improved (or “augmented”,
which is the term used in this work) by adding parametric correction terms.
In turn, these extra elements of the model are learned by using operational
data. The correction terms capture effects that are typically not present in
standard flow models (such as, for example, secondary steering,

Various wind farm flow models have been developed and are described in the
literature. Whereas direct numerical simulation (DNS) is still out of reach
for practical applications due to its overwhelming computational cost, large-eddy simulation (LES) methods are now routinely used for the modeling of wind
farm flows

Even though engineering models are constantly improved and refined

The idea of improving an existing model based on measurements is hardly new,
and it is actually an important topic in the areas of controls and system
identification. For example, in the field of wind farm flows, a Kalman
filtering approach has been proposed by

The contemporary literature – and not only in the field of wind energy –
indicates an increasing interest in data-driven approaches. Just to give one
single example related to wake modeling, a purely data-driven approach has
been recently described by

An alternative to the purely data-driven approach is presented in this work, where a reference baseline model is augmented with parametric error terms, which are then identified using data. The baseline model already includes prior knowledge based on physics, empirical observations, and experience. Therefore, even prior to the use of data, a minimum performance can be guaranteed. The choice of the error terms is driven by physics and by the knowledge of the limitations of the baseline model. Once the errors are identified using operational data, their inspection can clarify the causes of discrepancy between model and measurements. Eventually, this can be used to improve the underlying baseline model. Furthermore, by looking at the magnitude of the identified errors, significant deviations from the baseline model can be flagged to highlight issues with the model itself, the data, or the training process.

Finally, it should be noted that the identification of the error terms can be combined with the tuning of the parameters of the baseline model. This addresses yet another problem: tuning the parameters of a model that lacks some physics may lead to unreasonable values for the parameters, as the model is “stretched” to represent phenomena that it does not contain. By the proposed hybrid approach, the simultaneous identification of the parameters of the baseline model together with the ones of the error terms eases this problem, as unmodeled phenomena can be captured by the model-augmenting terms, thereby reducing the chances of nonphysical tuning of the baseline parameters.

The baseline model parameters and the extra correction terms have a different
functional form in the augmented governing equations. Hence, they should be
distinguishable from each other, as they imply different effects on the
model. However, as for many identification problems, it is in general not
possible to guarantee that all unknown parameters are observable and
noncollinear given a set of measurements and, hence, given a certain
informational content. To address this problem, the method proposed by

The paper is organized as follows. First, the baseline model is introduced
in Sect.

The proposed method is applied here to the baseline wake model of

In this work, the implementation uses the

Engineering wake models depend on a number of parameters, which should be
tuned in order to obtain accurate predictions. For the specific model used in
this work, these tunable factors are the wake parameters

In this work, the parameters are first set to an initial value, either taken
from the literature or identified with ad hoc measurements; these initial
values are then held fixed throughout the analysis.
Corrections to the initial values are then expressed as

The engineering model described earlier is a rather simple approximation of the flow through a wind power plant; its fidelity to reality, and consequently its predictive accuracy, is therefore bound to be limited. Even for more sophisticated future models, it is difficult to imagine that all relevant physics will ever be precisely accounted for. Moreover, even if such a model existed, in practice one might simply not have all the detailed information on the relevant boundary and operating conditions that it would require. For example, one might not know with precision the conditions of the vegetation around and within a wind farm, which affect roughness and, hence, the flow characteristics. In other words, it is safe to assume that all models are in error to some extent and probably always will be.

To address this problem, the model can be pragmatically augmented with correction terms. Here one could take two alternative approaches: either a generic all-encompassing error term is added to the model or “surgical” errors are introduced at ad hoc locations in the model to target specific presumed deficiencies. The first approach could be treated with a brute-force parametric modeling approach, for example by using a neural network. Here, the second approach was used, as it allows for more insight into the nature of the identified corrections. The specific parametric corrections used in the present paper are reviewed next. It is clear that these are only some of the many corrections that could be applied to the present baseline model, so that the following does not pretend to be a comprehensive treatment of the topic. Nonetheless, results indicate that some of these corrections are indeed significant and provide for a marked improvement of the baseline model.
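The idea of a "surgical" correction can be illustrated with a minimal sketch. The code below is not the formulation used in this paper: the cubic power law, the piecewise-linear shape functions, and all names (`hat_weights`, `baseline_power`, `augmented_power`, `p_speed`) are assumptions chosen only for illustration. It shows how a parametric error term can be inserted at one specific location of a baseline model (here, the ambient speed seen by a turbine at lateral position `y`), so that setting all correction parameters to zero recovers the baseline exactly.

```python
def hat_weights(y, nodes):
    """Piecewise-linear shape-function weights at lateral position y.

    Outside the node range, the value of the nearest end node is used.
    """
    if y <= nodes[0]:
        return [1.0] + [0.0] * (len(nodes) - 1)
    if y >= nodes[-1]:
        return [0.0] * (len(nodes) - 1) + [1.0]
    weights = [0.0] * len(nodes)
    for i in range(len(nodes) - 1):
        left, right = nodes[i], nodes[i + 1]
        if left <= y <= right:
            t = (y - left) / (right - left)
            weights[i] = 1.0 - t
            weights[i + 1] = t
            break
    return weights


def baseline_power(u, rho=1.225, area=1.0, cp=0.45):
    """Hypothetical baseline: power of a free-stream turbine at speed u."""
    return 0.5 * rho * area * cp * u ** 3


def augmented_power(u_ambient, y, nodes, p_speed):
    """Baseline with a relative speed correction interpolated at position y.

    p_speed holds one unknown correction parameter per node; these are the
    quantities that would be learned from operational data.
    """
    weights = hat_weights(y, nodes)
    correction = sum(w * p for w, p in zip(weights, p_speed))
    return baseline_power(u_ambient * (1.0 + correction))


nodes = [-1.0, 0.0, 1.0]           # assumed lateral node positions
p_zero = [0.0, 0.0, 0.0]           # no correction: baseline is recovered
p_bump = [0.0, 0.1, 0.0]           # 10 % speed-up at the central node
```

Because the correction enters at a known place in the model, an identified vector such as `p_bump` can be inspected directly as a lateral speed-up profile, which is the kind of insight that a generic black-box error term would not offer.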

To capture some of these effects, the model ambient flow speed

Local orographic effects and blockage may also induce variability in the
wind direction

The change of wind direction

This particular choice of the shape functions is motivated by the results
shown in Fig. 8b of

Note that the change in local wind direction also leads to a slight lateral
deflection of the nonuniform wind farm inflow introduced previously. More
precisely, for a turbine that is located

Figure

To account for such effects, the wake velocity

Effect of secondary steering on the trajectory of a downstream
turbine.

The parameters of the baseline model and of its correction terms are
identified with the method developed by

The formulation is based on the classical likelihood function, which describes the probability that a given set of noisy observations can be explained by a specific set of model parameters. By numerically maximizing this function, a set of parameters is identified that most probably explains the measurements. Bound constraints are used to guide the process and ensure convergence to meaningful results.
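As a hedged illustration of bound-constrained maximum likelihood estimation, the sketch below fits a single parameter of a toy exponential-decay model to observations with Gaussian noise of known standard deviation. The toy model, the function names, and the use of a golden-section search are assumptions made for brevity; the optimizer and the multi-parameter wind farm model used in this work are different.

```python
import math


def neg_log_likelihood(k, xs, ys, sigma):
    """Gaussian negative log-likelihood of decay rate k, up to a constant."""
    residuals = (y - math.exp(-k * x) for x, y in zip(xs, ys))
    return 0.5 * sum(r * r for r in residuals) / sigma ** 2


def maximize_likelihood(xs, ys, sigma, k_lb, k_ub, tol=1e-8):
    """Golden-section search for the likelihood maximum within [k_lb, k_ub].

    Assumes the negative log-likelihood is unimodal on the interval.
    The bounds are respected by construction.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = k_lb, k_ub
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if neg_log_likelihood(c, xs, ys, sigma) < neg_log_likelihood(d, xs, ys, sigma):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Noise-free synthetic observations generated with a true rate of 0.3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [math.exp(-0.3 * x) for x in xs]

k_free = maximize_likelihood(xs, ys, sigma=0.01, k_lb=0.0, k_ub=5.0)
k_bounded = maximize_likelihood(xs, ys, sigma=0.01, k_lb=0.5, k_ub=2.0)
```

In the first call the bounds are inactive and the estimate recovers the true value; in the second call the true value lies outside the admissible range, so the estimate settles on the lower bound.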

The accuracy with which the parameters can be estimated depends on how flat
the likelihood function is with respect to changes in the parameters. For
example, a flat maximum of the function implies that different nearby values
of the model parameters are associated with similar values of the likelihood.
These characteristics of the solution space are captured by the Fisher
information matrix, which can be interpreted as a measure of the curvature of
the likelihood function. Furthermore, it can be shown that the variance of
the estimates is bounded from below (Cramér–Rao bound) by the inverse of the
Fisher matrix

To overcome this limitation of the classical maximum likelihood formulation,
following

As shown later on, this approach achieves multiple goals: it allows one to successfully solve a maximization problem with many free parameters, some of which might be interdependent or not observable in a given data set; it reduces the problem size, retaining only the orthogonal parameters that are indeed observable; and it highlights, through the singular vectors, the interdependencies that may exist among some parameters of the model, which provides a useful interpretation tool that may guide the reformulation of parts of the model and its correction terms.
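The role of the orthogonal parameters can be illustrated with a deliberately small example. The sketch below assumes two physical parameters with identical output sensitivities, so that they are perfectly collinear, and uses the closed-form eigendecomposition of the symmetric 2×2 Fisher matrix (which, for a symmetric positive semidefinite matrix, coincides with its SVD). All function names and the variance threshold are hypothetical, and the notation does not follow that of the paper.

```python
import math


def fisher_matrix(sensitivities, sigma):
    """Entries (a, b, c) of F = sum_i s_i s_i^T / sigma^2 = [[a, b], [b, c]]."""
    a = sum(s[0] * s[0] for s in sensitivities) / sigma ** 2
    b = sum(s[0] * s[1] for s in sensitivities) / sigma ** 2
    c = sum(s[1] * s[1] for s in sensitivities) / sigma ** 2
    return a, b, c


def decompose(a, b, c):
    """Eigenvalues (largest first) and unit eigenvectors of [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return [mean + radius, mean - radius], [v1, v2]


def observable_directions(eigenvalues, max_variance):
    """Indices of orthogonal parameters whose variance bound is acceptable.

    The variance lower bound of each orthogonal parameter is taken as the
    inverse of the corresponding eigenvalue of the Fisher matrix.
    """
    kept = []
    for i, lam in enumerate(eigenvalues):
        variance_bound = 1.0 / lam if lam > 0.0 else math.inf
        if variance_bound <= max_variance:
            kept.append(i)
    return kept


# Each row holds (dy/dp1, dy/dp2) for one observation; both columns are equal,
# so only the sum p1 + p2 can be identified from these data.
sens = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
eigenvalues, directions = decompose(*fisher_matrix(sens, sigma=1.0))
kept = observable_directions(eigenvalues, max_variance=0.1)
```

In this example only the first direction, proportional to (1, 1), is retained: the data constrain the sum of the two parameters, whereas their difference has an unbounded variance and is dropped from the problem. The retained eigenvector also makes the interdependency between the two physical parameters explicit.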

A steady-state wind farm model can be mathematically expressed as

Given a set

To ensure reasonable and physically viable solutions, parameters can be
forced to stay within predefined upper (subscript ub) and lower (subscript
lb) bounds, by adding the corresponding inequality constraints

The Fisher information matrix

When some parameters are highly correlated or have large variance, the problem is ill-posed: it might exhibit sluggish convergence, or no convergence at all, and small changes in the inputs may lead to large changes in the estimates. Such situations are difficult to solve in physical space, because parameters are typically coupled together to some degree through the model.

To untangle the parameters, one may resort to the SVD

The Fisher matrix

By using Eq. (

The physical parameters

To remove parameters that cannot be estimated with sufficient accuracy,
matrix

In some cases, it may be useful to increase the importance of some
measurements in the parameter estimation problem. This can be readily
obtained by simply treating an observation with weight

The proposed method is first applied in Sect.

Whether identified model corrections are indeed physical or only an artifact of the model–measurement mismatch is difficult to prove in general. From this point of view, wind tunnel experiments provide a unique opportunity to verify the concept proposed in this paper. Indeed, the overall flow within a cluster of turbines can be measured with good accuracy, and the experiments can be repeated in multiple desired operating conditions. The aim of this section is then to show that, even in the presence of multiple possibly overlapping model terms, the correct improvements to a baseline model can be learned from operational data only.

The experimental setup is composed of a scaled cluster of three G1 wind
turbines, each of them equipped with active yaw, pitch, and torque control.
The turbines were operated in the boundary layer test section of the wind
tunnel of the Politecnico di Milano. Details on the models and the wind
tunnel are reported, among other publications, in

The turbines are labeled WT1, WT2, and WT3, starting from the most upstream
one and moving downstream. The machines are mounted on a turntable, whose
rotation is used to change the wind direction with respect to the wind farm
layout. In the nominal configuration, i.e. for a turntable rotation

Wind farm layout for a null turntable rotation, looking down onto the wind tunnel floor.

A pitot probe was placed at hub height,

The yaw angle

Figure

View looking downstream of the cluster of three G1 turbines.

The ambient wind speed

The FLORIS model implementation used in this work is the one available online

Initial FLORIS parameters for the G1 turbine.

Figure

Power and thrust coefficients vs. wind speed for the G1 turbine.

The ambient wind speed was determined from the pitot tube. It was observed
that, by using this value, the power of a free-stream turbine predicted by
the FLORIS model was slightly underestimated, most probably due to the
sheared flow. To correct for this effect, measurements provided by the pitot
tube were scaled by the factor

To initially assess the role of the various parameters, a ranking analysis
was conducted. The parameters were clustered in sets, depending on their role
in the model. A first identification was performed using all parameter sets,
yielding the presumed best value, denoted

Definition of the parameters, together with their initial values, lower and upper bounds, and identified values.

All augmentation terms described in Sect.

Figure

Relative increase in the optimization cost function when eliminating one parameter set at a time.
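The ranking procedure can be sketched as follows. The code assumes that a routine is available (here the hypothetical `fit_cost`) that performs the identification using only a given collection of parameter sets and returns the converged cost; the normalization by the cost obtained with all sets active is also an assumption. The toy cost function and its numbers are purely illustrative and are not results of this work.

```python
def rank_parameter_sets(fit_cost, set_names):
    """Rank parameter sets by the relative cost increase caused by removing each.

    fit_cost(active_sets) must return the optimal cost obtained when only the
    parameter sets in active_sets are free to be identified.
    """
    all_sets = frozenset(set_names)
    full_cost = fit_cost(all_sets)
    increases = {}
    for name in set_names:
        reduced_cost = fit_cost(all_sets - {name})
        increases[name] = (reduced_cost - full_cost) / full_cost
    return sorted(increases.items(), key=lambda item: item[1], reverse=True)


# Toy stand-in for the identification: each set, when removed, leaves a fixed
# amount of additional unexplained error on top of a residual cost of 1.0.
EXPLAINED = {"inflow_speed": 4.0, "secondary_steering": 1.5, "wake_parameters": 0.5}


def toy_fit_cost(active_sets):
    return 1.0 + sum(v for k, v in EXPLAINED.items() if k not in active_sets)


ranking = rank_parameter_sets(toy_fit_cost, list(EXPLAINED))
```

Sets at the top of the returned list are those whose removal degrades the fit the most and are therefore the most relevant for the data set at hand.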

A total of 451 observations were available, including 11 different turntable
positions and thus wind farm layouts, with turbine yaw misalignments ranging
from

Among all the available measurements gathered at each operating condition,
only the steady-state power of the wind turbines was utilized, mimicking what
could be done at full scale in the field using SCADA data. The model outputs

The threshold of the highest acceptable standard variance

Variance of the orthogonal parameters before

The constrained optimization problem (

Transformation matrix

Interestingly, the

Identified nonuniform inflow speed augmentation term (solid
line) and associated standard deviation (whiskers). Hot-wire measurements
at different heights above the floor are shown in thin solid lines. The
upstream turbine (WT1)

Correlation coefficients

Table

Figure

The identified secondary steering augmentation term is visualized in
Fig.

Identified wind direction change

Wake profiles 5

The validity of the augmentation terms, identified as explained, was assessed
by comparing the results of the simulation model with experimental wake
measurements from a different test campaign. The setup was identical to the
one considered here, except for the fact that only the first two upstream
wind turbines were installed in the wind tunnel. At the downstream distance
where the third wind turbine should have been installed, flow velocity
measurements were obtained at turbine hub height using hot-wire probes.
Figure

In the left subplots, the improvements of the augmented model with respect to
the baseline FLORIS are exclusively due to the inflow correction, as the
upstream turbine is aligned with the flow and therefore there are no
secondary steering effects. In the right subplots, the upstream turbine is
misaligned (

The turbine power coefficients are computed as

Error distributions for each turbine for all tested configurations, for the baseline FLORIS model (black dashed line), the 11-parameter augmented model (red solid line), and the 27-parameter augmented model (red dotted line).

Note that the FLORIS error distribution shows two peaks for WT1 and WT3, indicating the presence of two uncorrelated errors. The 11-parameter model removes these peaks, even though a smaller pair of peaks remains for WT2 and WT3, indicating additional errors that only the 27-parameter augmented model is able to capture.

Here again the trend is clear: the addition of nonuniform speed and secondary steering substantially increases the accuracy of the baseline model, with additional small – but not insignificant – gains offered by the additional correction terms. Finally, there is still room for improvement, possibly through extra correction terms not yet explored.

Turbine specifications.

In this section the model augmentation and identification method is applied to a full-scale wind farm, to test its applicability and usability in a realistic scenario. In such conditions, it is often difficult to assess whether the identified model corrections are indeed physical or not, due to a lack of knowledge of the actual ground truth. To deal with this problem, the classical approach of splitting the data set was used here: first, a relatively small subset of measurements is used for model and error identification; then, the rest of the data set is used for a verification of the generality of the identified model and of its improved performance with respect to the baseline one.
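A minimal sketch of such a split is given below. It assumes that each record carries a wind speed and a measured power, and that the identification subset is selected by a wind speed range; the selection criterion, the record layout, and the function names are assumptions made for illustration only.

```python
import math


def split_by_wind_speed(records, u_min, u_max):
    """Split records into identification (inside range) and verification sets."""
    identification = [r for r in records if u_min <= r["wind_speed"] < u_max]
    verification = [r for r in records if not (u_min <= r["wind_speed"] < u_max)]
    return identification, verification


def rms_error(records, predict):
    """Root-mean-square power error of a model on a set of records."""
    squared = sum((predict(r) - r["power"]) ** 2 for r in records)
    return math.sqrt(squared / len(records))


# Hypothetical 10 min mean records (arbitrary units).
records = [
    {"wind_speed": 5.0, "power": 100.0},
    {"wind_speed": 7.0, "power": 300.0},
    {"wind_speed": 9.0, "power": 600.0},
    {"wind_speed": 11.0, "power": 900.0},
]
identification, verification = split_by_wind_speed(records, u_min=6.0, u_max=10.0)
```

Evaluating `rms_error` on `verification` for both the baseline and the augmented model then indicates whether the identified corrections generalize to conditions that were not used for learning.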

The onshore wind farm is situated close to Sedini, on the Italian island of
Sardinia, and it consists of

The 3D view of the Sedini wind farm with terrain elevation, as seen from

The wind farm is located at a rather complex site, as shown in
Fig.

Top view of the Sedini wind farm with turbine identifiers. The
gray arrows indicate the

Historical

Scaled number of measurement data points (10 min mean) within each speed and direction bin.

As no direct measurements of ambient conditions were available, the method
described by

Here again the FLORIS implementation was based on the version available
online

Initial FLORIS parameters for the Sedini wind farm.

The required turbine power and thrust versus wind speed curves were provided
by the turbine manufacturer. The vertical shear exponent of the inflow was
set to

Correlation between power output and hub height with respect to SL.

The different turbine foundation heights were accounted for by accordingly
increasing the tower heights, using the lowest foundation height as reference
(turbine A1-02). Indeed, power measurements of the upstream turbines show a
correlation with the actual turbine hub height with respect to sea level
(SL), as shown in Fig.

As for the wind tunnel experiments, here again a first analysis was aimed at
ranking the various correction terms. However, since the turbines were
operated with a conventional wind-aligned strategy, secondary steering
corrections were neglected. The ranking is based on data points in the range

Figure

Relative increase in the optimization cost function for the Sedini wind farm when eliminating one parameter set at a time.

On the other hand, the additional model augmentation parameters

Given these results, the rest of the analysis is based only on the subset of
parameters

Definition of the parameters, together with their lower and upper bounds, and initial and identified values. Bold italic numbers indicate vectors containing that number repeated as many times as the vector length.

The definitions of the correction parameters, together with their bounds and
converged values, are reported in Table

To identify the 40 parameters of Table

Identified inflow augmentation parameters

The model outputs

The identified optimal parameter values

Power coefficient of each individual wind turbine, as indicated
by the subplot title, as a function of wind direction

Figure

Even though the baseline FLORIS power estimates already exhibit a reasonable
correlation with the measurements for many turbines and wind directions, a
significant improvement is achieved by the augmented model. Note that for

Error probability density functions for different wind speed ranges.

The same results of Fig.

This paper has presented a new method to calibrate and augment parametric wind farm models. The proposed approach builds on the vast body of knowledge and experience embedded in available reduced wind farm flow models. However, recognizing that any such model will always have only a limited prediction accuracy, the present approach augments a baseline model with extra ad hoc terms designed to correct some of its presumed specific deficiencies. These additional elements of the model are then learned from operational data. Optionally, the baseline model parameters can also be tuned within a single integrated process. By design, the method has been exclusively based here on SCADA power measurements; therefore, it is readily applicable to most operational wind farms, whenever such data are available. However, the concept of model augmentation is very general and could clearly also be used with other measurements.

To limit the number of free parameters and to overcome the fact that the identification problem can be over-parameterized and hence ill-posed, a parameter transformation into an orthogonal space has been used. Thereby, only parameters that are sufficiently visible within a given data set enter into the identification process.

The method was first applied to a large data set obtained with scaled wind turbines operating in a boundary layer wind tunnel. Thereby, it was shown that a correct learning of the extra modeling terms is achieved. These conclusions are made possible by the fact that, in this case, the flow and wake characteristics are known with good accuracy. Next, the method was tested on a real wind farm, in a realistic and highly complex situation.

Based on the results shown here, the following conclusions can be drawn.

Within the wind tunnel environment, a correct learning of nonuniform wind farm inflow speed and of secondary steering effects has been achieved. In particular, the latter shows a good match with detailed wake measurements in wind-misaligned conditions. It is remarkable, and very promising, that such detailed features of the solution could be inferred purely from operational power data, even when starting from a baseline model that does not at all consider secondary steering.

The application to field data has shown that, as expected for the complex-terrain site analyzed here, orographic effects play a driving role. A marked model improvement could be observed, even in conditions where the model was used for extrapolating outside of the training conditions. It is worth noting that, in many practical onshore applications, orographic effects will be present, and the fact that one can learn them from simple and readily available operational data is very encouraging. Again, it should be explicitly pointed out that the baseline model did not include any orographic corrections.

It has been shown that model tuning and the learning of extra correction terms can be achieved simultaneously. This reduces the risk of adapting the baseline parameters beyond their reasonable limits, driven by unmodeled physics.

Although the augmented models show a much improved accuracy with respect to the baseline, some model mismatch still remains. These remaining errors may often be caused by issues in the data rather than in the model; nevertheless, additional improvements are thought to be possible.

Future work will apply the proposed method to other wind farms, to increase confidence in the obtained results. From longer and richer data sets, possibly in conjunction with meteorological reanalyses, it is presumed that yearly and seasonal variations could be observed. The integration of CFD analyses can be used to support and confirm the identification of orographic effects. Attention should also be paid to improved and additional forms of model corrections, including wake overlap models. Finally, it is worth pointing out again that an improved knowledge of the flow within a wind farm finds applicability in a potentially large range of digitally driven applications, including wind farm control, lifetime estimation, power forecasting, predictive maintenance, and others. Therefore, it is expected that methods for high-accuracy flow predictions in wind farms will be the subject of significant future research efforts.

Figures

For the first upstream wind turbine, WT1, the baseline FLORIS shows
significant errors depending on the turntable position. For

The power of WT2, shown in Fig.

The power of WT3, reported in Fig.

Wind turbine WT1. Each cluster of three subplots represents a
unique turntable position, as indicated by the title and the wind farm
layout sketch. Left subplot: turbine power coefficient

Wind turbine WT2. Each cluster of three subplots represents a
unique turntable position, as indicated by the title and the wind farm
layout sketch. Left subplot: turbine power coefficient

Wind turbine WT3. Each cluster of three subplots represents a
unique turntable position, as indicated by the title and the wind farm
layout sketch. Left subplot: turbine power coefficient

A MATLAB implementation of the wind farm model can be obtained by contacting the authors.

JS conducted the main research work. CLB developed the core idea of model augmentation, its formulation and the overall solution methodology and supervised the whole research. JS and CLB wrote the manuscript. BS preprocessed the field measurements. FC was responsible for the execution of the wind tunnel tests and the elaboration of the experimental results. All authors provided important input to this research work through discussions, feedback, and improving the manuscript.

The authors declare that they have no conflict of interest.

The authors express their gratitude to Enel Green Power S.p.A., which granted access to the field data, and to Stefan Kern of GE Renewable Energy for helping with the data post-processing. Special thanks go to Robin Weber of Technische Universität München and Stefano Cacciola, Alessandro Croce, Paolo Schito, and Alberto Zasso of Politecnico di Milano for their help in conducting the wind tunnel experiments.

This research has been supported by the Horizon 2020 Framework Programme, H2020 Energy (CL-Windcon, grant no. 727477) and the Bundesministerium für Wirtschaft und Energie (CompactWind II, grant no. 0325492G).

This paper was edited by Alessandro Croce and reviewed by two anonymous referees.