Online model calibration for a simplified LES model in pursuit of real-time closed-loop wind farm control

. Wind farm control often relies on computationally inexpensive surrogate models to predict the dynamics inside a farm. However, the reliability of these models over the spectrum of wind farm operation remains questionable due to the many uncertainties in the atmospheric conditions and tough-to-model dynamics at a range of spatial and temporal scales relevant for control. A closed-loop control framework is proposed in which a simpliﬁed model is calibrated and used for optimization in real time. This paper presents a joint state-parameter estimation solution with an ensemble Kalman ﬁlter at its core, which calibrates the surrogate model to the actual atmospheric conditions. The estimator is tested in high-ﬁdelity simulations of a nine-turbine wind farm. Exclusively using measurements of each turbine’s generated power, the adaptability to modeling errors and mismatches in atmospheric conditions is shown. Convergence is reached within 400 s of operation, after which the estimation error in ﬂow ﬁelds is negligible. At a low computational cost of 1 . 2 s on an 8-core CPU, this algorithm shows comparable accuracy to the state of the art from the literature while being approximately 2 orders of magnitude faster.


Introduction
Over the past decades, global awakening on climate change and the environmental, political and financial issues concerning fossil fuels have been catalysts for the growth of the renewable energy industry. As the primary energy demand in Europe is projected to decrease by 200 million tonnes of oil equivalent from 2016 to 2040, there is an additional shift in the energy source used to meet this demand (International Energy Agency, 2017). Shortly after 2030, onshore and offshore wind energy are projected to become the main source of electricity for the European Union. By then, about 80 % of all new capacity added is projected to come from renewable energy sources, enabled by a favorable political climate.
While these developments have clear benefits, an important problem with wind energy is that the rotational speed of most commercial turbines is decoupled from the electricity grid frequency via each turbine's power electronics (Aho et al., 2012). As the current grid-connected fossil fuel plants are replaced by non-synchronous renewable energy plants, the inertia of the electricity grid will decrease, making it less stable and more prone to machine damage and blackouts (Ela et al., 2014). Therefore, there is a strong need for wind farms and other renewable sources to provide ancillary grid services. Wind farm control aimed at increasing the grid stability is more commonly defined as active power control (APC). In APC, the power production of a wind farm is regulated to meet the power demand of the electricity grid, which may change from second to second.
Existing literature on wind farm control has mainly focused on maximizing the power capture (e.g., Rotea, 2014;Gebraad et al., 2016;Munters and Meyers, 2017). However, literature on APC has been receiving an increasing amount of attention (e.g., Van Wingerden et al., 2017;Boersma et al., 2017). The main challenges in wind farm control are the large time delays caused by the formation of wakes, the many uncertainties in the atmospheric conditions, and the questionable reliability of surrogate models over the wide spectrum of wind farm operation. (See Boersma et al. (2017) and Knudsen et al. (2015) for state-ofthe-art overviews of control and control-oriented modeling for wind farms.) While there has been success with modelfree methods for power maximization (e.g., Rotea, 2014), it is unclear to what degree such methods can be used for power forecasting. Furthermore, model-free methods typically have long settling times, making them intractable for APC. On the other hand, for model-based approaches, the aforementioned challenges make it impossible for any model to reliably provide power predictions in an open-loop setting. Hence, a model-based approach in which a surrogate wind farm model is actively adjusted to the present conditions is a necessity for reliable and computationally tractable APC algorithms. This closed-loop wind farm control framework, consisting of three components, is shown in Fig. 1.
The first component of the closed-loop framework is a computationally inexpensive surrogate model that accurately predicts the power production of the wind farm ahead of time, on a timescale relevant for control. The most commonly used surrogate models in wind farm control are steady-state models, which are heuristic and neglect all temporal dynamics . While some of these models have shown success in wind tunnel tests (e.g., Schreiber et al., 2017) and field tests (e.g., Fleming et al., 2017a, b) for power maximization, the actuation frequency is limited to the minutes timescale, since the flow and turbine dynamics are predicted on the minutes timescale. Furthermore, time-ahead predictions with these models are limited to the steady state, limiting their use for APC. There is a smaller yet significant number of dynamic surrogate wind farm models (e.g., Munters and Meyers, 2017;Boersma et al., 2018;Shapiro et al., 2017a) that attempt to include the dominant temporal dynamics inside the farm. These models can be used for control on the seconds timescale, and furthermore allow timeahead predictions, some even under changing atmospheric conditions. Specifically, the dynamic surrogate model employed in Shapiro et al. (2017a) is computationally feasible, but only models the flow in one dimension and furthermore allows no turbine yaw or changes in the wind direction, limiting its applicability. Furthermore, the dynamical model in Munters and Meyers (2017) has shown success for closedloop control applications, but it is too computationally expensive for any kind of real-time control, and the authors present their results solely as a benchmark case. In the work presented here, the model described in Boersma et al. (2018) is used, which is a two-dimensional (2-D) large eddy simulation (LES) code with wind farm control as its main objective. This dynamic surrogate model, named "WindFarmSimulator" (WFSim), includes yaw and axial induction actuation, turbine-induced turbulence effects, and spatially and tempo-rally varying inflow profiles, with a moderate computational cost.
The second component of the closed-loop framework is an algorithm that adjusts the surrogate model's parameters to improve its accuracy online using flow and/or turbine measurements (e.g., supervisory control and data acquisition, SCADA, data; lidar measurements; meteorological masts). In terms of control, this turns into a joint estimation problem, in which both the model state and a subset of model parameters are estimated online. Currently, the optimization algorithms presented in Munters and Meyers (2017) and Vali et al. (2017) have assumed full state knowledge, conveniently ignoring the step of model adaptation. Literature on state reconstruction and model calibration for dynamical wind farm models is sparse, limited to linear low-order models and/or common estimation algorithms.  designed a traditional Kalman filter (KF) for their low-fidelity model, showing marginal improvements compared to optimization using a static model. Shapiro et al. (2017a) present a one-dimensional dynamic wake model used with receding horizon control for secondary frequency regulation, using an estimation algorithm following Doekemeijer et al. (2016). Furthermore, Iungo et al. (2015) used dynamic mode decomposition to obtain a reduced-order model of the wind farm dynamics, which was then combined with a traditional KF for state estimation. To the best of the authors' knowledge, none of these methods have explored more sophisticated models such as WFSim, and often only use simple state estimation algorithms that are lacking in terms of accuracy and computational tractability.
The third component of the closed-loop framework is an optimization algorithm, which typically is a gradient-based or nonlinear optimization algorithm (e.g., Gebraad et al., 2016) for steady-state models, and a predictive optimization method for dynamical models (e.g., Goit and Meyers, 2015;Vali et al., 2017;Siniscalchi-Minna et al., 2018). A more in-depth discussion on optimization algorithms is out of the scope of this article.
The focus of this work is on a model adaptation algorithm for WFSim, which balances estimation accuracy and computational complexity. In previous work (Doekemeijer et al., 2016, state estimation using flow measurements downstream of each turbine has shown success using an ensemble KF (EnKF), with a computational cost several orders of magnitude lower than traditional KF methods. The main contributions of this article specifically are the additional adaptation to a mismatch in atmospheric conditions (specifically the ambient wind speed and turbulence intensity), the option to use turbine's power signals in addition to, or instead of, flow measurements, a further reduction in the computational complexity, Wind Energ. Sci., 3, 749-765, 2018 www.wind-energ-sci.net/3/749/2018/ Figure 1. Closed-loop wind farm control framework. Measurements z (e.g., SCADA or lidar data) are fed into the controller. First, the state of the surrogate model x is estimated to represent the actual atmospheric and turbine conditions inside the wind farm. Secondly, using the calibrated model, an optimization algorithm determines the control policy (e.g., yaw angles) for all turbines q. This control policy may be a set of constant operating points, but can also be time-varying, depending on whether the surrogate model is time-varying and the employed optimization algorithm. The photograph of the wind farm is from Christian Steiness.
and a comparison of the EnKF with the state of the art in the literature.
The structure of this article is as follows. In Sect. 2, the surrogate model will be introduced. In Sect. 3, a time-efficient, online model calibration algorithm for dynamical wind farm models is detailed. This calibration algorithm is validated and compared with standard algorithms in the literature in high-fidelity simulations in Sect. 4. The article is concluded in Sect. 5.

The surrogate model
The framework of Fig. 1 requires a surrogate model of the wind farm. In this work, that is the WindFarmSimulator (WFSim) model presented by Boersma et al. (2018). This model is particularly suited as it includes both yaw and axial induction actuation and yields a relatively high accuracy with a relatively low computational cost 1 . The aim of this section is to give a summary of the surrogate model, rather than a full derivation and motivation of the assumptions made. The reader is referred to Boersma et al. (2018) for more information.
Fundamentally, WFSim is based on the 2-D unsteady incompressible Navier-Stokes (NS) equations. The surrogate model can be completely described by the flow and rotor dynamics in a horizontal plane at hub height. WFSim deviates from a traditional 2-D NS model in two ways. Firstly, the diffusion term is neglected, as it plays a negligible role due to the low viscosity of air. Secondly, the dissipation term in the lateral direction in the continuity equation is multiplied by a factor of 2 to approximate flow dissipating in the ver-tical flow dimension. Other vertical flow contributions such as vertical meandering and shear are neglected. The subgridscale model is formulated using an eddy-viscosity assumption in combination with Prandtl's mixing length model. The mixing length is parameterized as a function of the spatial location, increasing linearly with distance from the downstream rotor, starting at zero at distance d downstream and peaking at distance d, where s defines the slope of the mixing length. Basically, the larger s , the quicker wakes recover to their free-stream properties. Furthermore, the turbines are modeled using the non-rotating (static) actuator disk model, projected onto the 2-D plane at hub height. The turbine is assumed to be a rigid object applying a 2-D force vector on the flow. Both the turbine forcing term and the turbine power output are scaled by tuning factors c f and c p , respectively, to account for unmodeled effects. Together with the three parameters from the turbulence model, this leads to a total of five tuning parameters.
These NS equations are solved over a spatially and temporally discretized domain (Boersma et al., 2018). Dirichlet boundary conditions for the longitudinal and lateral velocity are applied on one side of the grid for inflow, while Neumann boundary conditions are applied on the remaining sides for the outflow. The surrogate model reduces to a nonlinear discrete-time deterministic state-space model as where x k ∈ R N is the system state at time k, which is a column vector containing the collocated longitudinal flow velocity at each cell in the domain u k ∈ R N u , the lateral flow velocity at each cell in the domain v k ∈ R N v , and the pressure term at each cell in the domain p k ∈ R N p , with N = N u + N v + N p and N u ≈ N v ≈ N p ≈ 1 3 N. The state x k is formulated as Empirically, good results have been achieved with cell dimensions of about 30-50 m in width and length, resulting in N with a typical value on the order of 10 3 -10 4 for six-to nine-turbine wind farms (e.g., Vali et al., 2017;Doekemeijer et al., 2016Doekemeijer et al., , 2017Boersma et al., 2018). Such a number of states may seem very small for LES simulations, yet is very high for control purposes. Furthermore, q k ∈ R O includes the system inputs, i.e., the turbine control settings: the turbine yaw angles γ i and the thrust coefficients C T i for i = 1, . . ., N T , with N T being the number of turbines. The system outputs z k ∈ R M are defined by sensors. It can include, among others, flow field measurements (z k ⊂ x k ) and power measurements. We define the integer M u,v ∈ Z with 0 ≤ M u,v ≤ M as the total number of flow field measurements. The nonlinear functions f and h are the state forward propagation and output equation, respectively. The computational cost may vary from 0.02 s for twoturbine wind farms with N = 3 × 10 3 states (e.g., in Doekemeijer et al., 2017) to 1.2 s for N = 1 × 10 5 states for medium-sized wind farms (e.g., in Boersma et al., 2018), for a single time-step forward simulation on a single desktop CPU core. The computational complexity of the model is what motivates the use of time-efficient estimation algorithms in this work, and time-efficient predictive control methods for optimization in related work (Vali et al., 2017). Here, the limits of computational cost are explored to maximize model accuracy while still allowing real-time control. Note that research on the computational feasibility of optimization algorithms using WFSim is ongoing.

Online model calibration
Due to the limited accuracy of surrogate wind farm models, and due to the many uncertainties in the environment, surrogate models often yield predictions with significant uncertainty in the wind flow and power capture inside a wind farm. Since control algorithms largely rely on such predictions, this may suppress gains or even lead to losses inside a wind farm. Unfortunately, higher-fidelity models are computationally prohibitively expensive for control applications. Hence, lower-fidelity surrogate models are calibrated online using readily available measurement equipment.
In this section, first the challenges for real-time model calibration for the surrogate "WFSim" model described in Sect. 2 will be highlighted in Sect. 3.1. Secondly, a mathematical framework for recursive model state estimation will be presented in Sect. 3.2. Thirdly, a number of nonlinear state estimation algorithms are presented in Sects. 3.3 to 3.5, building up from the industry standard to the state of the art in the literature. Finally, a robust and computationally efficient model calibration solution is synthesized in Sect. 3.6, which allows for the simultaneous estimation of the boundary conditions, model parameters, and the model states of WFSim in real time using readily available measurements from the wind farm.
Note that we will henceforth refer to the estimation of x as state(-only) estimation. The estimation of both model states and model parameters such as s is referred to as (joint) stateparameter estimation.

Challenges
Online model calibration for WFSim is challenging for a number of reasons. First of all, the model is nonlinear, and thus the common linear estimation algorithms cannot be used without linearization, which limits accuracy (Boersma et al., 2018). Secondly, an estimation solution relying on WFSim is sensitive to instability when the model state sufficiently deviates from the continuity equation. Finally, the surrogate model typically has on the order of N ∼ 10 3 -10 4 states, which is extraordinarily high for control applications. However, real-time estimation is a necessity for real-time modelbased control, and thus one needs to find a trade-off between accuracy while guaranteeing state updates at a low computational cost.

General formulation
This section summarizes the basics of the KF, which is the literature standard for state estimation in control. The goal of a KF is to recursively estimate the unmeasured states of a dynamical system through noisy measurements. Assumed here is a system (the wind farm) represented mathematically by a discrete-time stochastic state-space model with additive noise, where k is the time index, x ∈ R N is the unobserved system state, z ∈ R M are the measured outputs of the system, q ∈ R O and w ∈ R N are the controllable inputs and process noise, respectively, that drive the system dynamics, and v ∈ R M is measurement noise. Furthermore, we assume w and v to be zero-mean white Gaussian noise with covariance matrices with E the expectation operator. Estimates of the state x k , denoted byx k|k , are computed based on measurements from the real system. Here,x k| means an estimate of the state Wind Energ. Sci., 3, 749-765, 2018 www.wind-energ-sci.net/3/749/2018/ vector x at time k, using all past measurements and inputs Z aŝ State estimates are based on the internal model dynamics and the measurements, weighted according to their probability distributions. We aim to find an optimal state estimate, in which optimality is defined as unbiasedness, E[x k −x k ] = 0, and when the variance of any linear combination of state estimation errors (e.g., the trace of In reality, the assumed model described by f and h always has mismatches with the true system, and the assumptions in Eq. (3) often do not hold. Further, the matrices Q k , R k , and S k are usually not known and are rather considered to be tuning parameters, used to shift the confidence levels between the internal model and the measured values. For R Q, estimations will heavily rely on the measurements, while for Q R, estimations will mostly rely on the internal model. Kalman filtering remains one of the most common methods of recursive state estimation. KF algorithms typically consist of two steps.

A state and output forecast, including their uncertainties
(covariances): In Eqs. (5) and (6),x k| andẑ k| are the forecasted system state vector and measurement vector, respectively.
2. An analysis update of the state vector, where the measurements are fused with the internal model: in Eq. (10) is the pseudo-inverse of P z k|k−1 , since this matrix is not necessarily invertible. Traditionally, state estimation for linear dynamic models is done using the linear KF (Kalman, 1960). However, this is not a viable option here, as the surrogate model is nonlinear. Rather, a number of nonlinear KF variants are looked upon.

Extended Kalman filter (ExKF)
Linearization of the surrogate model is the most popular and straight-forward solution to the issue of model nonlinearity, as done in the extended KF (ExKF). The ExKF has shown success in academia and industry (Wan and Van Der Merwe, 2000) and is perhaps the most popular nonlinear KF. However, it has a number of disadvantages. As described in Sect. 3.1, model linearization is troublesome. Furthermore, for surrogate models with many states such as WFSim, the ExKF has an additional challenge: computational complexity. The operation in Eq. (10) includes a matrix inversion with a computational complexity of O(M 3 ), and the ExKF furthermore includes two matrix multiplications each with a complexity of O(N 3 ). As there are significantly fewer measurements than states (M N) for the problem at hand, these matrix multiplications dominate the computational cost. The ExKF has a CPU time on the order of 10 1 s for a two-turbine wind farm, which may be too large for our purposes. To reduce computational cost in the ExKF, the surrogate model and/or the covariance matrix P have to be simplified. This is not further explored here. Instead, two KF approaches will be explored that directly use the nonlinear system for forecasting and analysis updates. Doing so, we circumvent the problems with linearization, and additionally better maintain the true covariance of the system state.

The unscented Kalman filter (UKF)
The unscented Kalman filter (UKF) relies on the so-called "unscented transformation" to estimate the means and covariance matrices described by Eqs. (5) to (9). The conditional state probability distribution of x k knowing Z k is again assumed to be Gaussian. In the UKF, firstly a number of sigma points (also referred to as "particles") are generated such that their mean is equal tox k|k and their covariance is equal to Cov (x k , x k ). Secondly, each particle is propagated through the nonlinear system dynamics (f , h). Thirdly, the mean and covariance of the forecasted state probability distribution is again approximated by a weighted mean of these forecasted sigma points (Wan and Van Der Merwe, 2000).
Mathematically, we define the ith particle as ψ i k| ∈ R N , which is a realization of the conditional probability distribution of x k given Z . The UKF follows a very similar forecast and analysis update approach as the traditional KF in Eqs. (5) to (12), yet applied to a finite set of particles (Wan and Van Der Merwe, 2000).
1. For the forecast step, a particle-based approach is taken. i. A total of Y = 2N + 1 particles, with N equal to the state dimension, are (re)sampled to capture the mean and covariance of the conditional state probability distribution p x k−1 |Z k−1 by where λ = α 2 (N + κ) − N is a scaling parameter, α determines the spread of the particles around the mean, and κ is a secondary scaling parameter typically set to 0 (Wan and Van Der Merwe, 2000). The vector ψ k−1|k−1 is the estimated state vector calculated as where the weight of each particle's mean w i mean and covariance w i cov. is given by and β is used to incorporate prior knowledge on the probability distribution. In this work, β = 2 is assumed, which is stated to be optimal for Gaussian distributions (Wan and Van Der Merwe, 2000).
ii. Each particle is propagated forward in time using the expectation of the nonlinear model as where ζ i k| is defined as the system output corresponding to the particle ψ i k| . iii. The expected state ψ and expected output ζ are calculated aŝ and the covariance matrices are (re-)estimated from the forecasted ensemble by 2. For the analysis step, one can apply the same equations as in Eqs. (10) to (12).
The UKF has been shown to consistently outperform the ExKF in terms of accuracy, since it uses the nonlinear model for forecasting and covariance propagation. However, this does come at an increased computational cost. Namely, Y = 2N + 1 particles are required to capture the mean and covariance of the conditional state probability distribution. This implies that 2N + 1 function evaluations are required for each UKF update. Even for a two-turbine wind farm in WFSim, a computational cost of 1 × 10 2 s per iteration (k → k + 1) would not be surprising. While Eq. (14) can easily be parallelized, computational complexity remains troublesome, especially for larger wind farms. The issue of computational complexity is tackled by the ensemble KF.

The ensemble Kalman filter (EnKF)
The ensemble Kalman filter (EnKF; Evensen, 2003) is very similar to the UKF in that it relies on a finite number of realizations (the "sigma points" or "particles" in the UKF) to approximate the mean and covariance of the conditional probability distribution of x k knowing Z k . However, whereas the UKF relies on a systematic way of distributing the particles such that the mean and covariance of the conditional probability distribution p [x k |Z k ] are equal to that of the particles, the EnKF relies on random realizations, without guarantees that the mean and covariance are captured accurately. However, the EnKF has been shown to work well in a number of applications, with typically far fewer particles than states, i.e., Y N (e.g., Houtekamer and Mitchell, 2005;Gillijns et al., 2006). The forecast and update step are very similar to that of the UKF.
1. In the UKF the particles are redistributed at every time step, in contrast to the EnKF. Rather, the EnKF propagates the particles forward without redistribution. We define the ith particle as ψ i k| ∈ R N , which is a Wind Energ. Sci., 3, 749-765, 2018 www.wind-energ-sci.net/3/749/2018/ realization of the conditional probability distribution p [x k |Z ]. The forecast steps are i. Each particle is propagated forward in time using the nonlinear system dynamics, and with the realizations of noise terms w and v denoted bŷ w i k−1 ∈ R N andv i k ∈ R M , generated using MAT-LABs randn(. . . ) function.
ii. The expected state and output are identically calculated as in the UKF using Eq. (15) with w i mean = (Y − 1) −1 . The covariance matrices are (re-)estimated from the forecasted ensemble by 2. For the analysis step, one applies Eq. (10) to determine the Kalman gain L k . Then, each particle is updated individually as Note that, in contrast to the ExKF and the UKF, the state covariance matrix P x (see Eqs. 7 and 12) need not be calculated explicitly in the EnKF. This, in combination with the small number of particles Y N , is what makes the EnKF computationally superior to the UKF (and often also computationally superior to the ExKF). However, this reduction in computational complexity comes at a price. The disadvantages of the EnKF are discussed in the next section.

Challenges in the EnKF for small number of particles
The caveat to representing the conditional state probability distribution with fewer particles than states, Y N, is the formation of inbreeding and long-range spurious correlations (Petrie, 2008). The former, inbreeding, is defined as a situation where the state error covariance matrix P x is consistently underestimated, leading to state estimates that incorrectly rely more on the internal model. One straight-forward method to address this is called "covariance inflation", in which P x (or rather, the ensemble from which P x is calculated) is "inflated" to correct for the underestimated state uncertainty (Petrie, 2008). Mathematically, this is achieved by applying before the analysis step, with r ∈ R the inflation factor, typically with a value of 1.01-1.25. The latter problem, long-range spurious correlations, can be better visualized in Fig. 2.
In particle-based approaches, the covariance terms cannot be captured exactly. This may lead to the formation of small yet nonzero covariance terms between states and outputs which, in reality, are uncorrelated. This can lead to the drift of unobservable states, and eventually to instability of the KF. Increasing the number of particles is the most straightforward solution to this problem, but comes at a huge computational cost. A better alternative is "covariance localization", where physical knowledge of the states and measurements is used to steer the sample-based covariance matrices. Recall that in the surrogate model of Sect. 2, the model states are the velocity and pressure terms inside the wind farm at a physical location. Define that the ith state entry (x k ) i belongs to a physical location in the farm s i . Then, looking at an arbitrary state covariance term (i, j ), we define the physical distance between these two states as s i,j = ||s i − s j || 2 . Now, we introduce a weighting factor into our covariance matrices by multiplying physically distant states with a value close to 0, and multiplying physically nearby states with a value close to 1. A popular choice for such a weighting function is Gaspari-Cohn's fifth-order discretization of a Gaussian distribution (Gaspari and Cohn, 1999), given by with c i,j = || s i,j || 2 L a normalized distance measure, with L the cut-off distance. Applying Eq. (24) for the covariance matrices P z k|k−1 and P xz k|k−1 we can define the localization matrices  Figure 2. Long-range spurious correlations arise in the case where a covariance matrix is described by a small number of particles. Using physical knowledge of the system, these undesired correlations can be corrected. x is the localization matrix. Applying localization, the covariance of physically nearby states are multiplied with a value close to 1, and the covariance of physically distant states are multiplied with a value close to 0. In our example case, this results in the localized covariance matrix x • P x , where • is the element-wise product.
where c z i,j is the normalized distance between two measurements i and j , and c xz i,j is the normalized distance between state i and measurement j , respectively. Finally, localization and inflation can be incorporated into Eqs. (20) and (21) by where • is the element-wise product (Hadamard) of the two matrices. The improvement in terms of computational efficiency and estimation performance is displayed in Fig. 3. A significant increase in performance is shown, especially for smaller numbers of particles. This is in agreement with what was seen in previous work . Furthermore, performance is more consistent. Additionally, note that there is no increase in computational cost, as the covariance matrices are made sparse, leading to a cost reduction in the calculation of Eq. (10), which makes up for the extra operations of Eqs. (25) and (26). Also, note that the localization matrices are time-invariant and can be calculated offline.

Synthesizing an online model calibration solution
Certain model parameters such as s are closely related to the turbulence intensity, which vary over time. Estimation of such parameters is achieved by extending the state vector with (a subset of) the model parameters. In this work, s is concatenated to the state vector as random walk model, with a certain standard deviation (covariance). Higher values of s lead to more wake recovery, making the calibration solution adaptable to varying turbulence levels. This adds one scalar entry to x k , which is a negligible addition in terms of computational cost.
Furthermore, a proposal is made for the estimation of the free-stream wind speed U ∞ . This is suggested to be done using the turbine's power generation measurements, following the ideas of Gebraad et al. (2016) and Shapiro et al. (2017b). Using the wind vanes and employing a simple steady-state wake model from the literature (Mittelmeier et al., 2017), the turbines operating in free-stream flow can be distinguished from the ones operating in waked flow. Next, define ∈ Z ℵ as a vector specifying the upstream turbines, with ℵ the total number of turbines operating in free stream. Then, the instantaneous rotor-averaged flow speed at each turbine's hub can be estimated by inverting the turbine power expression from WFSim (Boersma et al., 2018). One wind-farm-wide free-stream wind speed U ∞ is then calculated using actuator disk theory. Smoothing results with a low-pass filter with time-constant c u ∞ on the average of U ∞ i for each upstream turbine i, we obtain where it is assumed that U ∞ i ≈ U r i 1 + 1 4 · C T i when γ i ≈ 0, with U r i the wind speed at the rotor of turbine i. Research is currently ongoing on how to best incorporate the effects of turbine yaw (γ = 0) into the definition of C T . Furthermore, ρ is the air density, A is the rotor swept area, and P meas. turb,i is the measured instantaneous power capture of turbine i 2 .
Combining these elements yields an efficient, modular, and accurate model calibration solution for WFSim. The model states are estimated using SCADA and/or lidar data, of which the former is readily available, and the latter becoming more popular. State estimation paired with parameter estimation improves the accuracy of the surrogate model, potentially leading to more accurate control. Additionally, the free-stream wind speed is estimated using readily available SCADA data. This control solution is implemented in MAT-LAB, and leverages the numerically efficient pre-compiled solvers and parallelization for model propagation. The EnKF is orders of magnitude faster than existing estimation algorithms due to covariance localization and inflation, while competing with the UKF in terms of accuracy.

Results
In this section, the calibration solution detailed in Sect. 3 will be validated using high-fidelity simulations. First, the model used to generate the validation data will be described in Sect. 4.1. Then, a two-turbine and a nine-turbine simulation case are presented in Sects. 4.2 and 4.3, respectively.
Note that for the presented results, pressure terms are ignored in the state vector, as they appeared unnecessary for state estimation in previous work . Furthermore, for simplicity and due to lack of information, the process and measurement noise will be assumed to be uncorrelated, S k = 0, and Q k and R k are assumed to be time-invariant and diagonal. Also, note that the simulations presented are not conclusive on the feasibility of the solution under all relevant conditions experienced in an operational wind farm. Rather, this work presents a first step towards algorithm validation.

SOWFA
High-fidelity simulation data are generated using the Simulator fOr Wind Farm Applications (SOWFA), developed by the National Renewable Energy Laboratory (NREL). SOWFA provides accurate flow data at a fraction of the cost of field tests. It solves the filtered, three-dimensional, unsteady, incompressible Navier-Stokes equations over a finite temporal and spatial mesh, accounting for the Coriolis and geostrophic forcing terms. SOWFA is a LES solver, meaning that largerscale dynamics are resolved directly, and turbulent structures smaller than the discretization are approximated using subgrid-scale models to suppress computational cost. . The turbine rotor is modeled using an actuator line representation (ALM) as derived from Sorensen and Shen (2002). SOWFA has previously been used for lower-fidelity model validation, controller testing, and to study the aerodynamics in wind farms (e.g., Fleming et al., , 2017aGebraad et al., 2017). The interested reader is referred to Churchfield et al. (2012) for a more indepth description of SOWFA and LES solvers in general.

Two-turbine simulation with turbulent inflow
In this section, a two-turbine wind farm is simulated to analyze the effect of different measurement sources, KF algorithms, and the difference between state-only and stateparameter estimation. This simple wind farm contains two NREL 5-MW baseline turbines with D = 126.4 m, separated 5D in stream-wise direction. This LES simulation was described in more detail in Annoni et al. (2016). Important simulation properties are listed in Table 1 for SOWFA and WF-  Sim. The effect of the turbulence intensity on the wake dynamics in SOWFA is captured in WFSim through its mixinglength turbulence model. In these simulations, WFSim is purposely initialized with a too low value for s in order to represent the realistic situation of a model mismatch. The remaining tuning parameters in WFSim were chosen such that a weighted-sum cost function of the power and flow errors was minimized. Firstly, the three KF variants will be compared in Sect. 4.2.1. Secondly, in Sect. 4.2.2, estimation using different information sources is compared. Thirdly, the potential of joint state-parameter estimation is displayed in Sect. 4.2.3.

A comparison of the KF variants for state estimation
In this simulation study, four estimation cases are compared: (1) the ExKF, (2) the UKF, (3) the EnKF, and (4) the openloop (OL) simulation, i.e., without estimation. The focus here is on state-only estimation, thus excluding s . Flow measurements downstream of each turbine are assumed (e.g., using lidar), their locations denoted as red dots in Fig. 4, which is about 2 % of the full to-be-estimated state space. These measurements are artificially disturbed by zero-mean white noise with σ = 0.10 m s −1 . The KF settings are listed in Tables 2 and 3. The KF covariance matrices were obtained through an iterative tuning process in previous work (Doeke-Wind Energ. Sci., 3, 749-765, 2018 www.wind-energ-sci.net/3/749/2018/ Table 2. Covariance settings for the KF variants, with I q the R q × q identity matrix. The full cov. matrices are diagonal concatenations of the entries. For example, P 0 is diag P 0,u , P 0,v and diag P 0,u , P 0,v , P 0, s for state-only and state-parameter estimation, respectively.

Variable Symbol Units Value
Init. state error cov. of u k P 0,u (m s −1 ) 2 1.0 × 10 −1 · I N u Init. state error cov.  Figure 4 shows state (flow field) estimation of the three KF variants for two time instants, t = 300 s and t = 700 s. In this figure, ( u) q ∈ R N u is defined as the absolute error between the estimated and true longitudinal flow velocities in the field. Looking at Fig. 4, the OL estimations are accurate for the unwaked and single-waked flow, yet are lacking in the situation of two overlapping wakes, for which the KFs correct. There is no significant difference in accuracy between the different KF variants, yet they differ by 2 orders of magnitude in computational cost (Table 3).

A comparison of sensor configurations
Previous results (Doekemeijer et al., 2016 have relied on flow measurements for state estimation. However, in existing wind farms, such measurements are typically not available. Rather, readily available SCADA data should be used for the purpose of model calibration. For this reason, state Table 3. Choice of tuning parameters for the KF variants, for both the two-turbine and nine-turbine simulation case. Note that the ExKF does not support power measurements nor parameter estimation due to the lack of linearization, and does not have any additional tuning parameters. In terms of computational cost: simulations were run on a single node using 8 cores in parallel.   Tables 2 and 3. In Fig. 5 it can be seen that SCADA data allows comparable performance compared to the use of flow measurements, making the proposed closed-loop control solution feasible for implementation in existing wind farms, without the need for additional equipment. Furthermore, this modular framework allows for the use of a combination of lidar systems, measurement towers, and/or SCADA data -whichever is availablefor model calibration.

Joint state-parameter estimation
Forecasting, as used in predictive control, benefits from the calibration of model parameters in addition to the states. Joint state-parameter estimation using flow measurements downstream of each turbine (as shown in the rightmost plots in Fig. 5) disturbed by zero-mean white noise with σ = 0.10 m s −1 for the EnKF and UKF is displayed in Fig. 6. The KF settings are shown in Tables 2 and 3. At t = 0 s, both the OL and the KF simulations start with the same (wrong) value for s . Then, every second, (noisy) measurements are fed into the KFs, and the state vector as well as the model parameter s is estimated. However, for the OL simulation, no measurements are fed in: the state vector is simply updated with the nominal model, and the value for s remains the same throughout the simulation. Now, after 600 s (left plot in Fig. 6) and 900 s (right plot in Fig. 6), a forecast is started, meaning no measurements are available after that time. At that moment, the OL model still has the same (poor) value for s as at t = 0 s, while the value for s in the KFs has improved. From Fig. 6, it becomes clear that the estimates are not only improved for the 3 min forecast, but are also consistently better than the non-calibrated (open-loop) model's 10 min forecast due to the estimation of s 3 . Furthermore, the EnKF performs comparably to the UKF at a lower computational cost. Note that the EnKF even outperforms the UKF in this simulation, expected to be due to randomness in the EnKF.

Nine-turbine simulation with turbulent inflow
In this section, we investigate the performance of the EnKFbased model calibration solution under a more realistic nineturbine wind farm scenario. The purpose of this case study is to highlight the need for state-parameter estimation for accurate wind farm modeling. The wind farm contains nine NREL 5-MW baseline turbines, oriented in a three by three layout, separated 5D and 3D in stream-and crosswise directions, respectively. The turbines start with a 30 • yaw misalignment, but are then aligned with the mean wind direction within the first 30 s of simulation. The turbine layout With power measurements only, the EnKF is able to estimate these parameters successfully in addition to the model states.  Fig. 8. This LES simulation has been used before in the literature, and is described in more detail in Boersma et al. (2018). A number of important simulation properties are listed in Table 4 for SOWFA and WFSim, respectively. Compared to the two-turbine case, N has increased by a factor of 4. In the UKF, this would result in the same factor of additional particles. Thus, not only is each particle more expensive to calculate but there are also more particles. Rather, in the EnKF, the approach is heuristic. None of the EnKF settings needed to be changed for good performance compared to Sect. 4.2, as displayed in Tables 2 and 3. As shown in Table 3, the EnKF has a low computational cost of 1.2 s per iteration (8 cores, parallel). In this case study, both the complete model state (flow field), the turbulence model parameter s , and the free-stream flow speed U ∞ are estimated in real-time using exclusively (readily available) power measurements from the turbines. The EnKF and one of the open-loop simulations (OL) will deliberately be initialized with poor values for s and U ∞ to investigate convergence. The other OL simulation will be initialized with a poor value for s but a correct value for U ∞ for comparison.
In Fig. 7, it can be seen that the EnKF is successful in estimating U ∞ and s after 300 s using only wind turbine power measurements. Furthermore, the flow fields of SOWFA, of the OL simulation with U ∞ = 9.0 m s −1 , and of the EnKF at various time instants are displayed in Fig. 8. From this figure, it can be seen that the EnKF has large errors at the start of the simulation. However, after 10 s, the error in flow states surrounding each turbine significantly decreases through the use of turbine power measurements. This estimated flow then propagates downstream, "clearing up" the errors in the vicinity of the wind turbines. As time further propagates, the freestream estimation improves, and finally the estimation error converges.
The power forecasting performance is shown in Fig. 9 and Table 5. As also seen in Fig. 7, the EnKF converges after 300 s, and indeed the power forecasts outperform those of the OL simulation at t = 300 s. Furthermore, it is interesting to see that the filtered power estimates of the first row of turbines (i = 1, 2, 3) starts low at t = 1 s, but converges to the true power at t ≈ 200 s. This can be related to the mismatch in U ∞ , which takes approximately 300 s to converge to the true value of 12 m s −1 , as seen in Fig. 7. The oscillatory behavior in both the OL and EnKF power predictions is due to the absence of rotor inertia in the rotor model, turbulent structures in the flow, and large fluctuations on the excitation signal C T .
Finally, the forecasts for flow at times t = 300 s and t = 600 s are examined in Fig. 10. The large flow estimation mismatch in the EnKF at t < 250 s quickly reduces and for t ≥ 250 s the EnKF estimation is consistently better than both the OL cases. This has to do with the convergence of the model parameters s and U ∞ , and the estimation of the states surrounding the turbines using the power measurements. The variables U ∞ and s are incorrectly initialized in both the OL and the EnKF. In the EnKF, U ∞ and s are estimated in addition to the states, using only turbine power measurements. The EnKF quickly converges for the states, and more slowly for s and U ∞ . After 300 s, the EnKF has converged to a negligible estimation error.
A crucial remark with the simulations presented here is that low-frequency changes in the atmosphere are neglected. In a real wind farm, atmospheric properties such as the mean wind direction and turbulence intensity change continuously, and this will impact the estimation and forecasting performance. The EnKF uses an assumption of persistence for the atmospheric properties at the time of forecasting, and thus a change in mean wind direction may invalidate the model forecast. In future work, the algorithm presented here should be tested under high-fidelity simulations with such realistic low-frequency changes. This would provide insight into the potential of the work at hand, and advance towards a practical wind farm implementation.

Conclusion
This paper presented a real-time model calibration algorithm for the dynamic wind farm model "WFSim", relying on an ensemble Kalman filter (EnKF) at its core. The joint stateparameter calibration solution was tested in two high-fidelity simulation case studies. Exclusively using SCADA measurements, which are readily available in current wind farms, the adaptability to model discrepancies in a nine-turbine wind farm simulation was shown, at a low computational cost of 1.2 s per time step on an eight-core CPU. Specifically, the free-stream wind speed and turbulence intensity were shown to converge to their optimal values within 300 s. Furthermore, the EnKF was shown to perform comparably in terms of accuracy to the state-of-the-art algorithms in the lit-  Figure 9. Comparison of power forecasting using the EnKF with measurements available up until time t = 600 s. After convergence U ∞ (as seen as a positive power slope for the first row of turbines), s is also calibrated. After convergence, forecasting is better than in open-loop. Oscillatory behavior is still present due to an oscillatory input signal (C T ), turbulent flow field, and the absence of inertia in the rotor model. erature, at a computational cost of multiple orders of magnitude lower. Additionally, estimation using flow measurements from lidar was compared to estimation using SCADA data, and it was shown that SCADA data can effectively be used for real-time model calibration. In future work, the algorithm presented here should be tested under high-fidelity simulations with realistic low-frequency changes. This would provide insight into the potential of the work at hand, and advance towards a practical wind farm implementation. This work presented an essential building block for real-time closed-loop wind farm control using surrogate dynamic wind farm models.