Automatic controller tuning using a zeroth-order optimization algorithm
- Department of Electrical, Computer & Energy Engineering, University of Colorado Boulder, Boulder, CO 80309 USA
Correspondence: Daniel S. Zalkind (firstname.lastname@example.org)
We develop an automated controller tuning procedure for wind turbines that uses the results of nonlinear, aeroelastic simulations to arrive at an optimal solution. Using a zeroth-order optimization algorithm, simulations using controllers with randomly generated parameters are used to estimate the gradient and converge to an optimal set of those parameters. We use kriging to visualize the design space and estimate the uncertainty, providing a level of confidence in the result.
The procedure is applied to three problems in wind turbine control. First, the below-rated torque control is optimized for power capture. Next, the parameters of a proportional–integral blade pitch controller are optimized to minimize structural loads with a constraint on the maximum generator speed; the procedure is tested on rotors from 40 to 400 m in diameter and compared with the results of a grid search optimization. Finally, we present an algorithm that uses a series of parameter optimizations to tune the lookup table for the minimum pitch setting of the above-rated pitch controller, considering peak loads and power capture. Using experience gained from the applications, we present a generalized design procedure and guidelines for implementing similar automated controller tuning tasks.
In this article, we present a data-driven, simulation-based optimization procedure for tuning wind turbine controllers using measures that are directly related to component design. Controller tuning influences the power capture and structural loading on wind turbines, which are directly related to the cost of the wind energy generated. At the same time, different turbine models require different control parameters. As rotor designs are iterated upon and also customized, e.g., with larger towers, tip extensions, or for site-specific turbulence, an updated (and ideally optimized) controller is required for component design and cost specification. Given the aeroelastic turbine model, the algorithm presented in this article automatically finds the optimized parameters of the predefined control architecture, reducing the effort required of the control designer.
The wind turbine control tuning can be automated, but design choices for the various parameters often require expert knowledge of the controller and turbine operation. An automated procedure to determine these choices could reduce the design cycle time of a manufacturer's research and development process or aid researchers in other disciplines of wind engineering that require a well-tuned controller without worrying about its finer details. Several control parameters are directly related to the performance of the turbine and must be tuned for each design iteration or model update. The simplest method to determine these design choices using simulation results is to exhaustively search the design space and then make an educated design choice of the parameter. However, exhaustive search may become computationally intractable for fine discretizations of the search space; on the other hand, coarse discretizations may lead to suboptimal design choices.
A systematic, simulation-based parameter search of the pitch control gains for generator speed control was first published in Hand and Balas (2000). On a single turbine, turbulent simulations were used to sample and visualize the design space against competing design measures: generator speed regulation versus blade pitch actuation. A similar data processing flow was used in Hansen et al. (2005) while the problem was formulated in a numerical optimization framework for structural load reduction; the authors concluded that a good initial guess was only marginally worse than the optimized result and that the effort required to set up the optimization procedure was not profitable for the benefit in structural loading. Shortly thereafter, an adaptive control framework was found to be beneficial for reconciling plant–model mismatches in field testing, especially for control parameters that affect power production (Johnson et al., 2006), where even small benefits are profitable to the operator.
As the wind industry has matured and computational cost has decreased, wind turbine design increasingly relies on simulation of power capture and structural loads for design analysis. As a result, system engineering tools for wind turbine design have been developed and refined, leading to updated efforts in automated controller development, with the aim of deploying tuning methods for many different turbines. One approach is to use a model-based control scheme in order to limit the control tuning effort (Bottasso et al., 2012). Model-based pitch (Hansen et al., 2005) and torque (Johnson et al., 2006) controllers usually result in functioning controllers but require rules of thumb to determine the closed-loop characteristics and can be inaccurate when there are uncertainties in the model.
A scalar cost function was presented in Tibaldi et al. (2014) using measures that are directly related to wind turbine component design, like peak and fatigue loads. The cost function included terms for each turbine part, with factors to capture its relative cost to the turbine. Using measures directly related to the component design, like fatigue and extreme loads in turbulent simulations, is ideal because it most accurately reflects the eventual component design, but these simulations require more detailed and computationally expensive methods to generate the measures.
Simulation-based optimization has been used to solve these problems, where solving for the value of the cost function is expensive compared to the optimization procedure. One approach to solving these types of problems is using response surface methodology (RSM) (Fu, 2014). RSM was originally developed for experimental design (Box and Wilson, 1951) but has increasingly been used with simulation information; it works by fitting a cost function to the simulation results, finding the local gradient of the fit, and optimizing the fitted cost function. The question of how to sample the parameter space remains an open question. Samples can be generated using a grid search or random sampling. An example in the wind energy community by Moustakis et al. (2019) samples the parameter space based on a cost function that considers both where the cost is expected to be optimal and also where it is unknown. Our approach to sampling the parameter space is based on stochastic approximation or “zeroth-order optimization,” which uses the sampled cost function to estimate the local gradient and then optimizes the function with proven convergence results (Ghadimi and Lan, 2013). Other optimization algorithms require an analytical model; the proposed method relies on functional evaluations (e.g., simulation data) and does not require a model to compute gradient information. In one of the original stochastic approximations methods (Kiefer and Wolfowitz, 1952), each dimension of the decision variable is perturbed and a finite-difference method is used to approximate the gradient. Multipoint methods were developed for higher-dimensional cases, where the decision variable can be perturbed in a direction containing multiple dimensions and the directional derivative is used to estimate the gradient (Duchi et al., 2015). If multiple directions are randomly sampled using a normal Gaussian distribution and then averaged to find the directional derivative, it is known as Gaussian smoothing, which has been shown to improve convergence rates (Hajinezhad et al., 2017).
We use a Gaussian smoothing approach to generate samples, estimate the gradient, and identify a (possibly local) minimum point. Then we use the samples to visualize the design space and provide a level of confidence in the result. Previous work in controller optimization usually only provides the cost function and goals of the optimization, whereas this work explicitly details the method for determining the sample simulations and how their results are used to iterate on control designs.
Instead of using cost functions directly related to overall wind turbine performance, our work solves specific wind turbine control problems that are related to the cost of energy. First, the optimization procedure is demonstrated on below-rated torque control to increase power capture. Next, the pitch control parameters for above-rated pitch control are optimized to reduce fatigue or extreme loads on the tower or blades with a maximum generator speed constraint. Finally, the minimum pitch setting of the pitch controller is optimized in a series of parameter optimizations aimed at reducing peak blade loads.
This article is organized as follows. The optimization algorithm and visualization method are presented in Sect. 2. Applications of the algorithm for wind turbine control tuning are presented in Sect. 3, followed by a generalization of the design procedure and guidelines for parameter selection in Sect. 4. Conclusions are presented in Sect. 5, and the generalized wind turbine controller that is tuned in Sect. 3 is described in Appendix A.
Superscript notation will be used to index the stage r of the zeroth-order optimization algorithm: e.g., zr. Additionally, FT is the transpose of F. If the power of any value is computed, the base will appear in parentheses: e.g., (γ)r.
The zeroth-order optimization algorithm uses J random samples near the current iteration to estimate a gradient. Then, a typical first-order method ensues: using the estimated gradient, the descent direction and step size are chosen to produce the next iterate. The process is repeated for a number of stages Nstage until convergence is observed. Using the cost function samples, an estimation of the design space is generated to visually verify the results. In each control tuning example presented in this article, the following unconstrained optimization problem is solved:
where is the M-dimensional parameter or vector of control parameters, constrained to be in a set 𝒳 that is convex and compact; is the cost function, which we assume is differentiable, bounded from below, and its gradient is Lipschitz (Ghadimi and Lan, 2013). However, we only have access to the cost function via samples of potentially noisy simulation results.
2.1 Generating samples
The algorithm begins with an initial guess z1. During each stage , the gradient is estimated by randomly sampling the design space. During each stage, sample directions are parameters drawn from a random distribution. In the most general case, a standard normal distribution is used for each dimension and the vectors are normalized to have a unit magnitude; this results in a uniformly random distribution of directions in M-dimensions (Hajinezhad et al., 2017). In this article, we focus on one- and two-dimensional parameter optimizations and will make changes to the generation of the sample directions to ensure an even distribution for a small number of samples.
A search sample in stage r is generated according to
where μ is a smoothing parameter that determines the amount of space over which the parameter space is searched. A large value for μ helps to estimate the value of the cost function over a larger area (Sect. 2.6), but smaller values of μ tend to result in more accurate convergence.
The number of stages and samples per stage must also be chosen by the designer. A large number of samples per stage gives the best estimate for the gradient but requires more simulations. During the development of this work, it was found that a smaller number of samples per stage and more stages resulted in better convergence using the same total number of simulations (e.g., in Sect. 3.2.2).
2.2 Gradient estimation
At each stage r, the cost 𝒞(z) is computed via simulation at each search sample and used to estimate the gradient
Note that Eq. (3) differs from a finite-difference method of estimating the gradient, where the factor would be in the denominator. Because there is uncertainty expected in the computed cost, small perturbations () and a nonsmooth cost function 𝒞(z) could result in noisy gradients. The gradient estimator in Eq. (3) is referred to as the random direction gradient estimator (Fu, 2014, p. 110) and maintains the convergence criterion when used in a first-order algorithm (Hajinezhad et al., 2017).
2.3 Determine descent direction
From the estimated gradient, the possible descent direction is computed:
where a diagonal matrix D of positive scalars is used to relatively increase dr in the directions where the sensitivities of the cost function to parameter changes are smallest, providing a diagonal approximation to Newton's method and improving convergence rates (Bertsekas, 1999), which leads to the following stage gain:
where α is the step size and finds the closest point within the parameter bounds 𝒳, offset with ρ=μ so that search samples in the next stage can be generated within the parameter bounds. The algorithm described in Eqs. (2)–(5) is proven to converge to a ball centered around an optimal solution (Hajinezhad et al., 2017). Next, we describe two adjustments to the original algorithm that improve performance when used in the control tuning applications presented in this article.
2.4 Adjustment 1: decreasing step size and line search
A decreasing step size rule ensures convergence and a line search is used so that the cost function does not increase in successive iterations. After choosing a base step size α0, the cost of test samples
are evaluated (through simulation) along the descent direction, where the step size
decreases for a number of iterations k≤kmax. An upper limit kmax on the number of step size samples is chosen to cap the number of simulations that may be performed along directions that could increase the cost function. To ensure that the cost function is nonincreasing during each iteration, the Armijo rule for step size is used (Boyd and Vandenberghe, 2004):
where, for all examples in the following section, σ=0.05 is chosen, a conservative value that only requires a small decrease in the cost function.
2.5 Adjustment 2: resetting parameter update
Once an adequate step size is found, the parameter z is updated using
If the maximum number of step size simulations (k=kmax) is performed and the step size rule in Eq. (8) is not satisfied, the next iteration of the parameter z is chosen as
where C(r) is the enumeration of the cost function at all stage samples zr, all search samples , and all step size samples within the parameter bounds defined by 𝒳, up until the current stage r:
where has elements, , , and . As before, with the step size sample, since the value of the cost at this point has already been computed, it is unnecessary to compute it again for r>1. Since the resetting parameter update in Eq. (10) results in a sequence of 𝒞(zr) that is nonincreasing, the convergence properties of the original algorithm are maintained; the same argument applies to Adjustment 1 in Sect. 2.4. The solution zsoln of the zeroth-order optimization is determined by the updated parameter of the final stage:
where the number of stages Nstage is determined before running the algorithm. A typical stopping condition involves checking whether the norm of the gradient is less than a given threshold or dictating a budget on the number of simulations that are to be performed. We investigate the performance versus Nstage in Sect. 3.2.2.
To provide confidence in the result of the zeroth-order optimization, we visualize the cost, and the measures associated with it, over the parameter space. If the minimum of the zeroth-order parameter optimization matches that of the visualization, the user can be confident in the result. The visualization method also provides a quantitative measure of the uncertainty of the estimated cost over the parameter space.
To estimate the cost and its variance over the parameter space, we use ordinary kriging. Kriging was originally developed for mining applications, where sparsely sampled information over a geographical space was used to estimate the quantity over the whole area. More recent applications of kriging include engineering design and computer experiments.
Kriging, or Gaussian process regression, is a method of interpolation that incorporates uncertainty in the area between samples. Using all the observed data from the zeroth-order parameter search at stage r, C(r) from Eq. (11), ensuring there are no repeated values, the estimated cost at z is
where the first term in Eq. (13) is the generalized least-squares estimate
Since we are using ordinary kriging, which assumes a constant mean across the parameter space, the regression basis function
where Z is the enumeration of all nsamp sample points like in Eq. (11). The correlation matrix Ψ represents the influence that nearby samples have on each other; it has the form
and is made up of scalar Gaussian correlation functions
where νi is the distance at which the influence is e−1 or 37 % in the ith dimension (Martin and Simpson, 2008). The second term in Eq. (13) interpolates or “pulls” the estimate towards the observed values using the correlation vector
The mean squared error, or variance, of the cost at z is determined by
is the process variance. As the unobserved point z moves away from the observed samples, the second term in Eq. (19) approaches zero and the variance approaches .
The correlation function parameters νi are estimated using a maximum likelihood estimator to be consistent with the observed data. To perform this optimization, we use the ooDACE toolbox to fit the correlation function and kriging model (Couckuyt et al., 2013). Problems can arise when using kriging for simulation-based optimization because of ill-conditioned correlation matrices (Booker et al., 1999). When samples cluster near the optimal solution, closely spaced samples with different values can result in very small values of νi and ill-conditioned correlation matrices. One solution is to add a constant to the diagonal of Ψ (Sasena, 2002). We implement this using “stochastic kriging”, where the samples are assumed to have uncertainty and is equivalent to adding their variance to the diagonal of Ψ (Couckuyt et al., 2013). Additionally, the lower and upper bounds on the values of νi depend on the minimum and maximum spacing of the distance between samples, respectively (Martin and Simpson, 2008).
2.7 Settling function
To measure the number of stages the optimization procedure requires to find the minimum of the cost function, we define the settling function
which is a linear transformation that represents the fraction of change in cost at each stage 𝒞(zr), compared to the overall change in cost function. The initial cost 𝒞(z1) is mapped to s(1)=1 and the cost of the solution 𝒞(zsoln) is mapped to . Often, we perform more stages than is necessary and use this settling function to determine how many stages are required to achieve some percentage of the change in cost function.
In this section, we present three examples of using zeroth-order parameter optimization to tune the parameters of wind turbine controllers. As an initial demonstration, we optimize a one-dimensional parameter to maximize power capture through torque control in below-rated operation. Next, we present the motivating example for this work, a two-dimensional parameter optimization for a standard pitch controller, with the goal of regulating generator speed so that loads are minimized, subject to a constraint on the maximum generator speed. Finally, we demonstrate how a series of one-dimensional parameter optimizations can be used to determine the minimum pitch setting of the pitch controller for controlling peak blade loads.
3.1 Optimal torque control gain
In below-rated (region II) operation, the generator torque is typically controlled using , which controls the rotor speed to its optimal tip speed ratio, where ωg is the generator speed. The optimal gain kopt depends on a number of aerodynamic properties (Johnson et al., 2006):
where ρair is the air density, R is the rotor radius, CP,max is the maximum power coefficient, λopt is the optimal tip speed ratio, and G is the gearbox ratio. We add a multiplicative factor to account for uncertainties in the aerodynamic properties and to allow the gain to be increased or decreased, resulting in the control law
In practice, a value other than kfact=1 is found to be optimal for a realistic turbulent wind input.
The goal of this optimization procedure is to find the gain kfact that results in the greatest energy capture. To maintain the form of a minimization problem, we solve
where the cost function is the negative of the weighted average mean generator power
using the average generator power of a simulation with mean wind speed u, and p(u) is the Weibull wind speed distribution. The optimization parameter z=kfact; the Weibull shape and scale parameters are 2.17 and 10.3, respectively; and we used m s−1 to span the below-rated wind speeds.
At each stage, J=2 samples are simulated to compute the cost function and estimate the gradient. In this one-dimensional problem, no dimensional scaling is required; thus D=1. With J=2, we set the search direction to
which simplifies Eq. (3) to a centered finite-difference approximation of the gradient for this one-dimensional application.
The optimal value of kfact is expected to be between 0.3 and 1.7, so these values are set as hard bounds. A search range of μ=0.05 is set to adequately search the space and estimate meaningful gradients. For tuning controllers of turbines with different power ratings, the base step size is scaled with the inverse of the initial simulation's average power . Larger power values result in larger gradients; since the scale of the parameter is constant for all rotors (it should ideally be 1), the step size should be reduced to maintain the same rate of descent. Note that a positive step size is required, even though the cost is negative, and a maximum of three step size sample simulations are performed (kmax=3). A summary of the parameters used in the torque control parameter optimization is shown in Table 1, and an illustration of the first iteration is shown in Fig. 1.
The parameter optimization was performed on the NREL-5MW reference turbine with the standard lookup-table-based torque controller in Jonkman et al. (2009). The algorithm finds close to the optimal kfact in five stages and realizes diminishing returns thereafter (Fig. 2b). The full procedure, with seven stages, performs 81 simulations in total, which includes step size samples and the initial guess. If only a single simulation at 8 m s−1 is used and the controller is exclusively in region II, we find a lower optimal kfact than is shown in Fig. 2a (Zalkind et al., 2020). By including other wind speeds and the transition region, as shown in Fig. 2c, d, and e, the optimal kfact is nearly 1. The use of turbulent simulations contributes noise to the signal that determines the cost function, which is apparent by the nonsmooth behavior of the cost samples with respect to the gain factor parameter in Fig. 2a. However, the algorithm appears to be robust to these uncertainties.
3.2 Pitch control for generator speed regulation
In this section, we optimize the parameters of an above-rated blade pitch controller for load reduction and generator speed regulation. Each time a new rotor is designed, the pitch controller should be tuned so that the structural loads can be computed to design the various hardware components of the wind turbine. As will be seen, the pitch controller affects the loads that drive turbine design. The procedure for tuning the gain-scheduled proportional–integral (PI) controller is detailed in Appendix A. First, steady-state simulations at above-rated wind speeds are used to determine the turbine operating points and aerodynamic parameters at various pitch angles, which parameterizes the gain scheduling. The final, and most involved, step is to tune the natural frequency (ωreg) and damping ratio (ζreg) of the “regulator mode,” which represents the generator speed response to a disturbance (wind) input. The following optimization procedure aims to find an optimal set of parameters (ωreg,ζreg) so that structural loads are minimized and adequate generator regulation is maintained.
In general, changing the bandwidth of the pitch controller via ωreg alters the structural loading of various components, which we denote generically with M in the following. In this section, we use M to denote tower fatigue or peak blade loading, though any load could be used that results in a feasible optimization problem; the control and hardware designers must determine what loads are important to the overall turbine design. However, controllers with lower natural frequencies allow greater generator speed transients, which is acceptable up to some maximum constraint. If the generator speed exceeds some threshold ωg,hard, most turbines enter into a shutdown procedure to avoid further damage, which reduces the availability of the turbine and the net annual energy production; this must be avoided.
First, we reformulate the constrained optimization
where B(z) is a boundary function that penalizes samples that have a maximum generator speed that exceeds some “soft” generator speed constraint ωg,soft,
A quadratic boundary function is used so that the cost is differentiable, even when a nonfeasible solution is sampled. The factor kB is chosen to provide a sufficient penalty on high generator speeds but not so high that exceedingly large gradients are determined from the gradient estimation in Eq. (3), which can be problematic for the algorithm.
ensures that the barrier function B(z)=cmax when the maximum generator speed equals the hard generator speed constraint ωg,hard.
To adapt the cost function to different rotors and load measures,
where M(z1) is the load measure of the initial stage sample and the factor is based on experience gained using the algorithm with simulation results. A smaller factor does not penalize maximum generator speeds enough, leading to possibly infeasible solutions that violate Eq. (28). Factors greater than were found to create large gradients that lead the iterates away from the constraint boundary; typically, the optimal solution is found close to that boundary.
* In Sect. 3.2.2, we compare the performance of using different numbers of stages Nstage and samples per stage J.
In most cases, the initial parameter set z1 is chosen to be near values that were tuned manually but offset (usually with a higher natural frequency) to allow the algorithm to converge properly. If the parameters were not previously tuned, the values suggested in the NREL-5MW reference manual (Jonkman et al., 2009) are chosen as the initial parameter set. A summary of the parameters used to tune all the rotors in this study is presented in Table 2. The algorithm is tested using different numbers of stages Nstage and samples per stage J in Sect. 3.2.2. The best results were achieved using a quasi-deterministic search direction,
is used to evenly space the samples in the two dimensions (ωreg,ζreg), and ψ0 is randomly generated according to , resulting in the generated samples in Fig. 3.
The cost function is more sensitive to changes in ωreg than it is to changes in ζreg, so was chosen to relatively increase the search direction in the ζreg dimension. Hard bounds on (ωreg,ζreg) are chosen to avoid unstable parameter sets. The base step size α0 scales with the inverse of the initial load M(z1) so that the algorithm works for turbine models of different sizes, with initial loads specified in Table 3.
The algorithm was tested on a range of rotor models with different wind classes and load measures. First, the pitch control parameters of the NREL-5MW reference model are optimized, starting from the parameters specified by the NREL-5MW reference manual (Jonkman et al., 2009), and using the tower base moment (fore–aft) damage equivalent load (DEL) as the load measure. For three different wind classes (1A, 1B, and 1C), with different turbulence levels (A – highest and C – lowest), the parameters were optimized, and an example is shown in Fig. 4. In Fig. 5, the estimated cost (a), load (c), and maximum generator speed (e) across the parameter space is shown, along with the estimated uncertainty in (b), (d), and (f). The lowest turbulence level (Class 1C) has the lowest optimal natural frequency ωreg, since the reduced turbulence results in lower generator speed transients. In each case of the NREL-5MW reference model, the optimized parameters have a lower natural frequency and higher damping ratio than the original setting.
The optimization procedure was also performed for each rotor design in the Segmented Ultralight Morphing Rotor (SUMR) project (Loth et al., 2016); for these rotors, the design driving load case for blade design was the maximum blade root bending moment. In practice, the combined edgewise and flapwise load is used for design, but since the edgewise load is deterministic, we used the maximum flapwise load as the load measure for optimization, which is a good indicator of maximum combined loads. The SUMR rotor radii range in size from 22 to 240 m (Appendix Table B1), and the same optimization parameters (Table 2) were used for each optimization procedure, albeit with different initial conditions and loads, which adapt the cost function and step size accordingly. The optimization procedure generally settles on a lower natural frequency and higher damping ratio than the initial guesses (Table 3), which has the effect of producing the lowest control bandwidth (for reducing loads) but only to the point so that the generator speed constraint is not exceeded.
3.2.1 Discussion of low natural frequency, high damping ratio regulator mode
For most of the rotors in this section, the baseline rotor speed proportional–integral (PI) control parameters are optimized to have a regulator mode with a lower natural frequency and higher damping ratio than the initial guess (Table 3). To understand why this is the case, we must consider the cost function of the optimization. In this tuning procedure, our goal is to minimize structural loading with a constraint on the maximum generator speed. From Fig. 4c and d, we see that the collective blade pitch angle θc has a large effect on the thrust-based structural loading; this includes tower fatigue and blade peak loads. Figure 4 shows that, in most cases, pitch and loads mirror each other: when pitch increases, loads decrease, and vice versa. A good example occurs between 230 and 250 s in the time series of Fig. 4. The direct effect of the blade pitch signal on the load signal is the primary reason for the optimal PI gains (or regulator mode parameters) found in this section.
PI gains derived from a regulator mode with a low natural frequency result in less pitch actuation and thus less change in the load. Higher natural frequencies result in faster and more frequent pitch control variations, which translate to the structural load signals and increase fatigue loading. A controller with a high natural frequency can also be problematic when the wind speed decreases. Because the underlying controller is trying to regulate the generator speed, the pitch will decrease during a wind lull to maintain the generator speed at its rated value, which can also lead to large peak loads, especially when an increase in wind speed follows.
High damping ratios are also found to be optimal when using the described cost function. A generator speed response and pitch control response with a high damping ratio lacks any overshoot and secondary transients when the system is subjected to a disturbance (wind). Secondary transients and overshoot in the pitch command result in load transients. The original NREL-5MW controller (where ωreg=0.6 rad s−1, ζreg=0.7) has regulator mode poles at , indicative of a fast response with overshoot and transients in the pitch and generator speed signals. The optimized controller in Class 1A turbulence (with ωreg=0.10 rad s−1, ζreg=2.08) has two real poles at −0.40 and −0.025, which results in a fast initial pitch response and a slower secondary response.
When comparing the PI gains of the original versus optimized controller, we see that the proportional gains are of similar magnitudes, but the integral gain is much less in the optimized set of gains. The original NREL-5MW gains are s and , whereas the optimized gains (in Class 1A turbulence) are s and . These optimized gains reflect the cost function and control goal and environmental setting; we assume that special and fault cases would be handled by a supervisory controller. The optimized proportional gain is still large enough to mitigate generator speed transients, ensuring that the generator speed does not exceed the maximum threshold, while the integral gain is reduced because it causes transients in the blade pitch and structural loads. Since our primary goal is not to regulate the generator speed to some fixed set point but instead constrain its maximum value, integral control is less important. If the cost function included a term related to regulating the generator speed to its rated set point, using, e.g., mean squared error compared to the rated generator speed, the optimal integral gains might be larger. However, we believe that a controller that constrains extreme events and maximizes power capture better reflects the overall wind turbine design goals. Typical pitch controllers are designed to regulate the generator speed to some fixed rotor speed with a high enough bandwidth so that the maximum generator speed constraint is not violated. Instead of focusing on a quantity that measures how well the generator speed is regulated, we focus on whether or not the maximum generator speed constraint is violated. An initial investigation (Zalkind and Pao, 2019) of the loads on the other turbine components shows a reduction in blade and low-speed shaft fatigue and pitch actuation, but a more in-depth load investigation is left for future work.
3.2.2 Results: comparison with grid searching
To quantify the performance of the zeroth-order optimization (ZOO) for this pitch control application, we compare it, in terms of the number of simulations and optimal cost, with a grid search optimization for tuning the pitch controller of the NREL-5MW reference turbine in Class 1A turbulence. The same area spanned by the hard bounds of the zeroth-order method (Table 2) is sampled by a Ngrid×Ngrid grid, with . The cost, defined by Eqs. (29)–(32) and sampled using the Ngrid=10 grid search, is shown in the background of Fig. 6a–e. The parameter z with the minimum cost over all simulations in the search is the optimal parameter zopt. In practice, we would refine the search area and resample based on experience. However, different models may change the resampled area and would add a manual step that we can avoid when using the zeroth-order optimization procedure.
The ZOO procedure outlined in Sect. 2 is performed three times for each of the following cases. Each procedure uses randomly generated samples that should result in different optimal parameters for each instance. We use four different initial conditions, distributed so that one is in each of the four quadrants spanning the bounded parameter space. The starting location in each quadrant was generated randomly, except for the bottom, right quadrant in Fig. 6d, which is the suggested parameter set, , defined in the NREL-5MW reference manual (Jonkman et al., 2009). The ZOO was performed with Nstage=7 stages using samples per stage and also with Nstage=12 using J=3 samples per stage. Theoretical results suggest that better gradient estimates (from a larger number of samples per stage J) result in convergence within a ball with a smaller radius centered around an optimal solution (Hajinezhad et al., 2017). The optimal parameter set zopt found in each instance of the ZOO is shown in Fig. 6a–d. We compare the cost of the ZOO method with the grid search optimization in terms of the defined cost function in Eqs. (29)–(32) (Fig. 6f). The results are normalized to the cost function found using the suggested parameter set zsug.
Compared to zsug, all of the methods result in a 20 % to 26 % reduction in the cost function, with about a 1 % standard deviation in the results. For fewer than 80 simulations in this application, ZOO performs better than the grid search benchmark in almost every case. The optimal cost decreases with increasing J and total number of simulations on average, but not necessarily always. However, in terms of efficiency on a per-simulation basis, J=3 (blue in Fig. 6) achieves similar results to those found using J=10 (yellow in Fig. 6). Additional stages (Nstage=12) with J=3 (purple in Fig. 6) decrease the cost function further; the optimal cost of this case is the best we tested in terms of the number of simulations and cost reduction. By a small margin (1 %–2 %), using the initial parameter z1=zsug in the bottom, right quadrant in Fig. 6d performed better than the other z1 locations shown in Fig. 6a–c. In Fig. 6a–d, we see that if we use a z1 that is closer to the area where zopt is found, there is less variation in zopt; there is also less variation in the minimum cost 𝒞(zopt).
We should note that the comparison presented in this section applies only to this pitch control tuning application. To compare the efficacy of the zeroth-order optimization with a grid search more generally would require comparing functions of different complexities and dimensions, which is outside the scope of this article and we leave for future work.
3.3 Minimum pitch setting for peak load reduction
In this final example, a series of one-dimensional parameter optimizations will be used to tune the minimum pitch setting of the above-rated pitch controller described in Appendix A and shown in Fig. A1. Increasing the minimum pitch setting can reduce the peak blade and tower loads. However, it also slightly reduces power capture. To represent this trade-off, the cost function
will be minimized, quantifying the relative importance κ between reducing peak loads and reducing power capture, where z=θmin(u) is the minimum pitch setting at wind speed u, M is the maximum blade flapwise load (over all blades), and is the mean generator power of a turbulent simulation with mean wind speed u. A value of κ=0.01 is used, which represents a 10 % reduction in peak load being roughly equal to a 0.1 % decrease in power capture; this parameter can be tuned by the control designer based on the goals of the design; however, the feasibility of the optimization problem should be verified. In future work, a family of optimal minimum pitch control laws, using different values for κ, could be generated but would require more global wind turbine design information to determine the design choice.
During the load analysis of a control design, a number (Nseeds) of randomly generated turbulent seeds are used to simulate the turbine across wind speeds to identify peak loads on the various components. Often, peak loads on the blade and tower occur in situations where there is first a lull in the wind speed, which causes the pitch angle to decrease, followed by an increase in wind speed. If the pitch controller does not react in time, the combination of high wind speeds and low pitch angles causes a large thrust on the rotor. However, if the minimum allowable pitch is increased, the peak loads resulting from wind speed lulls can be reduced. An example is shown in Fig. 7c and d at 275 s. While these events are fairly common in simulation, not all produce equal peak loads; the minimum pitch setting is optimized for the worst-case simulation.
A wind speed estimate, which can be found using, e.g., one of the methods in Soltani et al. (2013), is used to determine the minimum pitch setting of the above-rated pitch controller. A smooth lookup table is generated using a cubic spline interpolation and a table of three minimum blade pitch settings with breakpoints at above-rated wind speeds, in addition to a breakpoint in below-rated wind speeds and one above the cut-out wind speed. The minimum pitch setting is nondecreasing with respect to wind speed. An example is shown in Fig. 8. The minimum pitch at each above-rated breakpoint will be optimized using the zeroth-order optimization procedure previously described.
The algorithm (presented in Algorithm 1) is initialized by choosing an initial lookup table for the minimum blade pitch, θmin(ui) in Table 4. In Step 1, Nseeds random seeds are simulated to find the worst-case seed nmax with the maximum load at that breakpoint . The optimization procedure in Step 2 is only performed if the current breakpoint has a problematic peak load: one that is greater than loads seen at the other mean wind speeds (line 4 of Algorithm 1). The starting loads are initialized to so that at least the first active breakpoint is optimized. At the low wind speed breakpoint u0=5 ms−1, the minimum pitch angle is set to the aerodynamically optimal angle θfine; and at the high wind speed breakpoint ms−1, the optimal minimum pitch angle is set to the feather pitch angle. Neither u0 nor is an actively optimized breakpoint; they are, however, used as lower and upper bounds for the first u1 and last active breakpoints.
In Step 2, the initial guess that is used by the optimization procedure (z1) is offset from the lower and upper bounds by the sample search area μ (lines 11–19 of Algorithm 1). Step 3 is optionally performed to recheck the other random turbulent seeds using the new, optimized minimum pitch lookup table. In some cases, a different random seed will have a peak load that exceeds that of the wind input that was originally optimized; if this is the case, Step 2 is repeated up to three times, using the previously optimized pitch angle as a lower bound.
Algorithm 1 is used to optimize the minimum pitch table in Fig. 8 using the parameters in Table 4 and the SUMR-13A wind turbine model. Six random turbulent seeds are initially simulated and only the 14 and 18 ms−1 breakpoints require optimization (the peak loads of the 24 ms−1 simulations are all less than the 14 and 18 ms−1 loads). Nstage=4 stages with J=2 samples per stage are used to optimize the 14 ms−1 breakpoint; the results of the procedure are depicted in Fig. 7.
3.4 Coupling between control optimizations and systems engineering considerations
Changing the minimum pitch setting of the controller can have an effect on the below-rated power production (optimized in Sect. 3.1) and peak and fatigue loads (optimized in Sect. 3.2). Though this coupling exists, in our experience, the effect on the optimized parameters is small.
In future work, a multiobjective optimization might be more suitable, where all the tuning procedures are simultaneously performed; the zeroth-order method is suitable in this case. A potential challenge would be determining what simulations should be used to efficiently optimize all of the control parameters. Currently, each control tuning procedure requires running different simulations. Additionally, the goal of this work was to automate design choices, rather than having to choose from a set of possible choices that would result from a multiobjective optimization.
However, with additional resources, our goal could shift from efficient optimization of smaller problems to larger optimizations of the overall turbine system. Within a system engineering framework, more information might determine which simulations and loads are sensitive to control parameter changes. In this article, we focused on minimizing the peak blade loads of the SUMR rotors because those were the design driving load of those blades. Other loads could certainly be used, but all loads are not important to the overall design: some components are overdesigned, and others drive design; this information depends on the specific design but could be determined using detailed system engineering tools.
Our goal was to reduce the design cycle times for processes that already occur during control design. Rather than solving all problems at once, we propose solving them in sequence, in the order they are presented: first optimizing the torque and pitch controllers and then tuning the minimum pitch setting for peak loads. Then with the new minimum pitch table, a designer could optionally reoptimize the torque and pitch gains; we have done this and witness little change. Solving smaller problems tends to be more efficient in terms of the number of simulations and more transparent in terms of how control parameters affect different performance measures during specific simulations. We discuss setting up similar optimization problems for future work in Sect. 4.
In this section, we present guidelines for performing similar optimization procedures. Experience gained in problem formulation, the usefulness of performing a preliminary offline analysis, and the determination of the parameters of the solver is shared.
4.1 Determine problem and goals
Using the zeroth-order optimization procedure described in this article for determining control parameters through simulation requires effort in setting up the problem and developing software. In order to justify the up-front effort, the task would ideally be one that is repeated for many different rotor models, like the examples in Sect. 3. A task that is repeatedly performed also allows the designer to gain a deeper understanding for how control inputs (gains, parameters) affect simulation outputs of the wind turbine.
It is important to determine how the turbine should be simulated in order to generate the measures that are used for the optimization; they should highlight some problematic or indicative case that the control solution is trying to solve. For example, when optimizing the torque control gain for below-rated operation in Sect. 3.1, a below-rated wind field should be used, and the power should be used as the cost function. The cost function should reflect the goals of wind turbine design (e.g., increasing power capture or decreasing loads), have a basis in the reality of wind turbine operation (e.g., using gains that provide a stable control input), and also have a feasible solution. The optimization procedure presented in this article is only useful if the cost function represents the design goal, is represented well by the simulation information, and is simulated in a realistic environment.
4.2 Perform preliminary analysis
It is often helpful to perform a preliminary offline analysis to fine-tune the cost function and optimization parameters. In an offline analysis, a grid search of the optimization parameter is used to estimate the output space of the simulations (e.g., maximum generator speed and blade loads), using a linear or quadratic estimate of the cost function. To clarify, the results in Sect. 3 are of online optimizations, where actual simulation data are used to compute the cost function and perform the optimization procedure. While one of the goals in developing this optimization procedure is to eliminate the large number of simulations associated with grid searching, a grid search does help fine-tune the parameters of future, similar control tuning procedures that use zeroth-order optimization. If multiple measures are used in the cost function (e.g., in the pitch control tuning of Sect. 3.2), it is important to determine whether the cost function has a minimum within the parameter bounds. Otherwise, the cost function must be further refined. A preliminary offline analysis can be used to more quickly determine the optimization parameters (e.g., step size or smoothing parameter) that converge in the fewest number of simulations to some ground truth determined from the estimated cost function based on the initial grid search.
4.3 Set simulation parameters
As the examples of Sect. 3 illustrate, each optimization procedure requires slightly different parameters. While the parameters presented in Tables 1, 2, and 4 may not necessarily be the best ones, they have been fine-tuned through extensive offline testing and evaluating online tests that use actual simulation data as the measures used in the cost function. The goal of this section is to provide general guidelines and rules of thumb, where possible, for choosing the parameters of the optimization procedure.
4.3.1 Sample search range and Newton's approximation
The smoothing parameter μ should be based on the optimization parameter z. The sample z+ϕμ should result in an adequate change to the cost function so that good gradients can be used for the descent algorithm; note that the magnitude of the direction . From the examples in Sect. 3, a different μ is required because the cost function of each application has a different magnitude and changes at different rates. Too large of a μ can result in samples that violate the hard bounds or gradients that do not represent the local gradient at the stage sample. On the other hand, a μ that is too small can result in noisy gradients, the result of possibly nonsmooth simulation information for samples that are close to each other.
When optimizing over multiple parameters, the D matrix is used to approximate Newton's method for optimization. D increases or decreases the descent direction dr, where the sensitivity of the cost function to that parameter (dimension) is small or large, respectively. Ideally, the matrix D incorporates second-order information to scale the gradient estimate in each dimension. In a true Newton method, where second-order information is available, , where H𝒞(z) is the Hessian of the cost function 𝒞 at the parameter z. To approximate Newton's method, we use , where
and z(i) is the ith element of the parameter set z. The elements Di of D can be determined from offline simulation analysis, where Eq. (37) can be estimated by finding a quadratic regression of the cost space. Alternatively, Di can be manually tuned, i.e., if the dimension i is not being adequately searched, Di should be increased.
For example, in the pitch controller tuning (Sect. 3.2), the cost function (shown in Fig. 3) is less sensitive to the damping ratio z(2)=ζreg than it is to the natural frequency z(1)=ωreg, so we use . If only the first-order (estimated) information were used and the direction of the maximum gradient were exactly followed, the solution would zigzag in the ωreg direction and take longer to converge to the optimal solution in both the ωreg and ζreg directions.
4.3.2 Step size
The initial step size α0 is an important parameter to test offline and also fine-tune when using online simulations to compute the gradient. It was found that for all optimization examples in this article, the product of the initial step size and the norm of the gradient should be on the order of a magnitude of 1, namely
The parameters used in the Armijo step size rule were the same for all examples. Conservative values were used, which essentially only ensures a nonincreasing cost function without a requirement on the rate of descent of the cost function.
4.3.3 Stages and samples per stage
Enough stages should be evaluated so that the cost function converges to some value; this is typically learned through offline analysis or by trial and error in online tests. For example, when analyzing the pitch control tuning results of Fig. 4, the results suggest that the procedure could be performed with fewer stages, whereas it seems more stages could be used in the minimum pitch control tuning of Fig. 7. In general, it is found that fewer samples per stage (along with more stages) result in the fastest convergence with respect to the total number of simulations.
4.3.4 Parameter bounds and initial guess
Hard constraints on the parameter should reflect the set of feasible parameters for the control task being optimized. However, the bounds should not be so small as to restrict the space and possibly miss nonobvious control solutions. The initial guess provided to the algorithm should also allow for the space to be adequately searched.
4.4 Perform optimization and evaluate visualization
After performing initial, offline analysis and running the zeroth-order optimization algorithm using online simulation data, the whole procedure should be evaluated with the following questions:
Does the algorithm converge to a feasible solution?
Does the optimized parameter appear to be near the minimum of the visualized cost over the parameter space?
An affirmative answer to both of these questions should provide confidence in the optimized result.
In this article, we developed a data-driven approach for optimizing controller parameters using simulation results. By using a zeroth-order optimization algorithm, random samples are generated near an initial guess, which are used to compute the local gradient. A standard gradient descent method ensues, where a step size rule is used to ensure convergence and attempt to decrease in the cost function before the next guess is chosen and the process is repeated. We also use ordinary kriging to visualize the design space and its uncertainty to provide a level of confidence in the optimized result.
The zeroth-order algorithm was applied to three different applications in wind turbine control. To demonstrate the process on a one-dimensional parameter optimization, the torque control gain was tuned to optimize power capture in below-rated operation. The baseline pitch controller parameters were tuned in a two-dimensional optimization problem with the goal of minimizing structural loads and include a constraint on the maximum generator speed. Using an adaptable cost function and step size, the algorithm was able to tune the baseline rotor speed control for rotors ranging from 40 to 400 m in diameter. We compare the results, in terms of accuracy, convergence, and number of function evaluations (simulations) for different optimization parameters and against the standard grid search method. In a series of one-dimensional parameter optimizations, we also determined the settings of a lookup table for the minimum pitch limit of the pitch controller, reflecting the overall blade design process and system-level goals.
Since each optimization procedure depends on the specific control problem, we have provided a set of guidelines based on the experience gained during this study for developing future, similar optimization procedures. The methods presented in this article automate a usually manual process, reduce designer effort, and require fewer simulations compared with grid searching methods. These methods can be used for repeated control tuning processes that are required for continually updating designs that must be evaluated in simulation using a well-functioning controller.
The pitch controller described in Sect. 3.2 and 3.3 is based on the controller presented in the NREL-5MW reference manual (Jonkman et al., 2009); this standard control scheme is widely used as a reference for comparing control schemes and evaluating different aspects of turbine design. As shown in Fig. A1, the controller is a gain-scheduled proportional–integral (PI) controller with constant torque above rated. The PI control architecture allows the generator speed dynamics to be represented as a second-order system. Since the sensitivity of aerodynamic torque to blade pitch changes with the blade pitch, the PI gains are scheduled on the blade pitch. The pitch command is saturated to some minimum setting to control power or reduce blade loads; thus, an antiwindup scheme is necessary.
A1 Regulator mode and PI gains
To derive the PI gains for a generic rotor model, a rigid model of the drivetrain is used:
where ωg is the generator speed; Jtot is the total drivetrain inertia; including the rotor and generator components; G is the gearbox ratio between the low-speed rotor shaft and the high-speed generator shaft; τa is the aerodynamic rotor torque caused by the wind and controlled via blade pitch; and τg is the generator torque, which is a control input. The rotor torque is nonlinearly dependent on the blade pitch θ. The linearization with respect to a perturbation in blade pitch δθ is
where the differential torque δτg=0 because the torque is constant in above-rated operation. The sensitivity of the aerodynamic torque to rotor speed is omitted since it has a much smaller magnitude than . In terms of power P,
where ωrat is the rated generator speed, which is a constant operating point since it is the desired set point of the controller. The proportional–integral control is
where kP and kI are the proportional and integral control gains, respectively, and δωg represents a generator speed perturbation.
which can be represented by a second-order dynamic system in the form of
where Mreg, Dreg, and Kreg are the mass, damping, and stiffness of the regulator mode, respectively. Alternatively, the regulator mode can be represented by its natural frequency ωreg and damping ratio ζreg, defined by
By defining the desired properties of the generator speed dynamics, ωreg and ζreg, the proportional and integral gains are defined as follows:
A1.1 Power–pitch sensitivity and gain scheduling
which we will define as S(θ) because it is a function of the blade pitch. Simulations in FAST are used to determine the pitch operating points at various above-rated wind speeds. The operating points are used in FAST linearizations, with all the degrees of freedom disabled, providing the input–output sensitivity from pitch to power. The results of performing this sensitivity analysis for several rotors are shown in Fig. A2.
Because of the near-linear relationship with blade pitch θ, the sensitivity can be parameterized by
where S(0) is the sensitivity at θ=0∘ and θk is the pitch angle at which the sensitivity doubles:
and the final, gain-scheduled PI gains:
Figure A1 depicts the implementation in block diagram form.
A1.2 Summary of pitch control tuning procedure
To derive the parameters from simulations and tune the regulator mode, we use the following procedure:
Simulate the operating points in FAST using a steady wind input across above-rated wind speeds. Choose a S(0) so that the PI gains produce a stable result. Simulate for enough time for the values to reach steady state and record the blade pitch at each wind speed.
Linearize the turbine in FAST, disabling all of the degrees of freedom, at the wind speeds and pitch angles found from the previous step. Use the element of the input–output matrix that corresponds to the pitch input and power output matrix to determine the sensitivity of power to pitch at the various pitch angle operating points. Plot the values and fit the parameters S(0) and θk as in Fig. A2.
Tune the regulator mode (ωreg,ζreg) using the desired design measure. Usually, larger natural frequencies (ωreg) result in better generator regulation but also higher structural loads. A grid search could be used or an optimization procedure like the one in Sect. 3.2.
The data from this study can be made available upon request.
DSZ developed the optimization software, developed the example problems, and prepared the visualizations and manuscript. ED reviewed the zeroth-order optimization theory. LYP guided the study, helped formulate the article concept, and reviewed multiple drafts of the article.
The authors declare no competing interests.
The views and opinions of the authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
Support from a Palmer Endowed Chair Professorship is also gratefully acknowledged.
This research has been supported by the Advanced Research Projects Agency-Energy (grant no. DE-AR0000667).
This paper was edited by Katherine Dykes and reviewed by Alan Wai Hou Lio and Luca Sartori.
Bertsekas, D.: Nonlinear Programming, Athena Scientific, Belmont, Mass., 1999. a
Booker, A. J., Dennis, J. E., Frank, P. D., Serafini, D. B., Torczon, V., and Trosset, M. W.: A rigorous framework for optimization of expensive functions by surrogates, Struct. Optimization, 17, 1–13, https://doi.org/10.1007/BF01197708, 1999. a
Box, G. E. P. and Wilson, K. B.: On the experimental attainment of optimum conditions, J. Roy. Stat. Soc. B Met., 13, 1–45, 1951. a
Boyd, S. and Vandenberghe, L.: Convex Optimization, Cambridge University Press, New York, NY, USA, 2004. a
Couckuyt, I., Dhaene, T., and Demeester, P.: ooDACE – a Matlab Kriging toolbox: getting started, Tech. rep., Ghent University, available at: http://www.sumo.intec.ugent.be/ooDACE_download (last access: 16 October 2020), 2013. a, b
Duchi, J. C., Jordan, M. I., Wainwright, M. J., and Wibisono, A.: Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations, IEEE T. Inform. Theory, 61, 2788–2806, https://doi.org/10.1109/TIT.2015.2409256, 2015. a
Hansen, M. H., Hansen, A. D., Larsen, T. J., Øye, S., Sørensen, P., and Fuglsang, P.: Control design for a pitch-regulated, variable speed wind turbine, Tech. Rep. 1500(EN), Risoe-R, available at: http://orbit.dtu.dk/files/7710881/ris_r_1500.pdf (last access: 16 October 2020), 2005. a, b
Johnson, K. E., Pao, L. Y., Balas, M. J., and Fingersh, L. J.: Control of variable-speed wind turbines: standard and adaptive techniques for maximizing energy capture, IEEE Contr. Syst. Mag., 26, 70–81, https://doi.org/10.1109/MCS.2006.1636311, 2006. a, b, c
Jonkman, J., Butterfield, S., Musial, W., and Scott, G.: Definition of a 5-MW reference wind turbine for offshore system development, Tech. Rep. NREL/TP-500-38060, National Renewable Energy Laboratory, available at: https://www.nrel.gov/docs/fy09osti/38060.pdf (last access: 16 October 2020), 2009. a, b, c, d, e
Loth, E., Griffith, D. T., Johnson, K. E., Pao, L. Y., Schreck, S., and Selig, M. S.: SUMR – Segmented Ultralight Morphing Rotor, available at: https://sumrwind.com/ (last access: 16 October 2020), 2016. a
Moustakis, N., Mulders, S. P., Kober, J., and van Wingerden, J.-W.: A Practical Bayesian Optimization Approach for the Optimal Estimation of the Rotor Effective Wind Speed, in: American Control Conference, Philadelphia, PA, USA, 4179–4185, available at: http://www.jenskober.de/publications/Moustakis2019ACC.pdf, 2019. a
Sasena, M. J.: Flexibility and efficiency enhancements for constrained global design optimization with Kriging approximations, PhD thesis, available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.4697 (last access: 16 October 2020), 2002. a
Soltani, M. N., Knudsen, T., Svenstrup, M., Wisniewski, R., Brath, P., Ortega, R., and Johnson, K.: Estimation of rotor effective wind speed: A comparison, IEEE T. Contr. Syst. T., 21, 1155–1167, https://doi.org/10.1109/TCST.2013.2260751, 2013. a
Zalkind, D. S., Dall'Anese, E., and Pao, L. Y.: Automatic controller tuning using a zeroth-order optimization algorithm, Wind Energ. Sci. Discuss., https://doi.org/10.5194/wes-2020-63, in review, 2020. a