Reply to reviewers

Unstable atmospheric conditions are often observed during the daytime over land and for significant periods offshore, and are hence relevant for wake studies. A simple k-ε RANS turbulence model for simulation of wind turbine wakes in the unstable surface layer is presented, which is based on Monin-Obukhov similarity theory (MOST). The turbulence model parametrizes buoyant production of turbulent kinetic energy (TKE) without the use of an active temperature equation, and flow balance is ensured throughout the domain by modifications of the turbulence transport equations. Large eddy simulations and 5 experimental data from the literature are used for validation of the model.

The time for a wind turbine wake RANS simulation (∼ 1 wallclock minute at ∼ 50 CPU's) is contemporary, see Appendix A in our article. While we don't simulate any LES ourselves in this paper, our experience for recent similar LES runs in EllipSys3D (not published) is that LES is still on the order of 10 3 more expensive than RANS in terms of CPU-hours as was also reported by van der Laan et al. (2015). The question of computational costs is naturally heavily dependent on the CFD code, turbulence model, mesh, CPU, user experience, etc., so we would like to stress that 10 3 is a rough estimate.
3. Sec 2, line 66: the comparison in terms of computational ressources needed and return time between LES and RANS is well described and objective. Is a comparison with the engineering models would also be relevant here?
We do not think of an engineering model as a "simulation tool", which is why we only discussed the cost of RANS and LES in Section 2 (section 2.1 is called "simulation setup"). If an engineering wake model is programmed in an efficient manner, it typically takes on the order of seconds or milliseconds to execute on a regular laptop, so there is essentially no computational cost.
4. Sec 2.1, line 94: the reason to not take into account more realistic inflow profiles covering the whole ABL is understandable. But what would be the implications or consequences in terms of physics (compatibility of the proposed model for example) or numerics (impact computation time for example)?
This is a topic we have also been considering, and wish to pursue in the future; hopefully we can give a good answer with regards to the implication of more complex inflow profiles in a future article. A range of more realistic inflow profiles from the "ABLp" and "ABLc" models are discussed by van der Laan et al. (2021a), but they are only available for neutral/stable conditions. 5. Sec 2.1, line 96: the choice of K and Cu values may be justified The κ and C µ values have been moved to Table 2.1 and a reference to Sørensen (1995) is added.
6. Sec 2.1, line 116 + Fig. 2: the comments on the eddy viscosity decomposition are poor and may be expanded We agree and have re-written the section. The message we were trying to send with the decomposition and figure is that the faster wake recovery in typical unstable conditions is not only caused by increased velocity scale (i.e. TI), but also due to an increased turbulent length scale.
7. Sec 2.2, line 124: the advantage of the Joukowsky-AD compared to airfoil-AD is not very clear. Is it just interesting because it uses few input parameters?
Yes, exactly. The airfoil-AD requires information about chord-distribution, twist-distribution, airfoildistribution and airfoil data (lift-and drag-coefficients for a wide range of angle of attacks) for each airfoil type. The Joukowsky-AD requires none of these, but still predicts similar results for the force distributions on the disk, Appendix B.
8. Sec 2.3, line 143 + Fig. 3.: the vertical mesh stretching chosen coupled to the large domain should imply a large number of mesh cells out of the wake zone, i.e. the interesting area. This is a classical drawback of stretched structured grids. Can you give the number of cells into the wake region (the one of Fig. 3.) over the total number of elements and comment?
The number of cells in the z-direction between 0 < z/D < 3 is n z = 70 (not written in the article), while one can calculate n x = lx ∆x = 160 and n y = ly ∆x = 40. The total amount of cells was given above Fig. 3 in the paper N total = 2.1 million, hence the requested ratio is: nxnynz N total ≈ 0.22. It is therefore correct that the majority of the cells are outside of the region of interest, which certainly is drawback in terms of computational cost (this has been added to the paper). The drawback of not using a large domain is artificial tunnel-blockage and streamwise developing inflow profiles (even with the S k modification of the k-equation there will be some initial development). Some studies, e.g. van der Laan et al. (2021a), use even larger domains of size 100 km, while others only simulate in the wake domain, and in this paper we choose a compromise of 10 km.
to an overestimation of the wake recovery ; the k-ε-f P corrects this behaviour by using f P < 1 in regions with large velocity gradients leading to 0 < ν t /ν tref < 0.4 (it does not go exactly to 0, but to some value in this range according to the colorbar) in the two aforementioned regions. 11. Sec 3.1, Fig. 7: the comments on Fig. 7 are poor. More discussions can be added on wake recovry via velocity field, on shear parameters or turbulent time.
Agree, we have added discussion/motivation of the shear parameter. The turbulence time scale is removed, since it is not of use for the current discussion.
12. Sec 3.2, Fig. 8 & 9: Same remark Fig. 8 has been simplified and the small discussion for Fig. 8 and 9 has been re-written.
13. Sec 4., line 243: the choice of CB and CR values is unclear. As CR depends on CB, how can CR=4.5 be fixed ?
The value of C R is not fixed to 4.5, we simply write that this is the value of C R in the neutral limit, i.e. when B/ε = 0. As written around the text of eq. 28, C B is a new parameter to be calibrated and we choose/recommend C B = 5 from an assessment of the validation cases in Sec. 4. We agree that a more rigorous statistical approach (e.g. similar to the calibration procedure by van der Laan et al. (2015)) would be better, but the quality and variety of our reference data is in our opinion not feasible for such an analysis. It is therefore only a "loose" recommendation of C B = 5, which we also try to underline in the conclusion of the paper.
One can both argue there should be less cases (to simplify the paper) and that there should be more cases (the more cases, the more fair validation). The V80-Abkar and V80-Keck cases both simulate the V80 turbine, but using different numerical codes and inflow conditions, so in our opinion their are both valuable sources for validation. There are only two cases with TI (V80-Abkar and V80-Keck), so we should be careful about making general conclusions from these, although the 2017 model indeed seems to predict near-wake TI better than the cstB model. As also mentioned in the V80-Abkar case description, RANS models typically overestimate TI, which is also seen for the cstB model. The 2017 model has a slower wake recovery in terms of velocity deficit, which would be consistent with also having less turbulence development in the near-wake region; the wrong behaviour of the 2017 model with regards to velocity deficit therefore seems to counteract the general TI problem with RANS models. We prioritize predicting a correct velocity deficit, since this metric is used for AEP calculations. We have removed C µ and κ from the text; the constants are only given in Table 1 now.

Reviewer 2
The present paper deals with the modelling of wind turbine wakes in the unstable atmospheric boundary layer using a RANS approach. Two models are proposed: the first one aims at accounting the buoyant production of TKE without relying on a temperature equation. The second model improves the so-called k-epsilon-fp RANS turbulence model, based on observed discrepancies and inconsistencies against experimental data and higher-fidelity simulations, i.e. LES. The model is globally clear and very well written.
I would like to start this review with a general discussion about the proposed approach. It is my understanding that in an unstable ABL, the faster wake recovery (with respect to neutral conditions) that is observed can be attributed to the large levels of wake meandering that smear out the wake. The meandering itself is due to a large amount of lateral (y-wise) turbulence intensity in the ABL. In other words, it seems difficult to me to neglect the anisotropic nature of the ABL when dealing with wind turbine wakes. I think it is necessary to include a discussion on that topic in the paper and explain how the authors think they can deal with such anisotropic flows using a two-equation turbulence model, based on an isotropic turbulence assumtion. I understand the main objective is to propose an efficient, intermediate fidelity model (i.e. in between analytical and LES), but the necessary physics should be there and properly represented.
the freestream; this is the case for all turbulence models based on the Boussinesq hypothesis (Prandtl's mixing length, k-ε, k-ε-f P , k-ω, etc.), so the only way to include anisotropy is to abandon the Boussinesq hypothesis, e.g. use differential Reynolds stress models, algebraic Reynolds stress models, or machine-learning approaches. This is beyond the scope of this paper, but is certainly interesting to investigate further in the future (even the neutral ABL has anisotropic freestream turbulence according to Panofsky and Dutton (1984), Chapter 7). Abkar and Porté-Agel (2015) showed with LES data that there is meandering in both stable, neutral and unstable conditions; also there is a non-negligible vertical meandering in all conditions (approximately 60-70% of the lateral meandering at 5D, see their Figure 16). Our steady RANS simulation obviously don't capture the dynamics of the meandering, but it does predict larger turbulent scales for increasingly unstable conditions, which is the physical origin of the increased meandering according to Keck et al. (2014).
The authors introduce a new so-called "cstB" model to account for buoyant TKE production. Although the reasoning is clearly explained, there is, from my point of view, a major drawback in this paper: the model, that seems to make sense physically speaking, is never quantitatively validated, and thus none of the assumptions are properly justified. Only a brief qualitative discussion is provided. It is surprising, since two cited references (Zhang et al. 2013, Hancock andZhang 2015) contain quantitative data that could be used for validation purpose, if I am correct. Furthermore, a minimal validation/verification of the model consistency, that should be provided, is a comparison to the flux-gradient approach (section 3.1), with the integration of the temperature equation in the system. I guess this is feasible with Ellipsys and should be integrated for comparison/validation purposes.
The wind tunnel data of Zhang et al. (2013) or Hancock and Zhang (2015) could have been chosen for the validation procedure, however we preferred real-scale data (either LES or experiments) to avoid scaling effects and because artificial atmospheric profiles generated in wind tunnels are less reliable.
It is indeed possible to use an active transport equation for temperature and the flux-gradient relationship u ′ i θ ′ = − νt P rt ∂Θ ∂xi in EllipSys as the Reviewer suggests, but it will not be possible to obtain a steady-state solution, because heat will continuosly be added to the domain through the wall BC, while there is no sink of heat in the domain. One can then question, if our simple steady-solution of the unstable ABL is even relevant; we choose to interpret it as an ensemble average of many similar unstable conditions, which could be of value for estimation of AEP. For other usages, e.g. day-to-day forecasting, the cstB model can not be used.
More discussions and validation of the physical model assumptions will be undertaken in a planned TORQUE paper next year using new LES data (see last question in this reply).
It is my point of view that these drawbacks also apply to the presented modifications to the k-epsilon-fp model. Some improvements are introduced. These are mainly based on mathematical consistency (i.e. neutral ABL limit) but are not properly justified (no real physical explanations are provided). And, in the end, the proposed validation cases focus on "global" flow properties such as wake velocity deficit or TKE levels. It is my opinion that this paper would gain a lot by showing proper comparisons to LES simulations (or experimental data): one might be able to extract the dissipation rate, the eddy-viscosity (and then estimate fp through (16)), or other quantities that would help to asses the pertinence of the choices that are made. See P.E. Réthoré PhD Thesis as a typical example.
Modification 1 comes directly from the paper of Apsley and Leschziner (1998), while we agree that the arguments for modification 2 are more hand-waving. It is simply a choice to make the C R parameter flowdependent. In the future we wish to explore more advanced turbulence models, e.g. the explicit algebraic Reynolds stress models, to avoid such ad-hoc assumptions. As stated in the previous question, there is a LES/RANS comparison study on its way.

About the presented results
• the TKE levels appear to be overall over-estimated, while the velocity profiles match well. Can the authors provide some analysis on this inconsistency?
• Only single-wake results are compared with LES and/or experiment, as indicated in the paper title. However, it seems that some of the presented LES are based on multi-turbine simulations... Which may rise some doubt about the capacity of the model to deal with wake superposition. Does it perform well in such cases, as for the single wake cases?
The 2017 model with f P (but without the modifications of this paper) was used by van der Laan et al.
(2021a) to simulate a row of ten aligned turbines and it performed poorly for unstable conditions, i.e. it had significantly slower wake recovery compared to a similar neutral case and no wake equilibrium seemed to be reached. Preliminary results for the same case run with the cstB model looks better, see Fig. 1, in the sense that it has faster wake recovery than the neutral reference case for the first 3-4 turbines, while the wake deficit is the same in the fully developed region of the wind farm.
The cstB model is a new approach, so our intention was to focus solely on single wake cases in this first paper, but we have now included this aligned row case in Appendix A3. We also agree that the model should be tested for more complicated scenarios in the future, i.e. wind turbine rows, full wind farms, AEP calculations, complex terrain, etc.  About the grid convergence study: wake TI and velocity profiles are extracted 1D behind the wind turbine. However, the presented validation results are taken at 3, 4, 5... up to 12D. And I have the feeling it is easier to converge at 1D than at 12D, since there are much less diffusion effect. What is the reason for this choice? I personally think it is necessary to include the profiles at, let's say, 6 and 12D behind the wind turbine.
As was stated in a small parenthesis in Appendix B, we already did the same grid convergence analysis at 5D downstream; for our simulations it showed that grid convergence is much more critical in the near-wake than in the far-wake, possibly because of the large gradients in the former region, which is why we showed the plots at 1D downstream.
One last general remark: can this model be adapted to stable conditions? And if already done, how does it perform? I guess it is the authors objective to end-up with a model that is valid for all the classical ABL stability a wind turbine may encounter.
It is relatively straight-forward to use this model in stable conditions (one just has to use other expressions for S k and C ε3 , see van der Laan et al. (2017)), but its physical validity and practical value are questionable: The height of a nightime/stable ABL is on the order of 100-200 m. The cstB model is only physical valid in the ASL, where h ASL /h ABL ≈ 0.1, hence it would strictly only be valid in the first 10-20 m above the ground. Probably a more viable route for stable conditions, would be the so-called ABLc and ABLp models (van der Laan et al. (2020), van der Laan et al. (2021b)), which can model stable conditions and be used throughout the whole ABL including the capping inversion.
In the end, both models seem to lead to improve previous modelling approaches, and it sounds like a step forward is achieved in this paper. Thus, I strongly encourage the authors to provide more justification and adapted validation to their work. This will surely provide some confidence in the proposed approaches, although I am aware having high quality data for such unstable cases is not that easy. Most probably running and comparing the RANS approach with an LES simulation would help.
We agree that more detailed LES data could help to justify and validate the model; the LES module of EllipSys is currently not optimally suited for atmospheric flows with stratification, but we have teamed up with external partners, who are capable of running such simulations. A TORQUE paper on this matter is planned for next year.
• Appendix A3 with an aligned row case.
RANS modelling of a single wind turbine wake in the unstable surface layer Abstract. Unstable atmospheric conditions are often observed during the daytime over land and for significant periods offshore, and are hence relevant for wake studies. A simple k-ε RANS turbulence model for simulation of wind turbine wakes in the unstable surface layer is presented, which is based on Monin-Obukhov similarity theory (MOST). The turbulence model parametrizes buoyant production of turbulent kinetic energy (TKE) without the use of an active temperature equation, and flow balance is ensured throughout the domain by modifications of the turbulence transport equations. Large eddy simulations and 5 experimental data from the literature are used for validation of the model.

Introduction
Wind turbine wakes have been studied for decades using many different methodologies, including wind tunnels, field experiments, analytical engineering models, and numerical simulations. A review of these methodologies is given by Porté-Agel et al.
(2020) and it is noteworthy that many of the references therein are from the past decade. The motivation for many of these new 10 studies is the large number of new wind farms emerging each year, where wake effects significantly impact the Annual Energy Production (AEP), as well as wind farm lifetime through increased fatigue.
A sub-category of "numerical simulations" is the Reynolds-Averaged Navier Stokes (RANS) approach, which is a Computational Fluid Dynamics (CFD) method that solves for the mean fields. This means that no time history of the flow is obtained, however the computational resources required for RANS are very small compared to higher-fidelity CFD methods, making 15 RANS an attractive option for parametric studies or for isolating various physical effects (c.f. van der Laan et al., 2021). The wind turbine forces are commonly represented as Actuator Disks (AD) in RANS; several types of AD models are reviewed by van der Laan et al. (2015a). Compared to engineering models, an advantage of RANS is that physical features of the flow (e.g. induction zones, wake interaction, shear layersand ground effects , :::::: ground :::::: effects :::: and :::: flow ::: over :::::::: complex ::::::::: geometries) are solved for directly, rather than being prescribed through empirical relations. Disadvantages are that fatigue loading can not be 20 determined due to the steady nature of the method, and that the solution relies heavily on the turbulence model.
The part of the atmosphere closest to the ground, i.e. the atmospheric surface layer (ASL), can be parametrized with the similarity theory of Monin and Obukhov (1954) (MOST) and used as inflow for RANS simulations of wind turbine wakes. The k-ε turbulence model is usually preferred in RANS wake studies and Crespo et al. (1985) for example utilized this (although in a parabolized RANS setup, which requires less computational resources, but is less accurate) to simulate a single wake 25 1 in stable, neutral and unstable conditions. The wake was found to recover faster (i.e. approach the freestream velocity at a shorter downstream distance) in the unstable ASL and slower (i.e. approach the freestream velocity at a longer downstream distance) in the stable ASL compared to in a neutral ASL, and this was later confirmed in field experiments by Magnusson and Smedman (1994) and full RANS simulations, including the temperature equation, by Alinot and Masson (2002). Rados et al. (2009) added a parametrized buoyancy term to the k-ε-equations based on the MOST expressions, eliminating the need for a 30 temperature equation. The "indirect method" of Rados et al. was shown by El-Askary et al. (2017) to produce similar wake deficit and turbulence intensity (TI) profiles as the "direct" method that employs a temperature equation.
In all the RANS studies discussed thus far, the simulations suffer from a known imbalance in the kand ε-equations; this means that the freestream velocity and turbulence profiles vary horizontally throughout the domain, so that different wake results will be obtained depending on the streamwise position of the simulated turbine. van der Laan et al. (2017)  consensus and explain it with the increased TI encountered in unstable conditions due to buoyant production of turbulence.
Nevertheless, both Alinot and Masson (2002) and Keck et al. (2014) show that a faster wake recovery in unstable conditions is still observed, even when the reference TI is kept fixed (by changing the roughness length); they argue that the enhanced wake recovery must be caused by the increased turbulent length scale associated with the unstable conditions, because the turbulent velocity scale is approximately constant for fixed reference TI and wind speed.

50
The balanced k-ε MOST model by van der Laan et al. (2017) can be combined with the f P -correction, which was originally formulated by Apsley and Castro (1997) and later used by van der Laan (2014) to circumvent the over-diffusiveness of the k-ε model in the wake region. However, the f P -limiter was derived and calibrated for a neutral ASL, where it has been applied in many cases with success, but for non-neutral conditions it has yielded unphysical behaviour, especially in unstable cases (van der Laan et al., 2021). Modifications to the MOST k-ε-f P equations in the unstable regime are therefore suggested in this 55 paper and validated against various field experiments and LES's.
After continuous development since then, it is now a highly scalable code, which can be run in parallel on large high performance clusters (HPC) via Message Passing Interface (MPI). Thus a typical RANS simulation of a single wake only takes a 60 few minutes to simulate on a contemporary HPC, while a similar case with LES would take several hours to run, even with an order of magnitude more computer resources available. In terms of CPU-hours, van der Laan et al. (2015b) estimated LES to be approximately 10 3 times more expensive than RANS and this estimate may even be considered conservative, because the LES inflow was created using a Mann-model turbulence box, and not with the more expensive precursor method. Furthermore, several LES runs are in principle necessary in order to create an ensemble average, which multiplies the cost of LES. This clearly motivates the development of the RANS model as a fast, albeit less accurate, alternative to LES.
The different components of the RANS simulations will be discussed in the following sections.

Inflow profile for unstable ASL
Numerous articles have been written about MOST and a historical review is given by Foken (2006). The theory is expressed and applied via the dimensionless stability parameter where z is the height above ground and L is the Obukhov length. Negative ζ corresponds to unstable conditions, while ζ = 0 corresponds to the neutral limit where there is no effect of buoyancy. Neutral conditions are typically defined as |L| −1 0.002 m −1 (e.g. Gryning et al., 2007) and tend to occur most often, with observed distributions of the stability (1/L or ζ) having a peak around zero (Kelly and Gryning, 2010). The most common unstable Obukhov lengths occur at −0.02 m −1 75 L −1 −0.002 m −1 (Kelly and Gryning, 2010); but offshore, there tends to be a bias towards more unstable conditions, i.e. more negative L −1 compared to onshore (Sathe et al., 2013). Various parametrizations have been suggested for wind speed, k, and ε in terms of ζ; in this paper we use the widely accepted forms of Dyer (1974) for U (namely the Ψ m and Φ m functions), and those found in Kaimal and Finnigan (1994) for ε and k (see van der Laan et al., 2017, for more details). Under unstable conditions these are: where and (7) Φ ε = 1 − ζ.
The above relations are valid for −2 ζ < 0, so for a fixed L < 0, it means that the equations are in principle only valid up to z ≈ −2L, where the free convection regime starts (i.e. buoyant production dominates over shear production of TKE).
Although the blade tip of a modern turbine can reach beyond −2L in unstable conditions (e.g. for z tip = 200 m this happens when L −1 −0.01 m −1 , which is not rare), we nevertheless still choose to apply the profiles-and in fact use them all the way up to the top boundary. More realistic inflow profiles for RANS covering the whole Atmospheric Boundary Layer (ABL) are indeed a current research topic, but will not be discussed further in this paper. Maronga and Reuder (2017) reason that MOST is a "pragmatic solution," because the parameters needed for more realistic inflow profiles are often not available.

Wind Turbine representation
The main advantage of the Joukowsky-AD over the widely used "airfoil-AD" (e.g. Sørensen and Kock (1995), Porté-Agel et al. (2011) and van der Laan et al. (2015a)) is that only a few parameters are necessary: The thrust coefficient C T , tip-speed 130 ratio λ, rotor radius R and freestream reference wind speed U ref (in addition to these, the power coefficient C P is also made an input parameter in our implementation as described by van der Laan et al. (2020)). Nevertheless, it is still able to model non-axissymmetric force distributions and wake rotation, similar to the disk loading of an airfoil-AD. Porté-Agel et al. (2011) argued that these are important features to capture the correct wake behaviour in the near-wake, while van der Laan et al.
(2015c) showed that they are only of minor importance for the far-wake. Wake deficit and rotor loading of the Joukowsky-AD 135 have been found to compare well with several validation cases conducted by Sørensen et al. (2020) and Sørensen and Andersen (2020). This is verified to also be the case for our RANS simulations in Appendix A2.
No nacelle nor tower are included in our simulations, which have been shown to be a good approximation for > 3D downstream of the turbine according to Kasmi and Masson (2008) and Li et al. (2020).

2.3 RANS
A homogeneous, flat lower surface is assumed for all cases in this paper. The inner part of the mesh surrounding the AD is called the "wake domain" and is shown for a typical case in Fig. 3. In this area, a horizontal resolution of ∆x = ∆y = D/10 is used (based on the grid study in Appendix A1), while grid stretching is used in the vertical direction with ∆z = z 0 at the first cell and ∆z = D/10 at the cell at z/D = 3. The wake domain is however only a small part of the full domain: The full domain extends an additional x in = 5 km to the west, north, east and south, respectively, while the top of the grid is at z/D = 25. Grid 145 stretching is used in all directions outside of the wake domain to circumvent an excessively large number of cells (the case shown in Fig. 3 has ::::::::: ≈ 0.45 · 10 6 ::::: cells :: in ::: the :::: wake ::::::: domain ::: and ≈ 2.1·10 6 cells in total). The choice of having grids with size on the order of ∼ 10 km is made to avoid tunnel-like blockage effects, and to have fully developed inflow profiles at the turbine position : , ::::: while ::: the :::::::: drawback :: is ::: that ::: the :::::::: majority :: of ::: the :::: cells ::: are ::::::: actually :::::: outside ::: the ::::: region ::: of ::::::: interest, :: i.e. ::: the ::::: wake ::::::: domain. publications (Michelsen, 1992;Sørensen, 1995;Sørensen et al., 2007), so only the main features are discussed here. The SIMPLE method is used with a modified Rhie-Chow algorithm, following Troldborg et al. (2015), to avoid the numerical wiggles induced by the discrete actuator disk body forces.
As mentioned in the introduction, the flow variables in an empty domain with MOST inflow can be kept in balance by modifying the kand ε-equations, as suggested by van der Laan et al. (2017): where 160

165
The source term, S k , and the C ε,3 parameter constitute the two modifications compared to the usual k-ε equations (similar corrections exist for the stable ASL, but are not discussed in this paper). Viscous terms have been neglected in the above equations, which is a good approximation in atmospheric flow applications due to the Reynolds number being very large (Wyngaard, 2010). The Coriolis force is also neglected, hence no veer is present in the simulations. Definitions of the f P -correction (which was in fact set to f P = 1 in the work of van der Laan et al. (2017)) and buoyant production, B, are deferred to the Section 3. The 170 parameters used in the above equations are summarized in Table 1. Finally, S k and C ε,3 differ slightly from those printed in van der Laan et al. (2017), with the only difference being that here we choose Φ h Φmσ θ → 1; this "modeller's choice" for turbulent Prandtl number (σ θ ) avoids the inconsistency mentioned in that paper, and makes the model independent of the temperature similarity function Φ h .
8 3 Modification of the k-ε-f P model in the unstable ASL

175
The background eddy viscosity shown in Fig. 1 is perturbed in the turbine presence and especially so in the wake region, see Fig. 4 for an example with neutral inflow. As explained in Section 2.1, ν t is very important for the wake development and the f P -correction effectively attenuates the ν t perturbation in the interface between the wake and freestream, known as the shear layer, and in the region around the AD to improve wake predictions. This attenuation can also be viewed as a modification of the turbulence scales, u t and t . x/D [-] (d) At z = z hub The cause of the ν t perturbation in the first place is the large velocity gradients across the AD and the shear layer, which enhances TKE shear production, but other terms of Eq. (11) are also highly active in these regions, and it is this complex interplay together with the f P -formulation that in the end determine the wake recovery. The effect of the buoyancy term in this interplay is discussed first, then afterwards the role of f P in the unstable ASL.

Buoyant production term 185
The buoyant production of TKE is B ≡ g θ0 w θ and the heat flux is typically obtained using a temperature equation and a flux-gradient relationship. In this work, we pursue an alternative way and investigate two simple parametrizations: The "2017 model" is the one utilized by van der Laan et al. (2017) Another deficiency of the unstable 2017 model is illustrated in Fig. 6: For a given I ref , it unphysically predicts slower wake 210 recovery than in neutral as also noted by van der Laan et al. (2021). This is remedied in the cstB model, where a slightly faster wake recovery is seen. It can be noted that in the near-and far-wake of the cstB model both B and S k terms are small compared to the other TKE terms, c.f. Fig 5, and as such it effectively resembles the neutral model, but with the one difference that it has a larger turbulent length scale, c.f. Fig. 2, which explains the faster wake recovery seen in Fig. 6. The B and S k terms must nevertheless still be retained to enforce the freestream balance of the kand ε-equations throughout the domain.

Turbulence closure with f P in non-neutral conditions
As stated in the introduction, k-ε models tend to predict faster wake recovery compared to experiments and LES. This can be corrected by using f P = 1 in the ν t definition, Eq. (16), which clearly affects the veloctiy deficit as shown in Fig. 4. The form 225 of f P used for wakes in the neutral ASL by van der Laan (2014), can be summarized as:
A more subtle modification arises recognizing that the f 0 parameter is also stability-dependent, i.e., 240 This is actually the form suggested by Apsley and Leschziner (1998), but they were not considering stability effects, i.e. no variation ofσ nor C R with stability; it has not been used in previous applications of the 2017 model. We note Eq. (27) is consistent with the neutral limit, sinceσ 2 → C −1 µ for ζ → 0. The Rotta constant was calibrated to C R = 4.5 for wind turbine wakes in the neutral ASL in the work of van der Laan (2014) and we therefore require C R → 4.5 in the neutral limit (ζ → 0).
When both modifications are used, faster wake recovery for a given I ref occurs in unstable conditions, as shown in Fig. 9; this was also the case , ::: i.e. :::::: similar ::::::::: behaviour :: as when f P was 'turned off' (set to 1 :: i.e. :::::: f P = 1), c.f. Fig. 6.  The cstB model with the f P -modifications described in the previous section is tested for the cases summarized in Table 2. Each case is simulated with a range of C B parameters, C B = {0.0, 5.0, 10.0}, while keeping C R = 4.5 fixed. The latter has been calibrated for a suite of neutral EllipSys3D LES's by van der Laan (2014), but in practice if it was calibrated with another LES code, a different, optimal C R might have been obtained. In the same way, it cannot be expected that a universally valid C B exists, when we compare with results from a range of different codes and experiments. Hence, no optimal C B will be obtained 260 in this section, but rather the qualitative effect of C B is shown.
The numerical setup for each case follows that described in Section 2.3, e.g. the cell size and extent of the wake region are scaled with the rotor diameter.

Case
Type

SWiFT case
A large wake benchmark study was conducted by Doubrawa et al. (2020) to compare various simulation methodologies and 265 codes against LIDAR measurements in different atmospheric conditions. The measurements were carried out for a Vestas V27 turbine at the Scaled Wind Technology Facility (SWiFT) in Lubbock, Texas, USA, which is an area of flat terrain.
The inflow parameters of the SWiFT row in Table 2 were obtained from the ensemble average of five 10 min-averages from a met. mast located 2.5D upstream of the turbine. Note, that the stability parameter was measured to ζ = −0.089 at z = 10 m, which at hub height corresponds to ζ ref = −0.29. Also, the streamwise turbulence intensity was measured at hub This conversion could actually be slightly different in the unstable ASL, because the ratios of velocity variance change with stability (e.g. Chougule et al., 2018), but unfortunately only the vertical velocity variance follows MOST, so that no general surface layer formula can be constructed Wyngaard, 2010). The operational state parameters {C T , P , Ω} were taken from the OpenFAST steady-state curves, which were supplied for the benchmark.

275
For the unstable SWiFT case, the wake profile was only measured at 3D downstream and the results are shown in Fig. 10.
Three different LES codes were used in the benchmark and the filled area in Fig. 10 represents the spread of the LES results.
It can be seen that all three LES's underpredict velocity deficit compared to the LIDAR measurements, which highlights the fundamental problem of comparing measurements with numerical models: Even highly computational expensive simulations do not always match experimental results. This must either be due to experimental errors in the provided input data or because 280 the idealizations used for the LES's are too simple to capture the wake behaviour.
RANS can generally not be expected to perform better than a well-performed LES and if it does, it is likely due to fortunate error cancellations. Therefore, from a theoretical point of view one could argue that the performance of RANS should mainly be assessed with regards to how well it matches the LES results. Both RANS and LES use many of the same idealizations (uniform roughness, flat terrain, homogeneous inflow, etc.) and indeed our RANS results in Fig. 10 are also closer to the LES 285 results, than to the experimental results.
For the SWiFT case : , the 2017 model seemingly performs better than the cstB model, but this is probably due to fortuitous model error and/or some unaccounted mesoscale effects. Model error was expected, because the neutral EllipSys3D RANS simulation in the SWiFT study (surprisingly) did not compare well with experimental results (Doubrawa et al., 2020), despite

NTK41 case
A Nordtank NTK41 500kW wind turbine was installed at what is now the Risø campus of the Technical University of Denmark in 1992, and was used for many research studies before its decommissioning in 2021. Among these studies, the "NTK41 testcase" of this paper is based on LIDAR measurments and LES's conducted by Machefaux et al. (2016). They used two 295 different models for their LES's; the results included in Fig. 11 (along with our RANS results) are from their more advanced model, which they called the "LES-ABL" or "extended model." The inflow parameters ("NTK41" row in Table 2) and LIDAR measurements were ensemble-averaged over 20 10-min averages.
The NTK41 turbine is a stall-regulated wind turbine and is therefore operated at constant rotational speed independent of the inflow wind speed (Hansen, 2015), in this case at Ω = 27.1 rpm. The thrust coefficient for the unstable case of Machefaux 300 et al. (2016) was measured with strain-gauges to be C T,meas = 0.71, while their LES gave C T,LES = 0.83. Looking up the thrust curve of the NTK41 turbine at U ref = 6.8 m/s also gives C T,curve = 0.83, so this will be used in the present RANS simulations. They argued that the lower thrust coefficient of their measurement could be explained by the large uncertainty of the strain gauges. Finally, the measured power was P meas = 120 kW, while P LES = 127 kW and P curve = 125 kW, where the latter will be used to set C P for the AD model of the RANS simulations. Figure 11 shows that the cstB model matches the LES and experimental data better than the 2017 model, although a still faster wake recovery is seen for both of these reference data. Compared to more conventional LES setups (e.g. V80-Abkar, V80-Keck and NREL5MW cases), the LES model used by Machefaux et al. (2016) is simplified by using a modified Mann box for its inflow, and a slip condition at the bottom wall; together these add some uncertainty to the LES results. The experimental wake data can also be expected to have large uncertainties and/or biases connected to them, but the sources and sizes of these 310 were not discussed in detail by Machefaux et al. (2016).
Note, total TI (lower row) is based on U ref and not the local velocity.

V80-Keck case
This case is based on a LES from Keck et al. (2014), where the SOWFA solver was used with a similar setup as in the work of Churchfield et al. (2012), i.e. using a precursor simulation for the inflow and modelling the turbine with Actuator Lines (AL).
More specifically the "unstable North Hoyle row A" case is considered here; it features four V80 turbines spaced 11D apart, using the inflow parameters described in the "V80-Keck" row of Table 2. Wake data is available at x/D = {4, 5, 6} downstream 330 of the first turbine and the induction effect of the downstream turbines on this data should therefore be minimal, hence they are omitted from the RANS simulations. The inflow wind speed is U ref = 8 m/s, the same as in the V80-Abkar case, thus the same operational state of the wind turbine is utilized, c.f. Table 2. The streamwise TI given by Keck et al. (2014) is converted to total TI with I ref ≈ 0.8I u,ref , similar to the method used in the SWiFT case. Figure 13 shows that the cstB model improves the velocity deficit prediction over the old 2017 model, when comparing 335 with the LES results, which were digitized from Fig. 11 in Keck et al. (2014) (note that a typo is present in that figure, i.e. label should be "[D]" instead of "[R]"). Streamwise TI was also available at the same downstream distances and in the RANS simulations it was obtained by converting from the total TI as described above. In both the streamwise TI and velocity deficit LES data, a misalignment of the wake center can be observed, which Keck et al. (2014) explains with that only 10 min of LES data was averaged. This is especially visible in the streamwise TI plots, but nevertheless it seems that the cstB model predicts 340 streamwise TI in the right range except for an overprediction in the wake center, which was also seen in the previous case.  The last case is based on the LES studies by Churchfield et al. (2012), more specifically their "U-L case" (see inflow parameters in the "NREL5MW row" of Table 2). They model two NREL5MW turbines with the Actuator Line (AL) methodology coupled to the aeroelastic FAST solver and the turbines are spaced 7D apart. For the RANS simulations of the present study, we omit 345 the second turbine and only compare with the first wake of the LES study. To avoid biases from the induction zone of the second turbine, we only consider wake results ≥ 2D upstream of the second turbine.
The steady-state power, thrust coefficient and rotational speed were not given in the paper, so therefore the steady-state curves from the DTU in-house aeroelastic solver, HAWCStab2, were used. These are similar to the curves shown by Jonkman et al.  Churchfield et al. (2012). Profiles extracted at z = z ref .

360
Originally developed to account for the general over-diffusive nature of k-ε models in wind turbine wakes under neutral conditions, here the f P -correction is combined with the new cstB model by making two non-neutral modifications. These introduce a new parameter, C B ; it is a free parameter analagous to C R in the original f P -formulation. Both modifications are consistent in the sense that the new, non-neutral f P -formulation becomes equal to the original neutral f P form for ζ → 0.
By using this updated f P -model with the cstB model, a faster wake recovery is obtained for unstable conditions over neutral 365 conditions, when TI is fixed, as was also the case when no f P -model was applied.

A2 The Joukowsky AD method
In summary, the surface force distributions (unit: N/m 2 ) on the AD are calculated in each iteration as: Here, f n,ij and f θ,ij are the normal and azimuthal surface force distributions at the (i, j)'th AD element (i: radial direction, j: azimuthal direction), which are applied in the CFD domain using the methodology described by Réthoré et al. (2014) and Troldborg et al. (2015). F 0 n ≡ i j f 0 n,ij A ij is the total normal force of the unscaled distribution, P 0 ≡ 400 U ref λ i χ i j f 0 θ,ij A ij is the total power of the unscaled distribution, λ ≡ ΩR/U ref is the tip-speed ratio, A is the area of the AD, A ij is the area of the (i, j)'th AD element, χ i ≡ r i /R is the local normalized radius, U ij is the normal velocity at the (i, j)'th AD element, N b = 3 is the number of blades, q 0 is the normalized circulation, g is the root correction and F is the tip correction. The latter two are obtained with Delery's root correction (with parameters a = 2.335, b = 4.0 and δ = 0.25) and Prandtl's tip correction, respectively, c.f. Eq. (A4) and (A5).

405
The normal and tangential loadings for the same case as used in Section A1 are compared between the uniform-AD, airfoil-AD and Joukowsky-AD, the latter with two different δ's, in Fig. A2. Clearly, the Joukowsky-AD with δ = 0.25 produces similar loadings as the airfoil-AD, e.g. qualitatively correct root behaviour, tip behaviour and constant tangential loading region, which makes it superior to the uniform-AD. is prescribed a-priori for the uniform-AD.
Data availability. The RANS results were generated with DTU's proprietary software, but the data presented can be made available by contacting the corresponding author. Interested parties are also welcome to hand-digitize the results and use them as reference in other Competing interests. The authors declare that they have no conflict of interest.