Sensitivity analysis of mesoscale simulations to physics parameterizations: a case study of storm Ciara over the Belgian North Sea using WRF-ARW

The Weather, Research and Forecasting (WRF) model includes a multitude of physics parameterizations to account for atmospheric dynamics and interactions such as turbulent fluxes within the planetary boundary layer (PBL), long and short wave radiation, hydrometeor representation in microphysics, cloud ensemble representation in cumulus, amongst others. A sensitivity analysis is conducted in order to identify the optimal WRF-physics set-up and impact of temporal resolution of 5 re-analysis dataset for the event of sudden changes in wind direction that can become challenging for reliable wind energy operations. In this context, Storm Ciara has been selected as a case study to investigate the influence of a broad combination of different interacting physics-schemes on quantities of interest that are relevant for energy yield assessment. Of particular relevance to fast transient weather events, two different temporal resolutions (1-hourly and 3-hourly) of the lateral boundary condition’s re-analysis dataset, ERA5, are considered. Physics parameterizations considered in this study include: two PBL 10 schemes (MYNN2.5 and scale-aware Shin Hong PBL), four cumulus schemes (Kain-Fritsch, Grell-Devenyi, and scale-aware Grell-Freitas and multi-scale Kain-Fritsch,) and three microphysics schemes (WSM5, Thompson and Morrison) coupled with two geospatial configurations for WRF simulation domains. The resulting WRF predictions are assessed by comparison to observational RADAR reflectivity data on precipitation. In addition, SCADA data on wind direction and wind speed from an offshore wind farm located in the Belgian North Sea is considered to assess modeling capabilities for local wind behavior at 15 farm level. For precipitation, results are shown to be very sensitive to model setup, but no clear trends can be observed. For wind-related variables on the other hand, results show a definite improvement in accuracy when both scale-aware cumulus and PBL parameterizations are used in combination with 1-hourly temporal resolution reanalysis data and extended domain sizes.

Shin-Hong scheme modifies the YSU PBL scheme  for sub-kilometer transition scales (1 km down to 200 m) by reducing the strength of the non-local term with decreasing grid size, assuming gradual resolution of the largest eddies.
While it has been found to outperform conventional PBL formulations for desert convective boundary layers (Xu et al., 2018) 95 and for the XPIA study in the western Great Plains of the United States (Doubrawa and Muñoz-Esparza, 2020), its interaction with cumulus and microphysics options is yet to be tested for extreme weather in coastal environments featuring strong interaction between PBL and microphysical processes.
In the context of wind energy applications, various sensitivity studies have been conducted with the aim of determining a universal "best" case WRF setup to assess local wind resources (Hahmann et al., 2015;Giannakopoulou and Nhili, 2014; 100 Carvalho et al., 2012). The literature presents equivocal results from a multitude of sensitivity analyses conducted at various locations around the planet, indicating a high dependency on the physics combinations and lateral boundary conditions used.
Concerning the estimation of wind energy production compared to measured wind data, Hahmann et al. (2015) study the long-term sensitivity of simulated WRF offshore climatology evaluated against wind LiDAR observations indicating a strong sensitivity to PBL parameterizations and the spin-up period, and insensitivity to global reanalysis and vertical resolution of the 105 model. Carvalho et al. (2014), for offshore and onshore areas in the Iberian Peninsula, indicated a close dependency of PBL and SL parameterizations with different physics combinations, favoring better reproduction of different prognostic variables. Cunden et al. (2018) performed a sensitivity analysis considering different combinations of non-scale-aware cumulus, PBL and microphysics schemes (despite kilometer-range resolutions) for the Island of Mauritius under clear and extreme weather (cyclonic and anti-cyclonic), and were able to identify a best case WRF setup. In contrast, a similar study by Islam et al. 110 (2015) for the Haiyan tropical cyclone over west Pacific Ocean concluded in no particular combination of WRF physics to best reproduce the extreme weather event. For the European continent, studies by García-Díez et al. (2013); Stergiou et al. (2017); Mooney et al. (2013) have conducted year-long and/or long-term sensitivity analyzes indicating a wide spread of possible physics parameterizations depending on the type of local weather event, season and time-lapse considered within the diurnal cycle. 115 As discussed above, the optimal selection of WRF physics parameterizations remains an important open challenge for accurate wind and weather modeling. The current study quantifies the sensitivity of WRF simulation results to physical parameterizations and numerical setup, and aims at identifying most suitable combinations for modeling the storm Ciara EWE passing over the Belgian offshore wind farms in the North Sea in February 2020. A simplified multi-variant sensitivity analysis considering 12 combinations of PBL, cumulus, microphysics, temporal resolution of lateral boundary conditions and geospa-120 tial nested domain configurations are investigated. The remainder of this paper is structured as follows. Firstly, the storm Ciara extreme weather event is introduced in Section 2. Next, the numerical methodology and setup are introduced in Section 3, where also the design of the sensitivity matrix is further described. Subsequently, the results are presented and discussed in Section 4. Lastly, conclusions and perspective are exposed in Section 5.
The case study selected in this study is one of the first extratropical cyclones to hit the European continent in the year 2020, storm Ciara. Originating on the Atlantic Ocean, the storm occurred on 10 February 2020 over the Belgian North Sea, transpiring from North America (starting 3 February 2020) to the European continent (16 February 2020). Storm Ciara swept across the majority of western Europe including United Kingdom and Norway, bringing in heavy precipitation and strong winds with a maximum recorded wind gust of 219 km h −1 at Cap Corse, Corsica, France. Over Belgium, the Royal Meteorological Institute 130 -Belgium (RMI-B) reported wind gusts of up to 115 km h −1 in Ostend, located at the Belgian offshore coast, with precipitation averaging 28 mm in few hours accompanied by strong winds and thunderstorms over the local region. The selection of this case is motivated by the occurrence of fast changes in wind direction as observed by offshore wind farms located in the Belgian offshore concession zone at several moments during the storm. During the early hours of storm Ciara on 10 February 2020, a commercial offshore wind farm observed fast changes in wind direction of 40 • over few min-135 utes accompanied by concentrated rainfall over a short period of time. This event was investigated with RADAR observations provided by RMI-B, for brevity the time-stamp at 04:40 is presented in Fig. 1, illustrating the presence of a bow-echo transpiring from the British isles to Belgium, indicative of a possible micro-burst phenomena (Fujita, 1978). Fast changes in wind direction are potentially influential of the state of power and grid balances and have been found to be potentially harmful for operational conditions and lifetime of wind turbines as demonstrated in the studies by Damiani et al. (2018); Bakhshi and 140 Sandborn (2016). With the addition to RADAR data and the availability of operational wind farm data (SCADA), the premise of this study provides a unique opportunity to investigate storm Ciara as felt by the Belgian offshore wind farms and formulate an informed decision on the optimum WRF set-up in the context of wind energy applications. with a factor of 3, resulting in resolutions of 9, 3, and 1 km for d02, d03, and d04 respectively. In the vertical direction, 57 terrain following levels are considered with a model top pressure at 1000 Pa. The vertical velocity damping option based on Courant-Friedrichs-Lewy condition as implemented in WRF is also turned on. The model is initialized on 9 February 2020 150 at 00:00, followed by a spin-up period of 24 hours. Subsequently, the model is run on 10 February 2020 from 00:00 to 21:00, which adequately captures the storm event on the Belgian North Sea. The long-wave and short-wave radiation physics schemes are kept constant as Rapid Radiative Transfer Model (RRTMG) (Iacono et al., 2008). Similarly, the land surface interactions are kept constant as unified Noah land surface model (Mukul Tewari et al., 2004).
In order to sufficiently categorize and distinguish the key features of different parameterizations and options available in 155 WRF, a combination of different simulation pairs in the multi-variant sensitivity Table 2 is considered. A total of 12 WRF simulations are categorized into different simulation pairs (A -K) assigned to either variations of the temporal resolution of  The simulated wind direction and wind speed from WRF runs are evaluated against front-row averaged SCADA data from the commercial offshore wind farm located in the Belgian North Sea. The precipitation rate using RADAR reflectivity is evaluated against RADAR data from RMI-B. Further, RADAR reflectivity data is qualitatively compared to WRF-simulated reflectivity for 04:40 on 10 February 2020. Model accuracy is assessed using a standard Mean Absolute Error (MAE), as well as the Kantorovich distance as described in Wang and Basu (2016). The Kantorovich distance d, initially formulated in Kantorovitch (1958), is defined as the solution to the optimal transport problem transforming a discrete source signal a i (i = where r i,k is a cost associated with the phase shift between indices i and k, and x i,k is related to the difference in amplitude 180 between a i and b k . In this regard, the Kantorovich distance d is a metric for the similarity between two signals, in this case WRF results and field data, detecting both amplitudes and temporal phase shifts, hence supplementing the standard MAE as a point metric only considering pairwise differences in amplitudes at the same time instance. To recover a single performance metric, MAE and Kantorovich are normalized to the so-called Nnormalized Euclidean Distance (NED), given by NED = MAE 2 N + d 2 N , defined as the resultant of normalized mean absolute error MAE N , and 185 normalized Kantorovich distance d N for all simulation runs. Normalization is performed with the mean over all simulations.
This study considers a univariate analysis for wind variables and precipitation using performance metrics, NED and Kantorovich distance, respectively. The simulated radar reflectivity is extracted for a single point in space at the location of the offshore wind farm and converted to rainfall rate using the Marshall and Palmer equation (Marshall and Palmer, 1948) and is evaluated against reflectivity data from the dual-polarization C-band radar from the RMI-B, located in Jabbeke at the Belgian North Sea coast.

Results and Discussion
The summary of evaluated metrics for all WRF simulations is presented in conditions, non-scale-aware physics parameterizations, to scale-aware physics parameterizations with higher resolutions of lateral boundary conditions. Cell colors are assigned to indicate the better metric-specific value in green per column. The average NED is calculated as the average of horizontal wind direction and wind speed NED.
Before discussing the influence of specific parameterizations, we discuss some general trends. Overall, an increasing conformity with observations is observed for simulations with scale-aware physics parameterizations coupled with a larger domain 200 configuration and higher resolution of lateral boundary conditions. The best-case setup in this study is determined to be simulation case 12 with the lowest average NED. Case 12 encompasses scale-aware Shin-Hong PBL scheme coupled with a double moment 6-class Morrison microphysics scheme, a scale-aware GF cumulus scheme and ERA5 1h reanalysis dataset as the lateral boundary conditions. The wind direction and wind speed comparisons for the complete list of WRF simulations along with their ensemble average is presented in Fig. 3. Further, Fig. 4 illustrates individual and ensemble precipitation results as com-205 pared to RADAR data. Qualitatively, the simulations capture the changes in wind direction reasonably well when compared to SCADA data. On the other hand, matching wind speeds and precipitation seems significantly more challenging, as shown by the large spread among different modeling setups in the afternoon and evening hours. Interestingly, despite the average NED metric not indicating the ensemble average as a clear winner amongst the simulation cases (due to relatively high errors on wind directions), the precipitation accuracy is greatly improved over the best case 12. spatial resolution) as presented in Fig. 2. Note that J furthermore applies a standard MYNN PBL scheme whereas K uses  the scale-aware Shin-Hong PBL scheme. Quantitative performance metrics for these simulation pairs are shown in Table 4, whereas a qualitative view is presented in the form of a wind direction time series and a snapshot of RADAR reflectivity in Figs. 5 and 6 respectively.
Focusing firstly on the wind direction, it can be seen in Fig. 5 that the simulations with the larger spatial extent on Domain 225 2 (yellow lines) produce more short-timescale fluctuations, similar to fluctuations observed in the SCADA data, than those simulations performed on the smaller Domain 1 configuration (orange lines). However, assessing the performance metrics in

Simulation pair: Planetary boundary layer
In this section, the influence of using classical non-scale-aware PBL schemes versus using scale-aware PBL schemes is elaborated. More specifically, the standard MYNN scheme is compared to the scale-aware Shin-Hong scheme by simulation pairs H (run on Domain 1, see previous section) and I (run on Domain 2). Note that these are the same simulations considered in the previous section, but compared in a different manner here. Table 5 contains quantitative performance metrics for these 245 simulation pairs. Furthermore, for simulation pair I, the time series of wind direction and wind speeds are shown in Fig. 7 and a snapshot of RADAR reflectivity is presented in Fig. 8.
Considering wind direction and wind speed first, the simulations with the Shin-Hong PBL scheme observe better concurrence to SCADA data considering MAE. In some cases, the Kantorovich distance however contradicts MAE in favor of the MYNN PBL scheme, resulting in overall similar NED scores for wind direction for Shin-Hong and MYNN. For wind speeds on 250 the other hand, the resulting NED significantly favors scale-aware Shin-Hong. Finally, considering the averaged NED as the defining metric, better overall results were observed for simulations with the scale-aware Shin-Hong PBL scheme.
Even though we observed larger vertical velocity profiles and higher water vapour mixing ratios for Shin-Hong than for MYNN during times of high precipitation (see Fig. 4), the precipitation Kantorovich distance in Table 5 presents inconclusive results regarding which PBL scheme results in a closer match to RADAR data. The qualitative results as presented in Fig.   255 8 for simulation pair I indicate a better representation of the precipitation front for simulations with the Shin-Hong PBL scheme. However, for simulation pair H (not further shown here), Shin-Hong scheme significantly underestimates the RADAR precipitation.
Overall, current results show the scale-aware Shin-Hong PBL scheme performs generally better for wind variables in the current case, yet results for precipitation could indicate a dependency with cumulus and microphysics parameterizations, which 260 has also been reported in literature (Hong and Dudhia, 2012;Choi and Han, 2020;Chen et al., 2021).

Simulation pair: Temporal resolution of lateral boundary conditions
The effect of varying temporal resolution or update frequency of the ERA5 lateral boundary conditions is investigated in this section. Simulation pairs A and B represent four WRF simulations in which the temporal ERA5 resolution is varied between hourly and three-hourly updates. Note further that A and B mutually differ in all setup parameters other than the 265 cumulus scheme, which justifies them to serve as independent pairs to judge the influence of the temporal boundary condition resolution.  A qualitative look at the RADAR snapshot in Figure 10 for simulation pair B reveals that whereas the 1h resolution run reproduces the main frontal structure of the precipitation, the 3h resolution simulation does not show any reflectivity at all. For simulation pair A, which is not further plotted here, no apparent qualitative differences in RADAR reflectivity were observed.

280
This section presents the consolidated results for cumulus and microphysical scheme variation. Since both of these highlight a specific aspect of modeling precipitation in WRF, they are considered jointly in this section. A set of 7 simulation configurations is considered, covering a combination of 4 cumulus schemes (KF, GD-3D, msKF and GF; the latter two being scale-aware) and 3 microphysical schemes (WSM5, Thompson and Morrison; the latter two representing higher complexity in microphysical parameterization).

285
Firstly, the wind results for cumulus parameterization pairs are discussed. Pairs C, D and E each allow to assess the influence of the cumulus scheme, while considering one specific microphysics option, i.e. WSM5, Thompson, and Morrison for C, D, and E, respectively. Quantitative results are presented in Table 7. For wind direction and wind speed, scale-aware cumulus parameterizations (msKF and GF) produce better overall NED in comparison to non-scale-aware schemes (KF and GD-3D).
The overall outcome, as drawn from average NED, indicate best results for the scale-aware GF cumulus parameterization.

290
Turning to the wind result sensitivity on microphysical parameterizations in Table 8, simulation pairs F (with KF cumulus) and G (with msKF cumulus) assess the impact of WSM5, Thompson and Morrison schemes. As shown in the table, these simulation pairs do not allow formulating a conclusive trend regarding the impact of microphysics on wind modeling in terms of average NED. However, considering both Table 7 and 8, an observation is that the impact of cumulus schemes on average NED is greater than that of microphysics, and that the best overall performance is obtained using scale-aware cumulus schemes 295 (msKF and GF) combined with high-complexity microphysics (Morrison and Thompson).
For precipitation, results are found to be highly sensitive to the combination of cumulus and microphysics setup. Varying magnitudes and time lags in the precipitation time series were produced by different combinations of cumulus and microphysics schemes, with higher magnitude observed for Morrison and Thompson microphysics coupled with scale-aware cumulus schemes. Qualitatively, the contour plots for WRF versus RADAR observations presented in Fig. 11 also result in sig-300 nificantly different reproductions of precipitation fronts. Once more, both quantitative and qualitative results on precipitation accuracy are inconclusive.