Low-level jets over the North Sea based on ERA5 and observations: together they do better

Ten years of ERA5 reanalysis data are combined with met-mast and lidar observations from 10 offshore platforms to investigate low-level jet characteristics over the Dutch North Sea. The objective of this study is to combine the best of two worlds: (1) ERA5 data with a large spatiotemporal extent but inherent accuracy limitations due to a relatively coarse grid and an incomplete representation of physical processes and (2) observations that provide more reliable estimates of the measured quantity but are limited in both space and time. We demonstrate the effect of time and range limitations on the reconstructed wind climate, with special attention paid to the impact on low-level jets. For both measurement and model data, the representation of wind speed is biased. The limited temporal extent of observations leads to a wind speed bias on the order of ±1 m s−1 as compared to the long-term mean. In part due to data-assimilation strategies that cause abrupt discontinuities in the diurnal cycle, ERA5 also exhibits a wind speed bias of approximately 0.5 m s−1. The representation of low-level jets in ERA5 is poor in terms of a one-to-one correspondence, and the jets appear vertically displaced (“smeared out”). However, climatological characteristics such as the shape of the seasonal cycle and the affinity with certain circulation patterns are represented quite well, albeit with different magnitudes. We therefore experiment with various methods to adjust the modelled low-level jet rate to the observations or, vice versa, to correct for the erratic nature of the short observation periods using long-term ERA5 information. While quantitative uncertainty is still quite large, the presented results provide valuable insight into North Sea low-level jet characteristics. These jets occur predominantly for circulation types with an easterly component, with a clear peak in spring, and are concentrated along the coasts at heights between 50 and 200 m. 
Further, it is demonstrated that these characteristics can be used as predictors to infer the observed low-level jet rate from ERA5 data with reasonable accuracy.


Introduction
On average, wind speed increases with height above the surface and the rate of increase can be described using simple formulas (e.g. power-law or logarithmic profile; see Sedefian, 1980). Due to their simplicity and ease of use, these wind profile parameterisations have been widely adopted in the wind energy community. However, in some situations these formulas cannot adequately capture the observed wind profile. In these situations, the application of a simplified wind profile parameterisation can introduce error or uncertainty into the reconstructed wind climatology. This is clearly the case for low-level jets (LLJs), for which wind speed reaches a maximum not far (i.e. roughly less than 500 m) from the surface (Fig. 1a). Wind shear and turbulence intensity associated with low-level jets also differ substantially from those assumed under "standard" conditions. Low-level jets modify wind power performance and loading by impacting wake recovery rates and vertical profiles of wind speed, direction, and turbulence (Wharton and Lundquist, 2012; Bhaganagar and Debnath, 2014; Park et al., 2014; Gutierrez et al., 2017). Thus, for a complete assessment of loads and power, it is important to have a broad understanding of the site-specific low-level jet characteristics: how often do they occur, under which circumstances, at what height and with what strength, and what mechanisms are responsible for their formation? A large body of literature exists on low-level jets, the majority focusing on the onshore phenomenon. We refer to Rife et al. (2010) for a global climatology and to Shapiro et al. (2016) for a synthesis of the underlying mechanisms. In coastal areas, the occurrence of low-level jets has been attributed to the thermal contrast and differences in surface roughness between land and sea (e.g. Mahrt et al., 2014; Nunalee and Basu, 2014). Dörenkämper et al. (2015) linked the occurrence of coastal jets to their onshore counterpart.
In certain areas, other mechanisms like orographic forcing may play an important role (e.g. Moore and Renfrew, 2005). Concerning the spatial and temporal variability of the coastal jets, we refer to Ranjha et al. (2013) and Lima et al. (2018), who presented global maps based on reanalysis data. Their analyses highlight a number of large-scale global "hotspots" that, in effect, overshadow more regional phenomena. Consequently, a systematic long-term characterisation of coastal jets is lacking for the North Sea.
In a previous publication (Kalverla et al., 2017), we reported on low-level jet characteristics at a prospective wind power site 85 km off the Dutch coast (MMIJ, a.k.a. "IJmuiden ver"), using 4 years of mast and lidar observations. The climatology consisted of the diurnal and seasonal variability in low-level jet occurrence, jet speed, jet height, jet direction, etc. Inherently, this low-level jet climatology is only valid for the single observation site examined. In order to generalise the results from this study and to improve our overall understanding of low-level jets across the North Sea, we now present a spatial climatology of low-level jets based on ERA5 reanalysis data (Sect. 2; Copernicus Climate Change Service, 2017) and an extended set of observations.
Preliminary results based on 10 years of data in the lower 500 m of the atmosphere (Fig. 1b) show that ERA5 provides interesting information about the spatial distribution of low-level jets. However, without observational support, this information is of little value. Therefore, we incorporate additional lidar observations to provide this support, but the knowledge of the Dutch offshore wind climate that can be gained from these measurements is limited by the relatively short duration of measurement collection (i.e. typically ∼ 1 year) and the limited vertical measurement range (i.e. typically less than 300 m; see Fig. 2 and Appendix A for details on measurement time and range). Consequently, the aim of this study is twofold: (1) observations will be used to validate the ERA5 climatology of wind and low-level jets, and (2) ERA5 will be leveraged to infer long-term low-level jet characteristics based on a limited set of observations. Absolute agreement in low-level jet characteristics between the two data sources would enable perfect execution of these objectives; however, that is unlikely. Therefore, we formulated the following research question to serve both perspectives: How can observations and reanalysis data be combined to obtain a spatial climatology of low-level jets that is both rich (in its spatial and temporal extent) and reliable (in terms of its correspondence with available in situ observations)?
The paper is structured as follows. A brief description of the data and an elementary evaluation of wind speed itself is provided to illustrate how both datasets are biased. Thereafter, low-level jet representation within both datasets is discussed, starting with jet detection and morphology (e.g. jet height). A common thread throughout the paper is how these characteristics are impacted by time and (vertical measurement) range limitations. Using the seasonal cycle of low-level jets as an illustrative example, we experiment with various methods to post-process the ERA5 data and extend the observations based on identified correspondence and/or differences. This exercise is repeated for the diurnal cycle, atmospheric stability and various circulation patterns. Finally, all of these characteristics are combined to demonstrate that the "true" low-level jet rate can be reconstructed with reasonable accuracy if sufficient observations are available. The paper ends with a comprehensive discussion of the implications and future research directions.
The focus of this paper is to obtain a reliable spatial representation of the low-level jets. This provides clues as to the physical mechanisms that govern them, but a detailed treatment of these processes is outside the scope of the current work.
To facilitate transparency and reproducibility, a series of Jupyter notebooks is available as a Supplement to this paper. Consequently, some technical details are left out of the main text, which is intended as a readable and coherent treatment of the most important results.

A brief description of both datasets and their shortcomings
Observations are available from seven sites (Fig. 1b). Three of these sites had two lidars operating simultaneously and one site (MMIJ) also featured a 90 m met mast. The temporal span of measurements ranges from 6 months to over 4 years (Fig. 2). Some of the lidars were placed in the vicinity of existing wind farms and are appropriately filtered to remove any potential wind farm wake effects. More information on quality control and post-processing of the lidar data can be found in Appendix A. The observations are available as 10 min averages, but to facilitate comparison with ERA5, the data were converted to hourly averages.
ERA5 (Copernicus Climate Change Service, 2017) is the latest reanalysis dataset from the European Centre for Medium-Range Weather Forecasts (ECMWF). Re(trospective) analysis is the procedure of fitting a state-of-the-art weather model to historical measurements (satellites, weather stations, etc.) to obtain a long-term dataset that is both spatially and physically consistent and depicts the state of the atmosphere as it evolved through time. ERA5 is the successor of ERA-Interim and, like its predecessor, is expected to be widely used for wind resource assessment studies (Olauson, 2018). Compared to its predecessor, ERA5 has a finer horizontal grid of about 30 km and also enhanced vertical resolution (for this study, data were retrieved on a 0.3° by 0.3° latitude-longitude grid). ERA5 is based on a newer model version and, moreover, provides output at hourly intervals, enabling a comprehensive analysis of sporadic features such as low-level jets. ERA5 data from the North Sea domain between 2008 and (the end of) 2017 in the lowest 500 m demonstrate the ability of the model to resolve low-level jets (Fig. 1b).
Before analysing the morphology of these jets, we illustrate the limitations of both datasets concerning the representation of wind speed. Figure 3a shows averaged wind profiles for the grid points closest to each of the measurement locations (we verified that this approach is comparable to spatial interpolation between multiple neighbouring grid points). The full lines represent all 10 years of ERA5 data, whereas the dashed lines indicate averaged wind profiles derived from data subsets, which only incorporate ERA5 data when observations are available. The full lines are all quite close together, while the data subsets exhibit a much larger spread. Variability between the full lines can be related to physical differences between sites (e.g. distance to coast). Dissimilarity between the ERA5 10-year datasets and the ERA5 data subsets indicates that, due to the limited time extent of the observations, the data subsets are not representative of the site climatology. For some sites, this representativity bias almost reaches 2 m s−1, and even for MMIJ, where measurements covered the longest period, it still amounts to ∼ 0.5 m s−1. The primary reason for this bias at MMIJ is that the data contain more winter than summer months, and the wind is generally stronger in winter. Because the MMIJ data span more than 4 years, some of them can be discarded in order to ensure an equal representation of the seasons within the data. However, at the other stations, the temporal period of observation is limited, and using a similar seasonality filter would result in almost half of the data being removed, which is not desirable. Worse still, Hollandse Kust Noord (HKN) observations do not encompass a complete year, and even if they did, inter-annual variability can be substantial. Available observations therefore cannot be used to derive the long-term wind climatology directly.
However, by correlating a short-term dataset with long-term observations at a nearby site, the long-term wind characteristics at the target site can be inferred with reasonable accuracy. This procedure is known as measure-correlate-predict (MCP; Carta et al., 2013). MCP itself is not discussed further here, but the application of similar techniques to low-level jets is examined later in this paper.
ERA5 also demonstrates bias in its representation of site winds. An error diagram of the wind speed in ERA5 (subsets) versus observations is provided in Fig. 3c. In this diagram (adapted from Kalverla et al., 2019), the mean error (BIAS) is plotted on the x axis, the standard deviation of the error distribution (STDE) is plotted on the y axis and, by virtue of the relation BIAS² + STDE² = RMSE², the distance to the origin represents the root mean square error (RMSE). Wind speed data from all observation levels were aggregated in this figure to evaluate the overall performance of ERA5 at each measurement site. For example, the Hollandse Kust Zuid (HKZ) lidars show a strong bias (i.e. systematic error) but have a relatively small standard deviation (i.e. random error). ERA5 site-specific RMSE values, ranging from 1.25 to 1.5 m s−1, can be caused by multiple model aspects such as the limited grid resolution and the incomplete representation of physical processes. Uncertainties in the observations can also contribute to overall error statistics. Based on the manufacturer information and previous validation (Poveda and Wouters, 2015), the uncertainty in the observations can only account for about 2 % of the errors. Finally, displacement in space or time as well as discrepancies between point-based measurements and modelled control volumes can contribute to errors, although we did our best to minimise these effects, e.g. by using appropriate time averaging of the observations (see the Supplement).
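The relation between the three error measures can be checked numerically. The sketch below is illustrative only: the wind speeds are made-up values, not data from this study, and the function name `error_stats` is hypothetical. It assumes that STDE is the population standard deviation of the errors (normalisation by N); with that convention the relation BIAS² + STDE² = RMSE² holds exactly.

```python
import math

def error_stats(model, obs):
    """Return (bias, stde, rmse) of the errors model - obs.

    Assumption: STDE uses normalisation by N (population standard deviation),
    so that bias**2 + stde**2 == rmse**2 up to rounding.
    """
    errors = [m - o for m, o in zip(model, obs)]
    n = len(errors)
    bias = sum(errors) / n
    stde = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    return bias, stde, rmse

# Made-up hourly wind speeds in m/s (illustration only)
era5 = [7.2, 9.1, 5.4, 11.0, 8.3]
lidar = [8.0, 9.5, 6.1, 10.2, 9.4]

bias, stde, rmse = error_stats(era5, lidar)
# Distance to the origin in the (BIAS, STDE) plane equals the RMSE
distance = math.hypot(bias, stde)
```

In an error diagram of this kind, a point far along the x axis but close to it in y indicates a mostly systematic error, whereas a point high above the origin indicates a mostly random error.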
The observed biases exhibit a strong diurnal variation. During the night (Fig. 3b), the bias is roughly between 0 and −0.5 m s−1, depending on the location. However, at 10:00 UTC, there is a sharp decrease in the bias of ∼ −0.5 m s−1 for most stations. The reason for this discontinuity can be found in the IFS (Integrated Forecasting System) documentation (ECMWF, 2016). ERA5 is produced with a 4D-VAR data-assimilation algorithm that uses two 12-hourly windows running between 09:00-21:00 and 21:00-09:00 UTC. This means that all hourly fields up to the 09:00 UTC analysis are based on the nighttime observations, while data from 10:00 UTC onwards are based on the daytime observations. We hypothesise that the impact of the data assimilation is magnified during the nighttime because nighttime boundary layers are generally shallower; the difficulty of appropriately assimilating observational data within the (stable) boundary layer is discussed in Reen and Stauffer (2010) and Tran et al. (2018). The discontinuity in the diurnal cycle is present at each model level up to 300 m, irrespective of the season and platform; however, it seems to be slightly stronger for those stations closer to the coast.

Jet detection: a precarious procedure
Low-level jets are identified by seeking local maxima in the wind profiles. Having identified a local maximum, the jet strength, height, and fall-off are analysed. Fall-off, as indicated in Fig. 1a, is defined as the difference between the maximum and the subsequent (moving upwards) local minimum or, if no local minimum is present, the top of the wind profile. Most results in this study are based on an absolute fall-off threshold of 2 m s−1. Figure 4 demonstrates how this threshold influences the low-level jet detection rate and further how the detection of low-level jets is influenced by both time and (vertical measurement) range limitations. The figure consists of five scatter plots, each depicting the fall-off versus the jet height for each wind profile that was detected with a local maximum. The differences between the panels are the underlying data analysed, i.e. observations and varying subsets of ERA5 data.
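The detection procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used for this study (which is documented in the Supplement): the function name `find_jets`, the strict-inequality test for local maxima, and the example profile are all hypothetical.

```python
FALLOFF_THRESHOLD = 2.0  # m/s, the absolute fall-off threshold quoted in the text

def find_jets(heights, speeds):
    """Return one record per local wind speed maximum in a single profile.

    Fall-off is the difference between the maximum and the next local
    minimum above it or, if there is none, the top of the profile.
    Assumption: a local maximum is an interior level that is strictly
    faster than both of its neighbours.
    """
    jets = []
    n = len(speeds)
    for i in range(1, n - 1):
        if speeds[i] > speeds[i - 1] and speeds[i] > speeds[i + 1]:
            j = i + 1
            # Move upwards while the wind speed keeps decreasing
            while j + 1 < n and speeds[j + 1] < speeds[j]:
                j += 1
            jets.append({
                "height": heights[i],
                "strength": speeds[i],
                "falloff": speeds[i] - speeds[j],
            })
    return jets

# Hypothetical profile with a wind maximum at 100 m (heights in m, speeds in m/s)
heights = [10, 50, 100, 150, 200, 300, 400, 500]
speeds = [5.0, 7.5, 9.0, 8.5, 8.0, 7.5, 6.5, 7.0]

all_maxima = find_jets(heights, speeds)
jets = [j for j in all_maxima if j["falloff"] >= FALLOFF_THRESHOLD]

# Same profile, but only the levels up to 300 m are available
maxima_300 = find_jets(heights[:6], speeds[:6])
jets_300 = [j for j in maxima_300 if j["falloff"] >= FALLOFF_THRESHOLD]
```

In this example the full profile yields a fall-off of 2.5 m/s, which passes the threshold, while the profile truncated at 300 m yields only 1.5 m/s for the same maximum at 100 m, so the jet is no longer counted.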
The first panel (Fig. 4a) is based on 10 years of ERA5 data and the model levels contained within the lower 500 m of the atmosphere. The two dashed lines represent limiting factors: (1) the fall-off threshold of 2 m s−1 (horizontal dashed line) and (2) limitations due to observation height (vertical dashed line). The model data extend up to 500 m, but the observations only reach up to about 300 m (depending on the platform). All platforms are overlaid (shorter datasets on top). Only points above the horizontal dashed line are included in the low-level jet climatology that is presented in the next sections. The numbers in the top left corner of each panel give the number of jets above the fall-off threshold and the total number of jets plotted. Figure 4b-d are based on subsets of the ERA5 dataset. In panel (b), ERA5 data are incorporated only if observations are available; as expected, this substantially limits the total number of low-level jets (85 % reduction). In panel (c), we have retained all 10 years of data, but only at observation heights (i.e. data above 300 m were discarded and the remaining data were vertically interpolated, using a cubic spline, between the remaining model levels to obtain the ERA5 wind speeds at the exact observation height). The effect of this step is that 93 % of the meaningful jet events (i.e. those exceeding the fall-off threshold) vanish, and not just those above 300 m. In order to classify a wind profile as a jet, the fall-off above the maximum must be properly resolved. This explains why a jet at 100 m can also vanish from the climatology if data from above 300 m are removed. The pronounced impact of this vertical range limitation on the ERA5 data raises the question of whether the observed low-level jet climatology would be much different if we could observe higher-altitude winds.
An increased measurement range might reveal not only low-level jets above hub height, but also new low-level jets at hub height that are currently not identified as such.
Height and time limitations are combined in panel (d) in order to develop an ERA5 dataset that is fair to compare with observations (panel e). Judging from the figure, it seems that ERA5 does not perform well. Far fewer jets are found above the fall-off threshold in the ERA5 data as compared to the observations. Indeed, a more quantitative comparison in the form of a contingency table, based on one-to-one (1 : 1) jet correspondence between the two datasets, shows a very low critical success index (∼ 0.2) and probability of detection (∼ 0.2; see the Supplement). In other words, only 20 % of low-level jets are correctly represented by ERA5. Does that imply that ERA5 is useless? No! Figure 4a indicates that potentially relevant information was filtered out. Even though the fall-off is typically much smaller (to the extent that it falls below the fall-off threshold), the height distribution of the ERA5 jets seems similar to the observations (also see Sect. 4). Perhaps the ERA5 jets appear vertically displaced or just not strong enough? This would not come as a surprise: weather models have long been known to generate excessive vertical mixing under stable conditions, effectively "smearing out" low-level jets (Holtslag et al., 2013). If the height thresholds for the ERA5 data are modified to 500 m, the 1 : 1 correspondence is still quite poor (critical success index ∼ 0.2; probability of detection ∼ 0.5), but despite an inability to accurately denote the total number of low-level jets, other characteristics appear to be captured quite well, e.g. the average monthly low-level jet rate. Therefore, the remainder of this paper is devoted to the analysis of such low-level jet characteristics and methods to consolidate ERA5 and measurement data.
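For readers unfamiliar with the two scores, the sketch below shows how they follow from a contingency table of hourly jet flags. It uses the standard definitions (probability of detection = hits / (hits + misses); critical success index = hits / (hits + misses + false alarms)). The function name `contingency_scores` and the flags are hypothetical and chosen only for illustration; they do not reproduce the numbers quoted above.

```python
def contingency_scores(observed, modelled):
    """Return (pod, csi) for two equally long sequences of boolean jet flags.

    pod: probability of detection = hits / (hits + misses)
    csi: critical success index  = hits / (hits + misses + false alarms)
    Assumes that at least one jet was observed; otherwise the scores are undefined.
    """
    pairs = list(zip(observed, modelled))
    hits = sum(1 for o, m in pairs if o and m)
    misses = sum(1 for o, m in pairs if o and not m)
    false_alarms = sum(1 for o, m in pairs if m and not o)
    pod = hits / (hits + misses)
    csi = hits / (hits + misses + false_alarms)
    return pod, csi

# Ten hypothetical hours: True means a jet above the fall-off threshold was found
observed = [True, True, True, True, True, False, False, False, False, False]
modelled = [True, True, False, False, False, True, False, False, False, False]

pod, csi = contingency_scores(observed, modelled)
```

Because correct negatives (hours without a jet in either dataset) do not enter either score, both remain meaningful for rare events such as low-level jets.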

Vertical range affects perceived jet morphology
Jet height and jet strength are of paramount importance for wind energy applications. Small variations in height can result in either symmetric or asymmetric loads on the turbine, and typical strengths in the rated part of the power curve are probably less critical than typical strengths in the cubic part. It turns out, though, that the concepts of "typical" height and strength are not self-evident. Figure 5 shows the jet height and jet strength distributions of the ERA5 data and observations. It shows that the jet height and strength distributions are sensitive to the range limitation. The median observed jet strength is about 8 m s−1. This is quite well reflected in the ERA5 data if we consider all levels up to 500 m, but after imposing the range limitation, the jet strength is underestimated by about 3 m s−1. The observed median jet height is around 80 m. The ERA5 jet height distribution is broader with greater jet heights for the data up to 500 m, while it is narrower with lower jet heights for the range-limited data. To obtain a robust result, this figure is based on the aggregated data from all platforms. Separate figures for each individual platform show similar characteristics, although the jets near the coast seem to be somewhat closer to the surface than jets further offshore (not shown).
Figure 4. Scatter plots of fall-off versus jet height for various representations of model data and observations. In (a) and (b), the jet height is represented by discrete model levels. Since these are specified in terms of pressure rather than height, they can exhibit small height variations in time. In (c), (d), and (e), jet height is represented by fixed measurement heights, and to improve the readability of the graph we added small random perturbations to these heights. See text for further explanation of the figure.
Three different representations of the observations are included in Fig. 5. The first one is based on the 10 min data. The second is based solely on the data of each full hour; in other words, we discarded five-sixths of the data. With this strategy, (small) discrepancies in low-level jet timing can have a disproportionate impact on the results. A more permissive evaluation (the third representation) is based on hourly averages obtained with a sliding window, where each full hour is an average including the 10 min data from the preceding two and the following three time stamps. This last version of the observations is used throughout the remainder of the paper. This figure demonstrates that the differences between various resampling methods in terms of jet height and jet strength are small.
Figure 6 displays the seasonal cycle of low-level jets and, in a similar fashion as Fig. 4, how this cycle is subject to time and range limitations. Over 10 years' time and 500 m height (panel a), the seasonal cycle is smooth and differences between the individual platforms are small. Ideally, we would compare this to 10 years of observations up to 500 m, but since those data are not available we take spatial and temporal subsets of the ERA5 data instead. By investigating how this affects the seasonal cycle, we identify methods to extend upon the limited observations. Over the shorter measurement periods (panel b) the seasonal cycle appears much more erratic than the 10-year climatology. Some years are not very representative, and some datasets do not even cover a complete cycle. As we will see later on, a favourable weather pattern for low-level jets is a weak large-scale forcing typically associated with high-pressure systems. Such "blocked" weather patterns can last for several weeks, and their occurrence can thus cause large differences in monthly low-level jet rates.
In other words, the seasonal cycle based on only 1 or a few years is very sensitive to inter-annual variability. Upon vertical subsetting or interpolation to measurement heights (panel c), the seasonal cycle is still visible, albeit with a much smaller amplitude. The combined effect (panel d) leads to a very uninformative climatology because the monthly low-level jet rates are all (close to) zero except for some unrepresentative spikes. Based on panel (b), we expect that the observations are similarly affected by the limited time window of the observations. Indeed, panel (e) shows an erratic seasonal cycle with an amplitude somewhere between panels (b) and (d).
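The sliding-window hourly averaging used for the third representation of the observations can be sketched as follows. Two assumptions are made here that the text does not spell out: the value at the full hour itself is part of the window (giving six 10 min values per hour), and windows that extend beyond the available record are skipped. The function name `hourly_sliding_means` is hypothetical, and the handling of missing data is omitted.

```python
def hourly_sliding_means(values):
    """Average 10 min values in a window around each full hour.

    values: consecutive 10 min averages, with values[0] at a full hour.
    Returns {index of full hour: window mean}. The window covers the two
    preceding time stamps, the full hour itself (assumption) and the three
    following time stamps. Incomplete windows are skipped (assumption).
    """
    means = {}
    for h in range(0, len(values), 6):
        first, last = h - 2, h + 3
        if first < 0 or last >= len(values):
            continue
        window = values[first:last + 1]
        means[h] = sum(window) / len(window)
    return means

# Three hours of made-up 10 min wind speeds: 0.0, 1.0, ..., 17.0 (m/s)
ten_minute = [float(k) for k in range(18)]
hourly = hourly_sliding_means(ten_minute)
```

With this input, only the full hours at indices 6 and 12 have a complete window, with means 6.5 and 12.5, respectively.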

Datasets agree: most jets in spring and summer
Thus, both datasets agree on the presence of an annual cycle, but the amplitude differs between (various representations of) ERA5 and the observations. Moreover, the observation periods are too short to obtain a reliable climatology. To distill a more robust signal from the observations, we combined the data from all sites before computing the monthly means and smoothed the resulting signal with a moving average of 3 months. The result is the dashed black line in panel (e). We then repeated these steps for the ERA5 data (panels a-d), but before plotting these lines, we scaled them with the observations, using a fixed scaling factor that is simply the ratio between the mean low-level jet rate in the respective representation of ERA5 (panels a-d) and the mean of the observations (panel e). The result is promising: the seasonal cycle is similar for all datasets, peaking at about 5 % in June. The crude manipulation of the data leads to a large error margin, though, and we wonder whether we can find a more sophisticated approach to achieve a similar result. Furthermore, because valuable information is lost if we discard the ERA5 data above observation heights, we will continue to work with the ERA5 data up to 500 m in the remainder of this paper.
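The aggregation, smoothing, and scaling steps described above can be sketched as follows. This is a minimal illustration with made-up monthly low-level jet rates, not the processing used for Fig. 6. It assumes a centred 3-month moving average that wraps around from December to January (the text does not specify the treatment of the year boundary), and it divides the ERA5 signal by the ratio of the ERA5 mean to the observed mean. The names `smooth3` and `scale_to_observations` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def smooth3(monthly):
    """Centred 3-month moving average, wrapping Dec-Jan (assumption)."""
    n = len(monthly)
    return [
        (monthly[(i - 1) % n] + monthly[i] + monthly[(i + 1) % n]) / 3
        for i in range(n)
    ]

def scale_to_observations(era5, obs):
    """Divide the ERA5 signal by the fixed ratio mean(era5) / mean(obs)."""
    ratio = mean(era5) / mean(obs)
    return [v / ratio for v in era5], ratio

# Made-up monthly low-level jet rates in % (January to December)
era5_monthly = [2.0, 3.0, 6.0, 9.0, 11.0, 12.0, 10.0, 8.0, 5.0, 3.0, 2.0, 1.0]
obs_monthly = [1.0, 1.0, 3.0, 4.0, 5.0, 5.0, 4.0, 3.0, 2.0, 1.0, 1.0, 0.0]

era5_smooth = smooth3(era5_monthly)
obs_smooth = smooth3(obs_monthly)
era5_scaled, ratio = scale_to_observations(era5_smooth, obs_smooth)
```

By construction the scaled ERA5 signal has the same annual mean as the smoothed observations, so any remaining difference between the two curves reflects the shape of the seasonal cycle rather than its overall magnitude.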

Simple scalings for the seasonal cycle
In the previous section we learned that 10 years of ERA5 data leads to a smooth seasonal cycle, but shorter observation periods lead to an erratic seasonal cycle because the months in the subset are not representative of the long-term monthly means. We also saw that upon aggregation and smoothing, both ERA5 and observations show similar seasonal cycles that differ mostly in their amplitudes. In this section we seek to combine the information from both data sources to reconstruct the "true" seasonal cycle of low-level jets over the North Sea. We considered two different approaches.
The first method applies a correction to the observations, based on information about their representativity. For each month and each platform, we calculated the ratio between the low-level jet occurrence in the full ERA5 data and in the ERA5 subsets. Months for which this factor is much smaller (or larger) than 1 are characterised by above- (below-) average low-level jet occurrence. We then applied these ratios as correction factors to the observed monthly means to adjust the outliers and obtain a more representative seasonal cycle. However, this method did not lead to satisfactory results because the correction factors were not robust: if only 1 year of data was available, and a month was very unrepresentative, the correction factor would become very high or very low and the adjustment would overcompensate. Consequently, the reconstructed long-term seasonal cycles still appeared erratic and were deemed unreliable (this result is therefore not shown here, but is available in Supplement 4/6). For MMIJ the measurement period spanned more than 4 years and consequently, the monthly low-level jet occurrence already started converging on the climatological seasonal cycle. For this platform, the correction factors were closer to 1 and we obtained a reasonably smooth seasonal cycle. This emphasises that for this correction method, at least several years of measurement data are required to obtain a reliable estimate of the long-term low-level jet climatology.
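The first method, and the reason it lacks robustness, can be illustrated with a small sketch. All numbers are made up and the function names are hypothetical; months without any ERA5 jets in the subset are left uncorrected here, which is an assumption rather than a choice documented in the text.

```python
def correction_factors(era5_full, era5_subset):
    """Per-month ratio of the long-term ERA5 jet rate to the subset rate.

    Months with a zero rate in the subset get None (assumption), because
    the ratio is undefined there.
    """
    return [
        full / sub if sub > 0 else None
        for full, sub in zip(era5_full, era5_subset)
    ]

def correct_observations(obs, factors):
    """Apply the factors to the observed monthly jet rates."""
    return [o * f if f is not None else None for o, f in zip(obs, factors)]

# Made-up monthly low-level jet rates in % for three months
era5_full = [2.0, 6.0, 10.0]    # 10-year ERA5 climatology
era5_subset = [2.5, 12.0, 1.0]  # ERA5 during the observation period only
observed = [1.0, 5.0, 0.5]      # collocated observations

factors = correction_factors(era5_full, era5_subset)
corrected = correct_observations(observed, factors)
```

In the third month the subset rate is only a tenth of the long-term rate, so the factor becomes 10 and any small deviation in the observed rate is amplified tenfold. With a single year of data, such months dominate the reconstructed cycle.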
Whereas the first method was aimed at correcting the observations (using ERA5 as a "vehicle" to assess their representativity), with the second method we aim to correct the long-term ERA5 data based on prior evaluation of its performance during the short-term period for which we have observations. This can be readily understood from Fig. 6. We compare panels (b) and (e), and seek a fixed scaling factor that minimises the difference between each pair of monthly observed and simulated LLJ frequencies. Denoting the monthly mean low-level jet frequency in ERA5 and collocated observations with x and y, respectively, an optimised scaling factor can be found by solving for a in y = ax (using linear least squares regression). We do this for each platform individually and also for their combined signal.
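For a regression line without intercept, the least-squares solution of y = ax has the closed form a = Σ(x·y) / Σ(x²). The sketch below applies it to made-up pairs of monthly low-level jet frequencies; the function name `fit_scaling_factor` is hypothetical and the values are not taken from Fig. 7.

```python
def fit_scaling_factor(x, y):
    """Least-squares slope a of y = a * x (regression through the origin)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all ERA5 frequencies are zero; slope is undefined")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx

# Made-up monthly jet frequencies in %: ERA5 (x) and collocated observations (y)
x = [2.0, 4.0, 6.0, 8.0]
y = [1.2, 1.8, 3.3, 3.9]

a = fit_scaling_factor(x, y)

# Applying the factor to a long-term ERA5 cycle rescales its amplitude only
era5_long_term = [1.0, 5.0, 9.0, 4.0]
reconstructed = [a * v for v in era5_long_term]
```

Because months with a high ERA5 frequency carry more weight in Σ(x·y) and Σ(x²), the fitted slope is dominated by the months in which jets are frequent, which is one reason why short records with few jet events give unstable slopes.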
The results are illustrated in Fig. 7a. The lighter colours represent the individual platforms, while the black line and scatter points represent the combined monthly means. The overall fit, based on all available data, has a slope of 0.44, but there are substantial differences between the individual platforms, with slopes between 0.15 and 0.73 and relatively large scatter. The difference between platforms could be random, due to the limited availability of measurement data, or systematic, in which case different sites need different scaling parameters. If the difference is random, the global optimum indicated by the black line in Fig. 7a could do justice to all individual platforms because it incorporates a much larger body of measurement data than any single-site regression. Applying this factor of 0.44 to the full ERA5 data provides us with a smooth seasonal cycle with reduced amplitude (similar to the black dashed line in Fig. 6a, but now based on an optimised scaling factor). In other words, the seasonal cycle of low-level jets based on ERA5 data up to 500 m overestimates the observed cycle (based on measurements up to 300 m) by a factor of ∼ 2. However, as shown in Fig. 7b, there seems to be a spatial dependence in the scaling factors with larger slopes away from the coast, implying that the different sites need different scaling parameters. In order to cross-validate the single-platform regressions, we need to split the measurement data into training and test datasets, but this poses a challenge. As before, the data record at MMIJ is long enough to obtain a reasonable prediction of the test data, but some of the other data records are very short and splitting them would, for example, leave only 3 months of training data, which obviously leads to very poor statistics, especially since there are hardly any low-level jets in winter.
Without cross-validation, more data are available for regression, but this introduces the risk of overfitting and therefore quantitative evaluation will be biased. Qualitatively, the resulting seasonal cycles still appear erratic (Supplement 4/6).
Thus, despite similarities between the datasets, it is not straightforward to either correct the observations using ERA5 representativity factors or to correct the ERA5 data using a scaling factor derived from collocated observations. In this section, we used the seasonal cycle to obtain aggregated low-level jet characteristics (i.e. monthly means), but perhaps we can identify other characteristics that lead to better results.

Diurnal cycle and stability
After analysing the seasonal cycle of low-level jets in depth, we now briefly consider some other variables that describe relevant characteristics of the low-level jet climatology, starting with the diurnal cycle. Figure 8a-c are again similar to Fig. 6, now only including the ERA5 data up to 500 m. From the observations, it appears that the low-level jets occur throughout the day, but with a small dip around 11:00 UTC. Panels (b) and (c), based on short temporal subsets, are so erratic that it is difficult to distinguish this diurnal cycle by eye. After aggregating all platforms and smoothing the data (black dashed lines), we find that the observations and ERA5 agree on the general shape, but again we needed to scale the ERA5 signals because they differed in magnitude: the diurnal cycle in ERA5 is much more pronounced. At this point, we stress that several mechanisms can lead to low-level jets in coastal areas (see Sects. 1 and 9), and the resulting diurnal signature should not be confused with that of the typical nocturnal jet that is often found over land. As in the previous section, we performed linear regression to identify optimal scaling parameters for the dashed black lines in panels (a) and (b). The difference with the previous section is that the regression is now based on pairs of hourly instead of monthly observed and simulated low-level jet frequencies.
The scatter in this data is larger than for the seasonal cycle, but the spatial distribution of the fitting parameters is similar (not shown). The second row in Fig. 8 shows the relation between low-level jet occurrence and atmospheric stability (expressed by the bulk Richardson number based on the ERA5 surface data: 2 m temperature, skin temperature, and 10 m wind). Scatter points represent mean aggregated low-level jet frequencies over 50 stability bins. Both ERA5 and the observations agree that low-level jets are typically associated with stable stratification, although for some platforms in panels (d) and (e), there seems to be a substantial number of jets for unstable conditions as well. In the subsets (panel e) this distinctive behaviour is not as clear, and in the observations it seems mostly absent. Without going into detail, we note that low-level jets can be formed by different mechanisms, and it is possible that ERA5 represents one mechanism better than another or perhaps one mechanism is actually over-represented.
Also note that in panels (e) and (f) there are (positive) values of the Richardson number for which no low-level jets are observed. In panel (d), this is not the case, which indicates that the measurement periods are too short to adequately sample the full range of stability conditions. Finally, we note that in panel (d), the low-level jet rate seems to decrease again for very stable situations. This could be an artefact of the bulk Richardson number or a physical limit: a stable atmosphere leads to a low-level jet, but the low-level jet produces wind shear, and consequently, the bulk Richardson number decreases. The fact that this behaviour is not reflected in the observations suggests that the true stability (that would have been observed) was actually smaller than what ERA5 predicted. Again, we tried to scale the amplitude of the stability signature by performing linear regression between pairs of low-level jet frequencies in ERA5 and observations (now based on stability bins instead of monthly or hourly groupings). The slopes are larger than those based on the seasonal and diurnal cycle (∼ 1.0), but qualitatively they seem to be less robust (not shown).
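The aggregation over stability bins can be sketched as follows. The text does not give the exact formulation, so several choices here are assumptions: the bulk Richardson number is written in one common surface-layer form using a 10 m reference height, the 50 bins are taken as equal-count (quantile) bins, and the input data are synthetic.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m s-2)

def bulk_richardson(t2m, skt, wind10, z=10.0):
    """Assumed form: Ri_b = g * z * (T2m - Tskin) / (T2m * U10**2)."""
    t2m = np.asarray(t2m, dtype=float)
    skt = np.asarray(skt, dtype=float)
    wind10 = np.maximum(np.asarray(wind10, dtype=float), 0.1)  # avoid division by zero
    return G * z * (t2m - skt) / (t2m * wind10 ** 2)

def jet_rate_per_bin(stability, is_jet, n_bins=50):
    """Mean jet frequency in n_bins equal-count stability bins."""
    stability = np.asarray(stability, dtype=float)
    is_jet = np.asarray(is_jet, dtype=float)
    edges = np.quantile(stability, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, stability, side="right") - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    jets = np.bincount(idx, weights=is_jet, minlength=n_bins)
    rate = np.full(n_bins, np.nan)  # bins without samples stay NaN
    np.divide(jets, counts, out=rate, where=counts > 0)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, rate

# Synthetic example: jets are (artificially) defined to occur only for Ri_b > 0.
rng = np.random.default_rng(1)
t2m = 285.0 + rng.normal(0.0, 2.0, 5000)
skt = np.full(5000, 285.0)
wind10 = rng.uniform(2.0, 15.0, 5000)
ri = bulk_richardson(t2m, skt, wind10)
centres, rate = jet_rate_per_bin(ri, ri > 0.0)
```

Applying the same binning to ERA5 and to the observations gives the paired frequencies on which the regression mentioned above can be based.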

Weather types and the spatial distribution of low-level jets
We also investigated the relation between low-level jet frequency and typical circulation patterns. We used Lamb weather types (LWTs; Jones et al., 2013) to perform this analysis. To derive these weather types we used the ERA5 mean sea level pressure on a 5° latitude-longitude grid of 16 points as laid out in the appendix of Jones et al. (2013) but centred over the area of interest. The method distinguishes three main groups: those with a dominant cyclonic (anticlockwise, low-pressure area) circulation, those with a dominant anticyclonic (clockwise, high-pressure area) circulation, and those with a "pure directional" flow. These three groups are further subdivided based on the main direction of the flow over the North Sea (north, northeast, east, etc.). If there is no dominant direction, the LWT is "pure (anti)cyclonic". Pressure fields characterised by the absence of a dominant forcing are "undefined". In total, this yields 27 different circulation patterns. We computed average low-level jet rates for each group.
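The grouping into 27 types can be illustrated with the decision rules commonly used in objective (Jenkinson-Collison style) Lamb weather typing. The sketch below assumes that the westerly flow `w`, southerly flow `s`, and total shear vorticity `z` have already been derived from the 16 pressure points; the function name, the threshold of 6 units for "undefined", and the label format are assumptions for the example and are not taken from the text.

```python
import math

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

# All 27 labels: undefined, pure rotational, pure directional, and hybrid types.
ALL_TYPES = (["U", "A", "C"] + SECTORS
             + ["A" + d for d in SECTORS] + ["C" + d for d in SECTORS])

def lamb_weather_type(w, s, z, weak=6.0):
    """Classify one pressure field from flow (w, s) and vorticity z."""
    f = math.hypot(w, s)                      # resultant flow strength
    if f < weak and abs(z) < weak:
        return "U"                            # no dominant forcing
    rotation = "C" if z > 0 else "A"          # cyclonic or anticyclonic
    if abs(z) > 2 * f:
        return rotation                       # pure (anti)cyclonic
    # Direction the flow comes from, in degrees clockwise from north.
    direction = math.degrees(math.atan2(-w, -s)) % 360.0
    sector = SECTORS[int(((direction + 22.5) % 360.0) // 45.0)]
    if abs(z) < f:
        return sector                         # pure directional
    return rotation + sector                  # hybrid type

labels = [lamb_weather_type(10, 0, 0),        # strong westerly flow, no rotation
          lamb_weather_type(0, 0, 30),        # strong cyclonic rotation, no flow
          lamb_weather_type(1, 1, 1),         # weak flow and weak rotation
          lamb_weather_type(-10, 0, -15)]     # easterly flow, anticyclonic rotation
```

Averaging a boolean low-level jet flag over all time steps that share a label then gives the jet rate per weather type.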
To illustrate the association between the circulation type and the low-level jet occurrence, Fig. 9 shows the average low-level jet rate per weather type in the North Sea domain, based on 10 years of ERA5 data up to 500 m. The streamlines show the dominant flow pattern for each weather type: the columns represent different wind directions over the North Sea, while the full rows represent different rotation types. In the first full row, the rotation is predominantly clockwise; in the bottom full row, the rotation is mostly anticlockwise; and the middle full row is characterised by the absence of rotation. Notice how the same wind direction can be associated with different large-scale flows, and how this can impact the low-level jet rate. Like before, we will not go into each individual feature in this figure in detail, but we will focus on overall characteristics. In general, we see that low-level jets are concentrated along the coastlines. This extends and refines the global findings of Ranjha et al. (2013) and Lima et al. (2018) for the North Sea domain. Low-level jets are much more dominant for certain Lamb weather types. Most notably, the weather type "undefined" often gives rise to the formation of jets. This makes sense, as low-level jets are subtle phenomena, and the absence of a strong large-scale flow eases their development. Furthermore, we observe that low-level jets occur frequently during large-scale flows with a pronounced easterly component. Note that easterly flows bring in continental air, while westerly flows originate from the Atlantic. Low-level jets are uncommon for westerly flows. Closer inspection reveals that the differences in spatial distribution of the low-level jets (e.g. comparing the Dutch and Norwegian coastlines) seem to be related to whether the large-scale flow is directed offshore.
The British Isles are different in this respect, since for westerly flows we do not observe an increased low-level jet rate off the eastern coast of Great Britain.
Like with the previous characteristics, we performed linear regression between ERA5 and observed low-level jet frequency, this time aggregated over the various Lamb weather types. We found similar patterns in ERA5 and the observations (not shown), but the spatial distribution of the scaling parameters is different. Most slopes are around 0.4, but Lichteiland Goeree (LEG) stands out with a slope of 0.65. This is not a huge difference, but it implies that our earlier hypothesis (that the slope increases with distance to the coast) does not hold for all predictors. Indeed, one could argue that with Lamb weather types as a predictor, the scaling parameters are spatially more robust. Thus, while we believe that the spatial distribution in Fig. 9 is actually meaningful, the absolute low-level jet rate (as indicated by the colour bar) is still off by a factor of ∼ 2.

Combining multiple predictors to extend observations
So far, we have tried to scale the low-level jet climatology with simple linear factors applied to individual characteristics (e.g. seasonal cycle). Perhaps, we can find a more sophisticated transformation function by combining multiple predictors? In this section we use the MMIJ data to illustrate how this could be applied in practice. In contrast to the previous sections, which focused on aggregated low-level jet frequencies, here we consider individual wind profiles. The procedure resembles the Model Output Statistics (MOS) forecasts that are widely used for weather forecasts (e.g. Glahn and Lowry, 1972; Carter et al., 1989; Wilks, 2006, chap. 6.5.2) and is similar to the measure-correlate-predict methods mentioned in Sect. 2 (Carta et al., 2013). We use a machine learning package to perform this task, and for readability, we will not highlight all the technical details here. However, Jupyter notebooks are available as a Supplement to facilitate reproducibility. The general idea is illustrated in Fig. 10a: we have a short time series with observations and a long reanalysis dataset. Based on the overlapping part of the data, we determine the optimal parameters of a statistical model (depicted by the red box). We then use this model to predict the value of the observations, given the available long-term reanalysis data. In the illustration, it seems as though one reanalysis variable is used for this purpose, but in fact, we can use as many variables as we want. In our case, the variable we want to predict is the probability that a low-level jet will be observed, given various predictor variables from the ERA5 data. Because this is a binary outcome (a jet either occurs or not), our model of choice is a logistic regression model, which predicts the probability of a positive outcome as a function of one or several predictor variables. The general form of this model is

$$ P(y = 1 \mid x_1, \ldots, x_n) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{i=1}^{n} \beta_i x_i\right)\right]}, $$

where $\beta_i$ represents the coefficients of the corresponding predictor variables $x_i$, and $\beta_0$ is the intercept.
In a short exploratory phase, we experimented with various combinations of predictor variables. We found the best performance for a small set of predictor variables consisting of time of the year, atmospheric stability, and Lamb weather type. This makes sense, as together these variables encompass information about wind speed, direction, and history of the flow, as well as the probability of stable stratification and baroclinic conditions. Indeed, each of these variables alone already provided valuable information in the previous sections. For optimal performance, these variables were preprocessed as follows: to truthfully represent its cyclic nature, date was encoded by splitting the day of year into a sine and cosine contribution. The Lamb weather type is a categorical variable, and to make it suitable for regression it was encoded by converting it to the binary representation of the numbers up to 27 (the total number of weather types) and treating each digit as an individual binary variable. Stability was represented by the difference between the 2 m temperature and sea-surface temperature, which provided better results than the bulk Richardson number. We also experimented with various training algorithms to determine the coefficients $\beta_i$ of the logistic model (intermediate results can be found in the Supplement). In the end, we settled on a stochastic gradient descent algorithm.
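The preprocessing described above can be sketched with a few array operations. The function names, the zero-based numbering of the weather types, the use of five binary digits, and the period of 365.25 days are assumptions for this illustration; the notebooks in the Supplement remain the reference for what was actually done.

```python
import numpy as np

def encode_day_of_year(doy, period=365.25):
    """Cyclic encoding: one sine and one cosine column per time step."""
    angle = 2.0 * np.pi * np.asarray(doy, dtype=float) / period
    return np.column_stack([np.sin(angle), np.cos(angle)])

def encode_binary(category_index, n_bits=5):
    """Binary digits of an integer category (5 bits cover 27 weather types)."""
    idx = np.asarray(category_index, dtype=int)
    shifts = np.arange(n_bits)[::-1]          # most significant bit first
    return (idx[:, None] >> shifts) & 1

def build_predictors(doy, lwt_index, t2m, sst):
    """Stack date, weather type, and stability (T2m - SST) into one matrix."""
    stability = (np.asarray(t2m, dtype=float) - np.asarray(sst, dtype=float))[:, None]
    return np.hstack([encode_day_of_year(doy), encode_binary(lwt_index), stability])

# Three hypothetical time steps.
X = build_predictors(doy=[1, 120, 300], lwt_index=[5, 26, 0],
                     t2m=[280.0, 286.5, 283.0], sst=[281.0, 284.0, 283.0])
bits = encode_binary([5, 26]).tolist()
```

The binary encoding keeps the number of predictor columns small (5 instead of 27 for a one-hot encoding), at the cost of coefficients that are harder to interpret per weather type.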
First, we took only half of the MMIJ dataset (a bit more than 2 years) to train the model (in other words: we fitted the parameters of our logistic regression model to the first half of the data). The light blue line in Fig. 10b shows the seasonal cycle of low-level jets in those first 2 years of observations. Note that this seasonal cycle is very erratic. This can be expected for such a short period, but the question is whether the additional information contained in the predictor variables enables us to predict the other 2 years, despite the unrepresentative training data. Thus, in the next step, we used our trained model to predict the other half of the dataset. In fact, the model predicts the probability that a low-level jet occurs. An individual jet is predicted only if the probability is higher than 50 %, but this happens only occasionally. Therefore, rather than predicting individual jet events, we used the predicted probabilities directly and computed the monthly mean predicted probability (Fig. 10b, orange line). To evaluate the performance, we compared the predicted seasonal cycle with that based on the true observations during the second part of the dataset (Fig. 10b, light green line). The true seasonal cycle was indeed smoother than in the first 2 years, but it peaked a bit higher and earlier than predicted. To quantify this result, we computed the root mean square error between the monthly means of the predicted and test data and found it to be about 1 % point. This result confirms that the model generalises well to new input data.
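The train-and-test procedure above can be mimicked on synthetic data. Note what is swapped in: instead of the machine learning package and stochastic gradient descent used in the study, this sketch fits the logistic model with plain full-batch gradient descent in NumPy, uses artificial predictors and jet events, and approximates months by 12 equal parts of the year. All names and numbers are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Full-batch gradient descent on the mean log-loss."""
    beta0, beta = 0.0, np.zeros(X.shape[1])
    for _ in range(n_iter):
        residual = sigmoid(beta0 + X @ beta) - y
        beta0 -= lr * residual.mean()
        beta -= lr * (X.T @ residual) / len(y)
    return beta0, beta

def monthly_mean(values, month):
    return np.array([values[month == m].mean() for m in range(1, 13)])

# Four synthetic years of hourly data with a seasonal and a stability signal.
rng = np.random.default_rng(0)
n = 4 * 365 * 24
doy = (np.arange(n) // 24) % 365 + 1
month = (doy - 1) * 12 // 365 + 1             # approximate months, 1..12
angle = 2.0 * np.pi * doy / 365
stability = rng.normal(size=n)
X = np.column_stack([np.sin(angle), np.cos(angle), stability])
p_true = sigmoid(-3.0 + 1.0 * X[:, 0] + 0.8 * X[:, 2])
y = (rng.random(n) < p_true).astype(float)    # 1 = jet observed

# Train on the first half, predict probabilities for the second half.
half = n // 2
beta0, beta = fit_logistic(X[:half], y[:half])
p_pred = sigmoid(beta0 + X[half:] @ beta)

# Compare monthly mean predicted probability with the observed jet rate.
pred_cycle = monthly_mean(p_pred, month[half:])
obs_cycle = monthly_mean(y[half:], month[half:])
rmse = np.sqrt(np.mean((pred_cycle - obs_cycle) ** 2))
```

Averaging predicted probabilities per month, rather than thresholding them at 50 %, mirrors the choice made in the text: individual events are rarely predicted with high probability, but the aggregated rate is still informative.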
We then used the full MMIJ dataset to train the same model. With twice as much training data as before, we were confident that the model would achieve at least a similar performance and thus predict the seasonal cycle to within 1 % point RMSE (but probably better). The observed seasonal cycle averaged over these 4 years of training data (Fig. 10b, red line) was still clearly affected by the unrepresentative months in the first half of the dataset. Apparently, 4 years of data is still not enough for the climatology to converge. Therefore, in the final step, we used the trained model to predict the 10-year seasonal cycle. The result (Fig. 10b, purple line) is a smooth seasonal cycle which peaks in May at about 9 %. This is our best estimate of the low-level jet seasonal cycle, based on the coalescence of reliable measurements and extensive reanalysis data. Compared to the results presented in Sect. 6, we can conclude that we have adjusted the erratic nature of the short-term observations (Fig. 6e), resulting in a seasonal cycle similar to that shown in Fig. 6a, but with reduced amplitude. Compared to this final result, the crude amplitude adjustment with which we started in Sect. 6 now appears far too strong.
Figure 9. Spatial distribution of the low-level jet rate in ERA5 data. As explained in the text, the values shown here overestimate available observations and should be interpreted with caution. A: anticyclonic; C: cyclonic; U: undefined; N, NE, E, etc., are eight wind direction sectors; combinations of a direction and a rotation type are "hybrid" weather types, while weather types without a dominant rotational component are "pure directional". Streamlines illustrate the dominant large-scale flow pattern (averaged over each LWT). The relative occurrence of each weather type is indicated as well. The figure can be enlarged for more detail or downloaded separately from the article web page.
The results presented in this section are intended as proof of principle, and for the purpose of illustration we tried to keep things conceptually simple. With respect to the selection of predictor variables, choice of model, and method of cross-validation, we realise that the possibilities are endless. The availability of sufficient measurement data is key to an exhaustive follow-up study.

Discussion
This paper has demonstrated our efforts to infer reliable low-level jet characteristics by combining observations and reanalysis data. We have deliberately chosen to illustrate how the results are impacted by limitations of the data and choices in the analysis. In this section we summarise our work, discuss the implications and offer an outlook for future research directions.
We started with a general validation of the ERA5 data for the observed wind speed at measurement locations in the North Sea. We found that the overall root mean square error is between 1.25 and 1.5 m s−1. The bias shows a clear discontinuity at 10:00 UTC, which is related to the data-assimilation strategy that was used to produce ERA5. Users of the ERA5 data should consider a suitable bias correction (e.g. Staffell and Pfenninger, 2016), but we strongly suggest that future reanalysis products use sliding or at least partly overlapping observation windows. We also demonstrated that the observations alone cannot be relied upon either, because the limited temporal extent of the measurement data leads to biased climatologies. Thus, in the remainder of the paper we focused on finding a suitable way of combining the two datasets: a procedure similar to measure-correlate-predict methods but tailored to low-level jets instead.
Low-level jet detection is very sensitive to the vertical extent of the data, and this has important implications for the interpretation of all results. Typical jet characteristics like jet height and jet strength cannot be reliably inferred from range-limited observations. With this restriction in mind, we can say that many of the observed jets occurred at heights fully or partly in the range spanned by contemporary wind turbine blades. Moreover, the typical observed jet strength is about 8 m s−1, which is in the cubic part of the power curves of these turbines. We therefore expect that the low-level jet impact on loads and power can be substantial. ERA5 is not able to reliably reproduce these characteristics. There are some indications that the jets are "smeared out": they appear higher and weaker than observed. Given this vertical displacement, a fair comparison between ERA5 and the observations is difficult. Considering the lower 300 m only, ERA5 drastically underestimates the amount of jets, but including heights up to 500 m, ERA5 shows more low-level jets than observed. We decided to include the data up to 500 m because they give a stronger climatological signature.
Even though 1:1 correspondence between ERA5 and the observations is poor, both datasets agree on the following climatological characteristics: most jets occur in spring and summer; the diurnal cycle is weak and only around noon are the chances for low-level jets slightly lower; low-level jets are typically associated with stably stratified conditions; the absence of strong large-scale forcing or flow regimes with a pronounced easterly or offshore component are favourable for their formation. From the ERA5 data, we learned that low-level jets are concentrated along the coasts. We then compared the frequency of low-level jets between ERA5 and the observations. In the most general terms, we can state that the mean low-level jet rates based on ERA5 up to 500 m typically overestimate the amount of low-level jets that would have been observed with lidars up to 300 m by a factor of about 2. To improve upon this result we illustrated how a logistic regression model was able to predict the seasonal cycle of low-level jets at MMIJ to within 1 % point RMSE. This is a promising result, and we expect that our results can still be improved upon. Longer measurement datasets would form a major contribution to further advancement as well.
The characteristics identified in this paper provide some clues as to the processes that govern these jets. The academic literature recognises two dominant formation mechanisms, both of which are supported by our results. The first is frictional decoupling (Blackadar, 1957; Van de Wiel et al., 2010). This theory describes a perturbed system attempting to re-establish equilibrium. As the accelerating wind field in the lower atmosphere is deflected by the Coriolis effect, it moves around its new equilibrium in a circular fashion. Over land, frictional decoupling has been linked to the decay of turbulent mixing around sunset, and it has been suggested that a similar situation applies in coastal areas upon the abrupt surface (temperature and roughness) transition (Smedman et al., 1993). This mechanism is supported by our results, which show that low-level jets are frequent for winds directed offshore and in stable conditions. The second mechanism relates low-level jets to horizontal temperature gradients (baroclinity; see Holton, 1967). According to this theory, the tilt of isobaric surfaces leads to a thermal wind component that under certain conditions can manifest itself as a low-level jet. This mechanism has been coupled to low-level jets over gently sloping terrain but equally applies to coastal areas where large horizontal temperature differences can occur due to differential heating between the land and sea surface (Mahrt et al., 2014). The fact that most low-level jets occur in spring and summer supports a baroclinic contribution and possibly an interplay with the evolution of sea breezes, which show a similar seasonal cycle (e.g. Steele et al., 2015). In the end, we expect that both processes are likely to contribute to the low-level jet climatology. Finally, we note that we also spotted a low-level jet with a clear frontal structure in the ERA5 data.
It is unlikely that such events contribute significantly to the low-level jet climatology, but the characteristics of such jets may be very different and potentially much more harmful for (offshore) wind turbines. Other causes have been described in the literature, such as orographic blocking. We do not expect this to play a major role along the Dutch coast, but for some of the low-level jets that are present in ERA5 along the British and especially the Norwegian coast it may play an important role (Christakos et al., 2014). A more detailed investigation of the ERA5 data may allow us to separate these mechanisms. This is an interesting direction for further research.
With respect to future work, it would also be interesting to look at other datasets. In this paper, we have used ERA5 data to analyse the spatial characteristics of low-level jets directly. However, ERA5 is currently being used to develop higher-resolution, down-scaled reanalysis datasets (e.g. the New European Wind Atlas (Petersen et al., 2013) and the Dutch Offshore Wind Atlas), and it would be worthwhile to see if they improve upon ERA5. Another interesting alternative is COSMO-REA6 (Bollmeyer et al., 2015), which is down-scaled from ERA-interim, but with its resolution of 6 km it might outperform ERA5. The current paper can serve as a guideline for the investigation of other reanalysis datasets.
Finally, a note on dealing with low-level jets in practice. It would be worthwhile to include a low-level jet case as standard inflow field for wake and load simulations. Recent papers have developed affordable methods to provide realistic inflow fields (Gebraad et al., 2014;Englberger and Dörnbrack, 2018). Expensive computational fluid dynamics (CFD) simulations have been used to derive parameterisations to generate realistic inflow fields for wind farm simulations. The second cited paper also includes low-level jet profiles in the early morning. These profiles can be compared with the morphology and frequency distributions detailed in the current paper to optimise yield and lifetime. Since the presence of the coastline turns out to have an important effect on the formation of low-level jets, it would be interesting to perform an additional precursor large-eddy simulation (LES) for such a heterogeneous terrain. This could also shed light on the mechanisms involved in jet formation.

Appendix A: Lidar data
Vertically pointing lidar provides efficient and non-intrusive measurement of boundary-layer winds. Compared to traditional meteorological masts, lidars typically expand the height and vertical sampling frequency of offshore wind measurements. Lidar data from seven measurement sites were used in this study to analyse North Sea LLJ spatiotemporal behaviour. Lidar types used included the WINDCUBE v2 pulsed lidar (only at LEG) and the ZephIR 300s continuous-wave (CW) lidar (all other platforms). The lidars were typically platform mounted, except within the Borssele wind farm and Hollandse Kust wind zones (Noord and Zuid) where the lidar was mounted atop a floating met ocean buoy. At these locations, two lidar-equipped met ocean buoys were positioned simultaneously.
CW and pulsed wind lidar are coherent systems, meaning they both analyse Doppler shift frequencies to determine an estimate of the radial wind speed (Peña and Hasager, 2015). However, radial velocity and vertical wind profile extraction techniques differ between the two lidar types. Whereas pulsed wind lidars use range gates to near-simultaneously extract radial velocity estimates at multiple points in space, CW wind lidar can only extract a radial velocity estimate at the beam focus length. This beam focus length must be modified in time in order to measure the wind field at varying elevation levels. The radial wind speed is defined as the motion of the wind towards or away from the remote-sensing system, and therefore unless the wind is moving along one of these radials, then the wind speed will not be fully resolved. Consequently, CW and pulsed wind lidar use varying adaptations of conical scanning techniques (Banakh et al., 1995) to resolve the horizontal wind field at varying elevation levels. For brevity, these differences are not detailed here. However, because of these differences, the vertical wind profile was resolved at 17 s intervals for the CW wind lidar and at 4 s intervals for the pulsed wind lidar. These wind profiles are then analysed by the lidar software and output as a 10 min average vertical wind profile. A summary of the lidar measurement heights and data collection periods for all sites is provided in Fig. 2.
Data quality control is imperative to ensure an accurate depiction of the offshore LLJ. The implementation of data quality control varied depending upon the lidar type (i.e. ZephIR 300s versus WINDCUBE v2), although considerations were made to ensure that data quality control was employed relatively uniformly between measurement sites. Wind lidar data from both the Borssele wind farm and Hollandse Kust (Noord and Zuid) wind zones have additionally had quality control measures implemented by Fugro Oceanor. An overview of these quality control procedures can be found online (https://offshorewind.rvo.nl/data-borssele, last access: 22 March 2019). The data quality control procedures implemented are as follows. First, plausible value checks were implemented in the wind data. Any 10 min observation that met the following criteria was removed from the data record.
1. The mean wind speed was either greater than the period maximum wind speed or less than the period minimum wind speed.
2. The mean wind speed was less than 0.05 m s−1.
4. At the measurement height, the value of TI was 10 standard deviations (σ_TI) greater than the mean (µ_TI) TI value (i.e. TI ≥ µ_TI + 10σ_TI); µ_TI and σ_TI were defined as the height-respective value for the entire data collection period. Because TI typically decreases with mean wind speed, this threshold was only imposed if the 10 min mean wind speed exceeded 4 m s−1.
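The plausible value checks listed above can be expressed as a boolean mask over the 10 min records of a single measurement height. This is a hedged sketch: the function and argument names are hypothetical, only the criteria that appear in the list are implemented, and the example data are invented.

```python
import numpy as np

def plausible_value_mask(mean_ws, min_ws, max_ws, ti, n_sigma=10.0):
    """Return True for 10 min records that pass the listed checks (one height)."""
    mean_ws, min_ws, max_ws, ti = (
        np.asarray(a, dtype=float) for a in (mean_ws, min_ws, max_ws, ti))
    # Criterion 1: mean outside the period minimum/maximum.
    out_of_range = (mean_ws > max_ws) | (mean_ws < min_ws)
    # Criterion 2: mean wind speed below 0.05 m/s.
    too_calm = mean_ws < 0.05
    # Criterion 4: TI at least 10 standard deviations above its mean,
    # both taken over the entire record, applied only above 4 m/s.
    ti_limit = np.nanmean(ti) + n_sigma * np.nanstd(ti)
    ti_outlier = (ti >= ti_limit) & (mean_ws > 4.0)
    return ~(out_of_range | too_calm | ti_outlier)

# Invented record of 200 samples with three deliberately bad entries.
n = 200
mean_ws = np.full(n, 8.0)
min_ws = np.full(n, 6.0)
max_ws = np.full(n, 10.0)
ti = np.full(n, 0.1)
ti[0] = 5.0                          # extreme TI at 8 m/s
mean_ws[1], min_ws[1] = 0.01, 0.0    # nearly calm
mean_ws[2] = 12.0                    # mean above the period maximum

keep = plausible_value_mask(mean_ws, min_ws, max_ws, ti)
```

In practice the mask would be computed per measurement height and the flagged records set to missing before any jet detection.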
Specific quality control measures were also applied to the lidar wind data. Any 10 min observation that satisfied the following criteria was removed from the data record.
2. The carrier-to-noise ratio (CNR) was less than −22 (the value of CNR provides a measure of signal strength, i.e. quality). CNR was only outputted by the WINDCUBE v2 wind lidar.
3. Backscatter magnitude was less than 10^−5 or greater than 100; backscatter served as a proxy for CNR for data reported by the ZephIR 300s lidar.
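The two signal-quality criteria listed above translate into simple threshold masks. The function names are hypothetical, and treating missing values (NaN) as failing the check is an assumption of this sketch rather than something stated in the text.

```python
import numpy as np

def cnr_mask(cnr, cnr_min=-22.0):
    """True where the carrier-to-noise ratio is acceptable (pulsed lidar)."""
    return np.asarray(cnr, dtype=float) >= cnr_min      # NaN compares as False

def backscatter_mask(backscatter, bs_min=1e-5, bs_max=100.0):
    """True where backscatter lies within the accepted range (CW lidar)."""
    bs = np.asarray(backscatter, dtype=float)
    return (bs >= bs_min) & (bs <= bs_max)              # NaN compares as False

cnr_ok = cnr_mask([-25.0, -22.0, -10.0]).tolist()
bs_ok = backscatter_mask([1e-6, 1e-5, 1.0, 100.0, 101.0]).tolist()
```

Each mask applies to one lidar type only, since CNR was reported by the WINDCUBE v2 and backscatter by the ZephIR 300s.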
Prior analyses (e.g. Poveda and Wouters, 2015) demonstrate that the ZephIR 300s lidar can incorrectly measure wind direction by 180°. Analyses of wind data at MMIJ from 1 January 2012 through 1 January 2014 indicated that approximately 3.6 % of the measured wind data exhibited this flow reversal. Although mitigation (i.e. removal) of this data is possible, it requires independent wind direction measurements from a collocated meteorological mast. Because mast data were not available at each site, these wind direction errors were not removed. However, ZephIR 300s lidar wind direction errors did not appear to impact the measured wind speed, which is the main focus of this paper. In order to account for the wake effect of neighbouring wind farms on wind speed measurements, wind direction sectors were filtered and corresponding data (wind speed and direction) were removed. A generous estimate of 20 km was used to denote the maximum wind farm wake length.