Understanding the impact of data gaps on long-term offshore wind resource estimates

Jonietz Alvarez, Martin Georg; Watson, Warren; Gottschall, Julia

doi:https://doi.org/10.5194/wes-9-2217-2024

Articles | Volume 9, issue 11


Research article

27 Nov 2024


Abstract

In the context of a wind farm project, the wind resource is assessed to predict the power output and the optimal positioning of wind turbines. This requires taking wind measurements on the site of interest and extrapolating these to the long term using so-called “measure, correlate, and predict” (MCP) methods. Sensor, power supply, and software failures are common phenomena. These disruptions cause gaps in the measured data, which can be especially long in offshore measurement campaigns because harsh weather conditions cause system failures and prevent servicing and redeployment. The present study investigates the effect of measurement data gaps on long-term offshore wind estimates by analyzing the bias they introduce in the parameters commonly used for wind resource assessment. Furthermore, it aims to show how filling the gaps can mitigate their effect. To achieve this, we perform investigations for three offshore sites in Europe with 2 years of concurrent measurements. We use reanalysis data and various MCP methods to fill gaps in the measured data and extrapolate these data to the long term. Current standards demand high data availability (80 % or 90 %) for wind measurement campaigns, so we expect that the effect of missing data on the uncertainty in long-term extrapolations is of the same order of magnitude as other uncertainty components such as the measurement uncertainty or the inter-annual variability. Nevertheless, our results show that the effects of gaps are considerably smaller than the other uncertainty components. For instance, gaps of 180 d cause an average deviation of the long-term mean wind speed of less than 0.04 m s⁻¹ and a 95th percentile deviation of less than 0.075 m s⁻¹ for all tested sites. Due to the low impact of gaps, gap filling does not have the potential to significantly reduce the uncertainty in the long-term extrapolation.


Received: 26 Sep 2023 – Discussion started: 06 Dec 2023 – Revised: 01 Jul 2024 – Accepted: 29 Sep 2024 – Published: 27 Nov 2024

1 Introduction

Reliably predicting wind speed and wind direction is necessary to analyze a potential wind farm site and to lower the economic investment risks associated with the project. These forecasts are based on data that are collected on the pre-selected site. The wind resource assessment accuracy increases with the amount of on-site data available. Nevertheless, the cost of the measurement campaign increases with its duration. Therefore, wind farm operators resort to measurement campaigns of 1 to 2 years to save time and costs. These measurements are then extrapolated to the whole expected lifetime of the wind farm, which is usually between 20 and 30 years (Rohrig et al., 2017).

The long-term extrapolation (LTE) is done by determining a correlation function that describes the relationship between the measurement data set and an available reference data set, which gives a long-term record of the meteorological conditions from a nearby site. This correlation is established over the training period, in which both data sets are available. In the second step, the correlation function is applied to the reference data in the target period (the period in which there is no measurement available). The methods that follow this principle are known as “measure, correlate, and predict” (MCP) methods (MEASNET, 2016).

For industry applications, the most commonly used MCP method is based on linear regression through concurrent measurement and reference data points (Carta et al., 2013). This procedure can be extended by doing a different regression for each wind direction sector or each wind speed range. Furthermore, MCP methods using multiple regression functions (Beltran et al., 2010), probabilistic distributions (Borujeni et al., 2021), and reference data sources (Carta et al., 2013) have been proposed. Hanslian (2014) classifies the MCP methods into two types: type 1 methods, which are based on time series corrections such as linear regressions and excel in estimating time series, and type 2 methods, which take a probabilistic approach so that they suit the prediction of wind speed distributions and average values. The studies of Schwegmann et al. (2023) and Borujeni et al. (2021) show that various machine learning algorithms can be used as MCP methods as well. Among the algorithms tested by Schwegmann et al. (2023), the K-nearest-neighbor (KNN) regression method performs best in doing 1 d wind speed predictions and is recommended for further applications.

A significant proportion of the wind data collected in measurement campaigns is erroneous or unavailable due to measurement equipment failures or external disturbances. The data gaps generated by these events can have lengths of several months, often depending on how quickly the site can be reached. An overview of offshore met mast measurements by Meyer and Gottschall (2022) shows that gaps of 2 or 3 months are a common phenomenon and that longer gaps are possible as well. As the missing data increase the long-term extrapolation uncertainty, there are multiple guidelines on minimum wind data availability for wind measurement campaigns. For wind resource assessment, availability must surpass the 80 % (FGW, 2020) or even the 90 % mark (MEASNET, 2016). Other purposes, such as verification of floating lidar measurement systems, require availabilities surpassing 95 % (OWA, 2018). Common solutions for preventing or compensating for data losses are robust measurement setups, device monitoring, redundancies, and measurement campaign extensions. These measures are either costly or decrease the suitability of the data set for wind resource assessment. For instance, extending the campaign leads to inhomogeneous data coverage across the seasons, leading to a biased wind resource assessment.

Gaps can be filled to increase the measurement availability and reduce errors in wind resource assessments. A simple approach for filling single-value gaps by interpolating through adjacent time stamps is studied by Pappas et al. (2014). Another option for filling gaps in wind speed and direction measurements is to extrapolate data from other heights through physical wind profile modeling (Landberg, 2015) or by using machine-learning-based methods (Rouholahnejad et al., 2023). If the gaps are longer than a single value and no measurements from other heights are available, using an MCP method is also an option for filling the data gaps (MEASNET, 2016). Nevertheless, Gottschall and Dörenkämper (2021) show that filling the gaps does not always mitigate their effect on an LTE, as this effect is already very small, with a 30 d gap causing an error of approximately 0.01 m s⁻¹ on average for the long-term mean wind speed.

The present study builds on the results of Gottschall and Dörenkämper (2021) with respect to the effect of gaps on long-term extrapolations, extending the investigated cases to measured data with longer and multiple gaps. Furthermore, the impact of the choice of the gap-filling MCP method on the LTE is investigated by analyzing the mitigation of the effect of gaps for different gap-filling methods (including the method used by Gottschall and Dörenkämper, 2021). For the present investigation, we use the same data as Gottschall and Dörenkämper (2021) to ensure the comparability of results. The focus of the present study lies only on wind speed and wind direction data, although results may be valid for other atmospheric parameters as well.

The present work is structured into six sections, including this introduction. Section 2 describes the met mast data from different sites taken as measurement data and the ERA5 reanalysis data for the locations of the met masts taken as reference data for the MCP methods. Section 3 includes a description of the artificial gap introduction into the measurement data, an overview of the MCP methods used, and a description of the method used to measure the effects of gaps and gap filling on long-term extrapolations. Section 4 presents the results of the main research questions addressed:

How do the three implemented MCP methods (linear regression, sector average deviation addition, and KNN regression) compare to each other when filling artificial data gaps?
Is there a correlation between the length of a measurement data gap and its effect on the long-term extrapolation? How does this correlation change for multiple gaps instead of one?
Under which circumstances does gap filling mitigate the effect of gaps on long-term extrapolations?

The results of the research questions are discussed in Sect. 5. Finally, in Sect. 6, the conclusions of the present study are summarized.

2 Data basis

Offshore met mast measurements at three sites and numerical data obtained for the closest available location to each of the met masts are the data sources used for the investigation of the present study. These sites represent various offshore conditions in the North and Baltic seas. We selected the same data sets as Gottschall and Dörenkämper (2021), as the present investigation builds on their studies.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f01

Figure 1. Position of the sites FINO2, FINO3, and IJmuiden, including wind roses for the pre-processed met mast data for the period from 1 July 2012 to 30 June 2014. Adapted from Meyer and Gottschall (2022). Made with the Natural Earth package of Cartopy (Met Office, 2010–2015).

2.1 Sites and met mast measurement data

In the present study, met mast measurements are used as training data for the MCP methods. The data sets from three met masts, IJmuiden, FINO2, and FINO3, are pre-processed and used. As the sites differ in their distance to the coast and in atmospheric stability, a range of surrounding conditions is covered. These three measurement data sets are publicly available for research purposes. Their positions can be seen in Fig. 1.

The measurements from the met masts are all taken with cup or sonic anemometers and wind vanes. The 10 min averaged values of horizontal wind speed and wind direction are used. The values from the sensors closest to 90 m above mean sea level are taken since this is a common hub height of offshore wind turbines. A simultaneous measurement period of 24 months is considered in the present work: from 1 July 2012 to 30 June 2014. We select this period because of the high data availability of the three sites and because no wind farms were erected or decommissioned nearby during that time (Gottschall and Dörenkämper, 2021).

The data sets of the three masts contain gaps. In the pre-processing step, these gaps are filled before the application of the methodology described in Sect. 3. To complete the wind direction time series, measurements from sensors at lower heights are used. The missing wind speed values are taken from lower heights as well, in this case multiplying by a factor to account for the wind profile as described by Gottschall and Dörenkämper (2021). To reach a 100 % measurement data availability in all sites, the remaining gaps are filled using the KNN MCP method. For this, we set the K parameter to 200 points for wind speed and 700 points for wind direction. A detailed description of the KNN method used can be found in Sect. 3.1.3. The KNN filling is the only difference between the input data used in the present study and the data used by Gottschall and Dörenkämper (2021). The following specifications apply to the met masts and the sites:

The IJmuiden met mast is located in the North Sea approximately 75 km west of the coast of IJmuiden (coordinates: 52°51′00″ N, 3°26′24″ E). It provides measurements at several heights and is described in more detail by Poveda et al. (2015). For the analysis in the present study, the wind speed measurement at 92 m (cup anemometer) and wind direction measurement at 87 m are used. For the measurement period and heights used, the mean wind speed at this site is 9.88 m s⁻¹, and the mean wind direction is 233.4°.
The FINO2 met mast is situated in the central southern Baltic Sea (coordinates: 55°00′25.2″ N, 13°09′14.4″ E) and is thus affected by distances to land of less than 50 km in most directions. FINO2 provides wind measurements at various heights between 32 and 102 m above sea level, technically described in Leiding et al. (2012). The wind speed and direction measurements from the sensors mounted at a 92 m altitude (cup anemometer and vane, respectively) are considered in the present investigation. For the measurement period and heights used, the mean wind speed at this site is 9.59 m s⁻¹, and the mean wind direction is 228.3°.
FINO3 is a met mast located in the North Sea, 80 km west of Sylt (coordinates: 55°12′00.0″ N, 7°09′36.0″ E). No land influences the main wind direction sectors from the south to northwest. Leiding et al. (2012) offer detailed descriptions and data analyses of the FINO measurements, as well as technical information about this met mast. The measurements of wind speed and direction from the sensors at the heights of 92 and 101 m, respectively, are used in the present work. For the measurement period and heights used, the mean wind speed at this site is 9.60 m s⁻¹, and the mean wind direction is 243.6°.

As the reanalysis data used as a reference for the MCP methods are only available with hourly resolution, we use only the 10 min met mast measurement values that are time-stamped at whole hours.

2.2 Reanalysis reference data

In contrast to the generally short-term and expensive met mast measurements, reanalysis data are available globally, for periods reaching back to the year 1950, and without gaps. However, reanalyses are currently not capable of capturing weather conditions at a specific location as accurately as a met mast or a lidar due to their limited spatial and temporal resolution. Therefore, reanalysis data alone are not considered sufficient for wind site assessment. They are, however, still commonly used in the industry as reference data for MCP methods (Gottschall and Dörenkämper, 2021). The fifth major global reanalysis (ERA5) is the most recent generation of reanalysis data issued by the European Centre for Medium-Range Weather Forecasts (ECMWF) (Hersbach et al., 2020). The time resolution for the atmospheric parameters is 1 h, and the spatial resolution is 0.25° in latitude and 0.25° in longitude. For each met mast, we take the ERA5 data set that results from the spatial bi-linear interpolation between the four grid points that are closest to the met mast location. The period between the years 1994 and 2014 is used, as this is the target period of the long-term extrapolations done in the present work.
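The bi-linear interpolation between the four closest grid points can be illustrated with a minimal sketch. The function name, the grid-cell corners, and the wind speed values below are hypothetical; the sketch treats the interpolated quantity as a plain scalar, which is an assumption and not necessarily how the ERA5 fields were processed in the study.

```python
def bilinear(lon, lat, lon0, lon1, lat0, lat1, v00, v10, v01, v11):
    """Bilinearly interpolate a scalar at (lon, lat) inside one grid cell.

    v00 is the value at (lon0, lat0), v10 at (lon1, lat0),
    v01 at (lon0, lat1), and v11 at (lon1, lat1).
    """
    tx = (lon - lon0) / (lon1 - lon0)   # fractional position west -> east
    ty = (lat - lat0) / (lat1 - lat0)   # fractional position south -> north
    south = (1 - tx) * v00 + tx * v10   # interpolate along the southern edge
    north = (1 - tx) * v01 + tx * v11   # interpolate along the northern edge
    return (1 - ty) * south + ty * north

# Hypothetical 0.25° x 0.25° cell with made-up wind speeds in m/s.
cell = dict(lon0=7.0, lon1=7.25, lat0=55.0, lat1=55.25,
            v00=9.0, v10=9.4, v01=9.2, v11=9.8)
at_corner = bilinear(7.0, 55.0, **cell)
at_center = bilinear(7.125, 55.125, **cell)
```

At a grid point the function returns the grid value itself, and at the cell center it returns the average of the four corner values.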

We are aware of the existence of other sources with higher spatial and temporal resolutions that can be taken as references for the MCP methods as well. Some examples of these are the data from the New European Wind Atlas (NEWA) (Dörenkämper et al., 2020) and the mesoscale modeling data optimized with the Weather Research and Forecasting (WRF) model (Gottschall and Dörenkämper, 2021). Nevertheless, both of these sources have a worse correlation with the met mast measurements used in the present work than ERA5 data, which makes them less suitable as references for the MCP methods (Meyer and Gottschall, 2022). Therefore, in the present study, ERA5 is used as reference data for the MCP methods for gap filling and long-term extrapolating.

The ERA5 reference data sets show a strong linear correlation with the respective met mast measurements. The coefficients of determination (R²) between ERA5 and met mast wind speed data are 0.89 for the sites of FINO2 and FINO3 and 0.93 for the site of IJmuiden.

3 Applied methodology

The methods used to introduce artificial gaps into the met mast measurements; fill these gaps; and extrapolate the original, gapped, and filled data sets to the long term are described in the following subsections. Furthermore, the methods used to evaluate the performance of MCP methods and the effect of gaps on long-term extrapolations are presented.

3.1 MCP methods

In the present study, met-mast-measured data sets are extended in time, either to fill data gaps or to extrapolate them to the long term. For these applications, we use the following MCP methods:

sector-wise linear interpolation (SLI) as described in Sect. 3.1.1
sector average deviation (SAD) as described in Sect. 3.1.2
K-nearest-neighbor (KNN) regression as described in Sect. 3.1.3.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f02

Figure 2. Workflow of a time series correction MCP method used to obtain a corrected reference time series. Reference data are marked in gray, measured data in blue, and corrected reference data in orange.


The three implemented MCP methods correct the reference time series when applied. MCP methods that follow this principle are classified as type 1 methods by Hanslian (2014). According to Hanslian (2014), type 1 methods are the most suitable for predicting time series, while type 2 MCP methods are the most suitable for predicting distributions. Type 1 methods include the creation of a correlation function between the concurrently measured data and reference data and the application of the correlation function to correct the reference time series of the target period. The KNN method has type 2 features, as it classifies the wind speed data before applying the correction. Figure 2 shows a flowchart of the time series correction MCP methods.

3.1.1 Sector-wise linear interpolation (SLI) MCP method

The most commonly used MCP method for extrapolating wind speed measurements to the long term is the simple linear interpolation method (Carta et al., 2013). Given its widespread use and the high correlation between measurement and reference data in all investigated sites, we consider this method in the present work.

The correction is done separately for each 30° wind direction sector to account for inhomogeneous surrounding conditions. As recommended by Carta et al. (2013), we add a Gaussian noise term, although the option without a noise term is investigated in Sect. 4.1. In the following, this MCP method is abbreviated as SLI (sector-wise linear interpolation).
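A minimal sketch of such a sector-wise correction is given below, assuming a least-squares line per 30° sector of the reference wind direction. All function names are hypothetical. Using the residual standard deviation of each sector as the width of the Gaussian noise term, and clipping negative speeds to zero, are assumptions that the text does not specify.

```python
import random
from collections import defaultdict

SECTOR_WIDTH = 30.0  # degrees, as in the text

def sector_of(direction_deg):
    """Index (0..11) of the 30° sector containing the direction."""
    return int((direction_deg % 360.0) // SECTOR_WIDTH)

def fit_sector_lines(ref_speed, ref_dir, meas_speed):
    """Fit meas = a * ref + b per sector; returns {sector: (a, b, resid_sd)}."""
    groups = defaultdict(list)
    for x, d, y in zip(ref_speed, ref_dir, meas_speed):
        groups[sector_of(d)].append((x, y))
    fits = {}
    for sector, pairs in groups.items():
        n = len(pairs)
        mx = sum(x for x, _ in pairs) / n
        my = sum(y for _, y in pairs) / n
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        a = sxy / sxx if sxx > 0 else 1.0  # fallback slope if ref is constant
        b = my - a * mx
        resid = [y - (a * x + b) for x, y in pairs]
        fits[sector] = (a, b, (sum(r * r for r in resid) / n) ** 0.5)
    return fits

def correct_speed(ref_speed, ref_dir, fits, noise=True, rng=None):
    """Apply the sector lines to reference data from the target period."""
    rng = rng or random.Random(0)
    out = []
    for x, d in zip(ref_speed, ref_dir):
        a, b, sd = fits[sector_of(d)]  # raises KeyError for untrained sectors
        y = a * x + b
        if noise:
            y += rng.gauss(0.0, sd)    # assumed noise width: residual SD
        out.append(max(y, 0.0))        # assumed clipping of negative speeds
    return out

# Made-up training pairs in two sectors, then a noise-free correction.
fits = fit_sector_lines([4.0, 6.0, 8.0, 5.0, 10.0, 15.0],
                        [10.0, 10.0, 10.0, 250.0, 250.0, 250.0],
                        [5.0, 7.0, 9.0, 5.5, 11.0, 16.5])
predicted = correct_speed([10.0, 20.0], [15.0, 255.0], fits, noise=False)
```

With `noise=False` the output lies exactly on the fitted lines, which corresponds to the option without a noise term mentioned above.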

3.1.2 Sector average deviation (SAD) MCP method

Generally, the long-term wind direction is assumed to be the same as the wind direction recorded over a year-long period at the same site (Carta et al., 2013). However, several studies reviewed by Carta et al. (2013) propose MCP methods for estimating the long-term wind direction. One of the most common wind direction MCP correction methods is the sector-wise deviation correction. It is used, for instance, in the studies of Gottschall and Dörenkämper (2021) and the SpeedSort method presented by King and Hurley (2005).

In the sector-wise deviation MCP method, the average wind direction deviation between the concurrently measured data and reference data is calculated for each wind sector. These deviations are added to the reference data from the target period to obtain the MCP-corrected time series. In the present work, a sector size of 30° is chosen. We do not consider data pairs for which the reference wind speed value is less than 3 m s⁻¹, as we estimate the wind direction measurements to be inaccurate for such low wind speeds. A Gaussian noise term is applied to avoid empty wind direction ranges in between the sectors (the option without a noise term is investigated in Sect. 4.1 as well). In the following, this MCP method is abbreviated as SAD (sector average deviation).
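The sector average deviation step can be sketched as follows. The function names are hypothetical, sectors are assigned by the reference wind direction, and deviations are averaged arithmetically after mapping them to the shortest signed path; these details are assumptions. The Gaussian noise term is left out, and a sector without training data receives no correction, which is a simplification of this sketch.

```python
from collections import defaultdict

SECTOR_WIDTH = 30.0   # degrees, as in the text
MIN_REF_SPEED = 3.0   # m/s, pairs with lower reference speed are ignored

def signed_diff(a_deg, b_deg):
    """Shortest signed angular difference a - b, in [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def fit_sector_offsets(ref_dir, ref_speed, meas_dir):
    """Mean direction deviation (measured minus reference) per sector."""
    devs = defaultdict(list)
    for d_ref, v_ref, d_meas in zip(ref_dir, ref_speed, meas_dir):
        if v_ref >= MIN_REF_SPEED:
            sector = int((d_ref % 360.0) // SECTOR_WIDTH)
            devs[sector].append(signed_diff(d_meas, d_ref))
    return {s: sum(v) / len(v) for s, v in devs.items()}

def correct_direction(ref_dir, offsets):
    """Add the sector offset to each reference direction (no noise term)."""
    out = []
    for d_ref in ref_dir:
        sector = int((d_ref % 360.0) // SECTOR_WIDTH)
        out.append((d_ref + offsets.get(sector, 0.0)) % 360.0)
    return out

# Made-up training data; the third pair is dropped (reference speed < 3 m/s).
offsets = fit_sector_offsets([350.0, 355.0, 5.0, 10.0],
                             [8.0, 9.0, 2.0, 7.0],
                             [355.0, 2.0, 50.0, 14.0])
corrected = correct_direction([358.0, 20.0], offsets)
```

In this toy case the sector from 330° to 360° gets an offset of 6° and the sector from 0° to 30° an offset of 4°, so the reference directions 358° and 20° become 4° and 24°.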

3.1.3 K-nearest-neighbor (KNN) MCP method

In Sect. 4.1 and 4.3, we use the novel K-nearest-neighbor (KNN) regression MCP method in addition to the traditional MCP methods to fill the gaps. In the studies of Schwegmann et al. (2023), the KNN method shows the best results among several other MCP methods for filling gaps between 1 and 23 h. Schwegmann et al. (2023) test the performance of several MCP methods by calculating the root mean square error (RMSE), the coefficient of determination (R²), and the Jensen–Shannon distance between the original measurement and the MCP-filled data. We implement this algorithm using the Python package scikit-learn (Pedregosa et al., 2011). In contrast to the SLI and SAD methods, which correct only wind speed and only wind direction, respectively, we use this algorithm to correct both wind speed and wind direction.

To correct each reference data point from the target period, the KNN algorithm finds the K reference points from the training period with the shortest distance to the point to be corrected (K-nearest neighbors). The distance between reference data points is calculated in the dimensions of the physical parameters included in the algorithm (also called features). A regression through the measurement values concurrent with the K-nearest neighbors gives the target value (Cover and Hart, 1967; Schwegmann et al., 2023). Cyclical parameters such as the wind direction are split into their sine and cosine values to include them in the feature space.

Schwegmann et al. (2023) found the feature combination that results in the lowest RMSE between the filled and the original wind speed time series: wind speed, wind direction, surface pressure, surface latent heat flux, sea surface temperature, and the temperature difference between the sea surface and 2 m a.s.l. These features are also used in the present work. Furthermore, several settings influencing the KNN regression algorithm (also called hyperparameters) are specified. Schwegmann et al. (2023) selected the hyperparameter combination that gives the lowest RMSE between the original and predicted data for a testing subset. In the present work, the hyperparameters are selected as follows:

Number of neighbors (K). The K that results, on average, in the lowest RMSE between the filled and original measurement data when filling single 30 d gaps with shifting gap starting dates (see Sect. 3.3.1) is used.
Order of the distance calculation in the feature space. The Euclidean (second-order) distance is used.
Weighting of the features. Uniform weighting is used.
Other hyperparameters affecting computation time, such as leaf size. Default values of scikit-learn are taken.
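The KNN regression with Euclidean distance and uniform weighting can be illustrated with a small pure-Python sketch. It is not the scikit-learn implementation used in the study; the function names and data are hypothetical, only two features (wind speed and wind direction) are used, and the features are left unscaled, which is a simplification.

```python
import math

def encode(speed, direction_deg):
    """Feature vector: speed plus the sine and cosine of the direction."""
    rad = math.radians(direction_deg)
    return (speed, math.sin(rad), math.cos(rad))

def knn_predict(train_features, train_targets, query, k):
    """Uniformly average the targets of the k nearest training points."""
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    nearest = order[:k]
    return sum(train_targets[i] for i in nearest) / len(nearest)

# Made-up training period: reference (speed, direction) -> measured speed.
reference = [(8.0, 359.0), (8.0, 1.0), (8.0, 180.0), (12.0, 0.0)]
measured_speed = [9.0, 9.2, 7.0, 13.0]
features = [encode(v, d) for v, d in reference]

# Reference point from the target period that is to be corrected.
prediction = knn_predict(features, measured_speed, encode(8.0, 0.0), k=2)
```

Because the direction enters through its sine and cosine, the reference points at 359° and 1° are both close neighbors of the query at 0°, and the prediction is their average of 9.1 m s⁻¹. With the raw angle as a feature, 359° would appear far away from 0°.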

Table 1. Mean and standard deviation (SD) of K for which the RMSE between the originally measured and the KNN-filled data is minimized. Values are obtained for 30 d gaps. Results are shown for wind speed and wind direction for all investigated sites. Values are rounded to the closest integer.


The one-dimensional Nelder–Mead simplex algorithm (Arora, 2017) is used to find the optimum K for each site. Table 1 lists the average and the standard deviation across all introduced gaps of the optimum K for predicting wind speed and direction.

In Table 1, the standard deviation is higher than the average for every parameter and site. Therefore, the optimal selection of K varies drastically depending on the data that are predicted, and no optimum exists for the general case. However, we use the mean values listed in Table 1 when filling the gaps with the KNN method, as they are estimates of the optimal K.

3.1.4 MCP method performance metrics

There are multiple ways to evaluate the performance of an MCP method. One approach is calculating the statistics that compare the originally measured data with the predicted data value by value. The RMSE (Schwegmann et al., 2023; Hanslian, 2014) and the R² (MEASNET, 2016; Schwegmann et al., 2023) are the most widely used statistics for this. Nevertheless, long-term extrapolations aim to estimate overall statistics, such as long-term mean wind speed and long-term wind speed distribution. For this reason, comparing the statistics of the predicted and the original measurement data is a method used in many studies as well (Gottschall and Dörenkämper, 2021; Hanslian, 2014; Schwegmann et al., 2023).

For the present study, we use three overall time series statistics and two value-by-value statistics to evaluate the performance of the SLI, SAD, and KNN methods:

absolute mean wind speed (MWS) difference between the predicted and original data (overall statistic)
wind speed distribution error (DE) between the predicted and original data quantified by the chi-squared test with wind speed bins of 1 m s⁻¹ (overall statistic)
absolute mean wind direction (MWD) difference between the predicted and original data (overall statistic) (the wind direction differences are in the range of −180 to 180°, as they are calculated using the shortest path within the 360° circle)
RMSE between the predicted and original wind speed data (value-by-value statistic)
RMSE between the predicted and original wind direction data (value-by-value statistic).

The lower the value of the performance statistics, the better the performance for all metrics. In Sect. 4.1, we calculate these statistics for 105 artificially introduced 120 d gaps with different start dates (see Sect. 3.3.1 for more details on gap introduction). The averages of the MCP performance statistics over all introduced gaps are taken as the criteria to describe the performance of each MCP method.
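The five statistics can be sketched as below. Several details are assumptions because the text does not fix them: the mean wind direction is taken as the direction of the mean unit vector, the chi-squared statistic uses the binned counts of the original data as expected values, and bins without original data are skipped. All names and values are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def rmse(pred, orig):
    """Value-by-value root mean square error."""
    return math.sqrt(mean([(p - o) ** 2 for p, o in zip(pred, orig)]))

def angle_diff(a_deg, b_deg):
    """Shortest signed difference a - b within the 360° circle."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

def direction_rmse(pred_deg, orig_deg):
    """RMSE of wind direction using shortest-path differences."""
    return math.sqrt(mean([angle_diff(p, o) ** 2
                           for p, o in zip(pred_deg, orig_deg)]))

def circular_mean(deg):
    """Direction of the mean unit vector (assumed definition of the MWD)."""
    s = sum(math.sin(math.radians(d)) for d in deg)
    c = sum(math.cos(math.radians(d)) for d in deg)
    return math.degrees(math.atan2(s, c)) % 360.0

def chi_squared_de(pred, orig, bin_width=1.0):
    """Chi-squared statistic of binned counts, original counts as expected."""
    def counts(values):
        out = {}
        for v in values:
            b = int(v // bin_width)
            out[b] = out.get(b, 0) + 1
        return out
    obs, exp = counts(pred), counts(orig)
    return sum((obs.get(b, 0) - e) ** 2 / e for b, e in exp.items())

# Made-up original and predicted wind speeds (m/s) and directions (degrees).
orig_speed = [7.2, 7.8, 8.1, 8.9, 9.5]
pred_speed = [7.4, 8.2, 8.3, 8.7, 9.1]
mws_diff = abs(mean(pred_speed) - mean(orig_speed))
de = chi_squared_de(pred_speed, orig_speed)
speed_rmse = rmse(pred_speed, orig_speed)
mwd_diff = abs(angle_diff(circular_mean([80.0, 100.0]),
                          circular_mean([85.0, 105.0])))
dir_rmse = direction_rmse([350.0, 10.0], [10.0, 350.0])
```

The last line shows why the shortest path matters: predicted and original directions of 350° and 10° differ by 20°, not by 340°.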

3.2 Long-term extrapolations

In the present work, we calculate long-term extrapolations (LTEs) to a target period of 20 years between 1 July 1994 and 30 June 2014 with ERA5 as reference data. The following subsections contain details on the MCP method used for long-term extrapolating and a description of how the effect of gaps on long-term extrapolations is evaluated.

3.2.1 Long-term extrapolation MCP method

MCP methods based on linear regression are the most commonly used for extrapolating measured wind speed time series to the long term. The sector average deviation MCP method (refer to Sect. 3.1.2) is commonly employed for wind direction extrapolation (Carta et al., 2013). Therefore, we use the SLI and SAD MCP methods for the long-term extrapolations to obtain the results shown in Sect. 4.2 and 4.3. For the LTE, both MCP methods include the Gaussian noise term.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f03

Figure 3. Workflow of a time series correction MCP method used for long-term extrapolation. Case with gapped measurement. Reference data are marked in gray, measured data in blue, and corrected reference data in orange. Periods for which data are not available in the respective data set are marked in white. The long-term-extrapolated measurement is composed of corrected reference and measured data.


In the present work, the gaps are included in the target period when doing a long-term extrapolation of a gapped data set. Therefore, the gaps are filled with the corrected long-term reference time series obtained when extrapolating. Figure 3 shows a schematic representation of this process.

The gaps are filled before the long-term extrapolations to calculate the results for the LTE of filled data shown in Sect. 4.3. In these cases, the filled gaps are considered part of the measurement data and belong to the training period instead of the target period.

3.2.2 Evaluation of the effect of gaps on long-term extrapolations

The energy yield of a wind turbine is calculated using the wind speed distribution. Therefore, this is the most relevant output of a long-term extrapolation, along with the mean wind speed (MWS) and mean wind direction (MWD) (MEASNET, 2016). Hence, Gottschall and Dörenkämper (2021) consider that the effect of gaps on an LTE correlates with the mean wind speed, mean wind direction, and distribution deviations between the extrapolated gapped and original data. Following this approach, we use the MWS and MWD difference between the long-term-extrapolated gapped and original data to quantify the effect of gaps on an LTE. The wind direction differences are calculated using the shortest path within the 360° circle. Additionally, we use the wind speed distribution error (DE) between the long-term-extrapolated gapped and original data to evaluate the effect of gaps on the LTE. We calculate the DE using the chi-squared test with wind speed bins of 1 m s⁻¹. Analogously, we quantify the effect of gap filling on an LTE through the deviations between the statistics of the extrapolated filled and original data.

To calculate the long-term MWS, MWD, and distribution, the extrapolated data set includes the measurement from the training period and the corrected reference data from the target period (see Fig. 3). Analyses are repeated for gaps with different starting dates (see Sect. 3.3) for each gap duration investigated in Sect. 4.2 and 4.3. To generalize the results over all gaps with the same gap duration introduced, we use two metrics:

The first is the RMSE between the gapped and the original long-term-extrapolated MWS, MWD, and distributions. We adopt this metric from Gottschall and Dörenkämper (2021) to assess the average effect of gaps on long-term extrapolations.
The second is the 95th percentile (P95) of the absolute deviation between the gapped and the original long-term-extrapolated MWS, MWD, and distributions. We use this metric to highlight the highest 5 % of all analyzed gap effects.

We use the same metrics for the deviations between the filled and original long-term statistics to analyze the effect of gap filling on long-term extrapolations.
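As an illustration, the two summary metrics can be computed from a list of deviations, one per introduced gap. The deviations below are made up, the function names are hypothetical, and the percentile uses linear interpolation between ranks, which is an assumption about the exact P95 definition.

```python
import math

def rmse_of(deviations):
    """Root mean square of the deviations over all introduced gaps."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

def percentile(values, q):
    """q-th percentile with linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

# Made-up long-term MWS deviations (gapped minus original) in m/s,
# one value per gap starting date.
deviations = [(-1) ** i * 0.01 * i for i in range(21)]
average_effect = rmse_of(deviations)
p95_effect = percentile([abs(d) for d in deviations], 95)
```

The RMSE summarizes the typical effect over all gap positions, while the P95 of the absolute deviations describes the unfavorable cases without being set by a single extreme value.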

3.3 Gap generation

An artificially gapped data set is needed to evaluate the effect of gaps on a long-term extrapolation as described in Sect. 3.2.2. We do this by replacing the originally measured wind speed and direction values with non-numerical (NaN) values. A gap consists of one or multiple consecutive NaN values. We define each gap by setting the starting time stamp and the number of the consecutive NaN values. The following subsections describe the two gap types investigated in the present work.

3.3.1 Single gap with shifting start date

The single-gap introduction follows the procedure used by Gottschall and Dörenkämper (2021). It is composed of the following steps:

Step 1. Introduce a gap with a defined starting date and duration into the measurement.
Step 2. Derive a data set from the gapped data (filling the gap and/or extrapolating to the long term).
Step 3. Calculate statistics of the derived data set (mean wind speed, distributions, etc.).
Step 4. Shift the gap starting date forward by 7 d.
Step 5. Repeat all previous steps with the shifted gap.

The gap is shifted through the 2-year measurement, resulting in a total of 105 gapped data sets analyzed. For gaps whose ending date surpasses the end of the measurement period, the corresponding number of time stamps is deleted at the beginning of the data set. In Sect. 4.1, results are shown for gaps with a duration of 120 d. Section 4.2 and 4.3 include gaps with durations ranging from 0 to 180 d in steps of 30 d. Considering a 2-year measurement period, 180 d missing implies roughly 75 % data availability.
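The shifting procedure and the numbers quoted above can be reproduced with a short sketch. It assumes hourly time stamps over a 730 d measurement period and a gap that wraps around to the start of the series when it passes the end. The function names and the placeholder series are hypothetical.

```python
N_DAYS = 730          # 1 July 2012 to 30 June 2014
STEPS_PER_DAY = 24    # hourly time stamps, matching the ERA5 resolution
N = N_DAYS * STEPS_PER_DAY

def gap_starts(n, shift):
    """Start indices obtained by shifting the gap forward by `shift` steps."""
    return list(range(0, n, shift))

def introduce_gap(values, start, length):
    """Copy of values with `length` NaNs from `start`, wrapping at the end."""
    gapped = list(values)
    for j in range(length):
        gapped[(start + j) % len(values)] = float("nan")
    return gapped

starts = gap_starts(N, 7 * STEPS_PER_DAY)      # shift of 7 d
measured = [10.0] * N                          # placeholder, not real data
last = introduce_gap(measured, starts[-1], 120 * STEPS_PER_DAY)
n_missing = sum(1 for v in last if v != v)     # only NaN differs from itself
availability_180d = 1 - 180 / N_DAYS           # about 0.753
```

Shifting by 7 d through 730 d gives 105 starting dates, and the last 120 d gap wraps around so that its remaining time stamps are removed at the beginning of the series.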

3.3.2 Multiple gaps with random start dates

We developed a process to introduce multiple gaps in a data set. These gaps represent unforeseeable availability losses in a more realistic manner than a single gap. This method includes the following steps:

Step 1. Introduce multiple gaps with defined single and combined lengths into the measurement. The starting date of each gap is selected randomly.
Step 2. Derive a data set from the gapped data (filling the gap and/or extrapolating to the long term).
Step 3. Calculate statistics of the derived data set (mean wind speed, distributions, etc.).
Step 4. Repeat all previous steps until the average over all repetitions of the statistics calculated in Step 3 converges.

For gaps that extend beyond the measurement ending time stamp, a corresponding number of NaN values is introduced at the start of the measurement time series. Overlapping gaps are merged into a single gap with their combined length.

We use this method to analyze the effect of multiple gaps on long-term extrapolations in Sect. 4.2. For this, combinations of gaps with total durations ranging from 0 to 180 d in steps of 30 d are introduced. Every single gap within those combinations is 30 d long, as this is a plausible time window between the failure of an offshore wind measurement system and its redeployment. Pre-defining the single and combined gap lengths constrains the number of introduced gaps.
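A possible sketch of the multiple-gap introduction is given below. It encodes one reading of the merging rule, namely that a gap landing on already missing data is extended so that the combined gap length is preserved; this reading, the function name, and the daily resolution are assumptions of the example.

```python
import math
import random


def introduce_random_gaps(values, n_gaps, gap_length, rng):
    """Insert `n_gaps` gaps of `gap_length` samples at random start positions.

    Gaps wrap around the end of the series. Samples that are already missing
    are skipped, so overlapping gaps merge into one gap of combined length.
    """
    n = len(values)
    if n_gaps * gap_length > n:
        raise ValueError("combined gap length exceeds the series length")
    missing = [False] * n
    for _ in range(n_gaps):
        index = rng.randrange(n)  # random starting time stamp
        remaining = gap_length
        while remaining > 0:
            if not missing[index]:
                missing[index] = True
                remaining -= 1
            index = (index + 1) % n
    return [math.nan if gone else v for v, gone in zip(values, missing)]


# Hypothetical daily series: three 30 d gaps give a combined length of 90 d.
rng = random.Random(0)
series = [10.0] * 730
gapped = introduce_random_gaps(series, n_gaps=3, gap_length=30, rng=rng)
n_missing = sum(math.isnan(v) for v in gapped)
```

Fixing the single gap length at 30 d means that a combined length of 90 d always corresponds to three gaps, which is how pre-defining both lengths constrains the number of gaps.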

For the investigation shown in Sect. 4.2, we require the convergence of the effect of gaps on long-term extrapolations. Therefore, the parameters presented in Sect. 3.2.2 must converge. We only use the long-term mean wind speed RMSE for determining convergence. Accordingly, the repetition of Steps 1 to 4 is terminated when the long-term mean wind speed RMSE stays within a range of 0.0001 m s⁻¹ for the last 100 gap combinations introduced. This criterion leads to stable results for all parameters describing the effect of gaps on long-term extrapolations.
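The stopping rule can be illustrated with the short sketch below. It interprets "stays within a range" as the difference between the largest and smallest RMSE over the last 100 repetitions; this interpretation and the function names are assumptions of the example.

```python
import math


def running_rmse(deviations):
    """RMSE after each repetition, given one deviation (gapped minus original) per repetition."""
    history, sum_sq = [], 0.0
    for count, dev in enumerate(deviations, start=1):
        sum_sq += dev * dev
        history.append(math.sqrt(sum_sq / count))
    return history


def has_converged(rmse_history, window=100, tolerance=1e-4):
    """True if the RMSE varied by at most `tolerance` over the last `window` repetitions."""
    if len(rmse_history) < window:
        return False
    recent = rmse_history[-window:]
    return max(recent) - min(recent) <= tolerance
```

In a loop over random gap combinations, `has_converged` would be evaluated after each repetition and the loop terminated once it returns `True`.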

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f04

Figure 4. Workflow of a time series correction MCP method used to fill a measurement data gap. Reference data are marked in gray, measured data in blue, and corrected reference data in orange. Periods for which data are not available in the respective data set are marked in white. The filled measurement is composed of corrected reference and measured data.

3.4 Gap filling

The results shown in Sect. 4.1 and 4.3 involve filling artificial gaps in the measured data using the MCP methods described in Sect. 3.1. When filling a gap, the measurement–reference data pairs for the period outside of the gap are used for training, and the reference data from the period covered by the gap are corrected. This procedure is illustrated in Fig. 4.
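The filling workflow can be sketched with a deliberately simplified correction function. The example below uses a single ordinary least-squares line without wind sectors or a noise term, which is a simplification of the methods used in the study; all names and values are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_gaps(measured, reference):
    """Train on time stamps outside the gap, then correct the reference inside it."""
    pairs = [(r, m) for r, m in zip(reference, measured) if not math.isnan(m)]
    slope, intercept = fit_line([r for r, _ in pairs], [m for _, m in pairs])
    return [slope * r + intercept if math.isnan(m) else m
            for r, m in zip(reference, measured)]


# Hypothetical concurrent wind speeds in m/s; NaN marks the gap.
reference = [6.0, 8.0, 10.0, 12.0, 9.0, 7.0]
measured = [7.1, 9.3, math.nan, math.nan, 10.4, 8.2]
filled = fill_gaps(measured, reference)
```

The filled series keeps every measured value and replaces only the gap with corrected reference data, as in Fig. 4.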

4 Results

The following subsections show the results that aim to answer the research questions stated in the Introduction. We obtain these results using the data and the methods described in Sects. 2 and 3.

Table 2. Performance statistics of the sector-wise linear interpolation (SLI), sector average deviation (SAD), and K-nearest-neighbor (KNN) MCP methods for the FINO3 site. The statistics shown are the mean wind speed (MWS) and mean wind direction (MWD) average deviations, the average wind speed distribution error (WS DE), and the average wind speed and wind direction root mean square errors (WS and WD RMSEs). Deviations between the MCP-predicted and the original statistics are calculated over 120 d periods, with starting dates spaced by 7 d along the measured period. The lowest value of each column is written in bold.

Table 3. Performance statistics of each MCP method for the FINO2 site. Abbreviations of statistics and MCP methods are as described in Table 2. The lowest value in each column is written in bold.

Table 4. Performance statistics of each MCP method for the IJmuiden site. Abbreviations of statistics and MCP methods are as described in Table 2. The lowest value in each column is written in bold.

4.1 Comparison between MCP methods

The average performance of each MCP method, analyzed as described in Sect. 3.1.4, is shown in Tables 2 to 4.

On average, the KNN method performs slightly better than the SLI and SAD methods (for wind speed and wind direction, respectively) when measured by the RMSE over the gaps. By this metric, the performance of the MCP methods with a noise term is considerably worse than the performance of the same methods without a noise term. On the contrary, the distribution error when using the SLI method is lower with than without a noise term. It can also be noted that the MWS and MWD deviations are almost identical for the SLI and SAD options with and without a noise term. When comparing the performance of the KNN method to the performance of the SLI and SAD methods (both with a noise term), differences between the three sites can be seen. For the FINO3 data, the KNN method performs better than the SLI and SAD methods for the MWS deviation, MWD deviation, and DE. For the FINO2 data, the KNN method performs best for the MWS deviation, but the SLI method with a noise term shows a lower DE, and the SAD method shows a lower MWD deviation. For the IJmuiden data, the KNN method shows the highest MWS deviation of all MCP methods. The distribution error is lower for the SLI method with a noise term than for the KNN method. The KNN method shows the lowest MWD deviation.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f05

Figure 5. Long-term mean wind speed (subplot a), mean wind direction (subplot b), and Weibull parameters A (subplot c) and k (subplot d) of the original and gapped data depending on the gap starting date. Results for the original data are shown in solid black lines and for gap durations of 30, 60, and 90 d in purple, teal, and light green lines, respectively. Long-term extrapolations are done with the sector-wise linear interpolation (for wind speed) and sector average deviation (for wind direction) MCP methods with a noise term. Results are shown for the FINO3 site.

4.2 Effect of gaps on long-term extrapolations

To analyze the impact of single gaps on long-term extrapolations, we follow the procedure outlined in Sect. 3.2.2: comparing the long-term statistics of the gapped data with the long-term statistics of the original data. In Fig. 5 we show the long-term mean wind speed, mean wind direction, and the Weibull parameters A and k depending on the gap starting date. The Weibull parameters are obtained by fitting the wind speed distribution with a Weibull distribution function. Figure 5 shows results for the FINO3 site for a measurement time series with no gaps and for a time series with a gap of 30, 60, and 90 d.
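The text does not state which fitting procedure is used for the Weibull parameters. As one simple possibility, the sketch below uses the moment-based empirical approximation of Justus et al., k ≈ (σ/μ)^(−1.086) and A = μ/Γ(1 + 1/k); this choice, the function name, and the synthetic sample are assumptions of the example and may differ from the fit used in the study.

```python
import math
import random


def fit_weibull_moments(wind_speeds):
    """Estimate Weibull scale A and shape k from the sample mean and standard deviation."""
    n = len(wind_speeds)
    mean = sum(wind_speeds) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in wind_speeds) / (n - 1))
    k = (std / mean) ** -1.086          # empirical approximation, roughly valid for 1 <= k <= 10
    a = mean / math.gamma(1.0 + 1.0 / k)
    return a, k


# Synthetic wind speeds drawn from a Weibull distribution with A = 10 m/s and k = 2.
rng = random.Random(42)
sample = [rng.weibullvariate(10.0, 2.0) for _ in range(20000)]
a_hat, k_hat = fit_weibull_moments(sample)
```

Applied to the long-term extrapolated wind speeds of each gapped data set, such a fit would give one (A, k) pair per gap starting date.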

For all long-term statistical parameters shown in Fig. 5, the deviation between the gapped and the original parameters varies depending on the gap starting date. These differences increase with increasing gap duration. The increase in the effect of gaps is particularly pronounced for the starting dates for which the smallest (30 d) gaps already have their highest impact on the LTE statistics. We obtain similar results for FINO2 and IJmuiden data. Note that the variability of the LTE statistics of the original data seen in Fig. 5 is due to the noise term, as a new long-term extrapolation of the original data is done for each gap starting date.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f06

Figure 6. Long-term mean wind speed (subplots a and b), mean wind direction (subplots c and d), and distribution (subplots e and f) RMSEs between the original and gapped data calculated over all introduced gaps for each gap duration. Results for one gap with a shifting start date are on the left (subplots a, c, and e), and results for multiple gaps with random start dates are on the right (subplots b, d, and f). Long-term extrapolations are done with the sector-wise linear interpolation (for wind speed) and sector average deviation (for wind direction) MCP methods with a noise term. Results for the FINO2, FINO3, and IJmuiden sites are shown in blue, orange, and green, respectively.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f07

Figure 7. Long-term mean wind speed (subplots a and b), mean wind direction (subplots c and d), and distribution (subplots e and f) P95s of the absolute deviations between the original and gapped data calculated over all introduced gaps for each gap duration. Results for one gap with a shifting start date are on the left (subplots a, c, and e), and results for multiple gaps with random start dates are on the right (subplots b, d, and f). Long-term extrapolations are done with the sector-wise linear interpolation (for wind speed) and sector average deviation (for wind direction) MCP methods with a noise term. Results for the FINO2, FINO3, and IJmuiden sites are shown in blue, orange, and green, respectively.

To generalize over all gap starting dates, the RMSE between the gapped and original LTE statistics (see Sect. 3.2.2) is shown in Fig. 6 for all sites. The root mean squared distribution error between the original and gapped LTE is shown instead of the Weibull parameter deviations. Analogously to Fig. 6, Fig. 7 shows the P95 of the absolute deviations between the gapped and original LTE statistics.
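The two summary statistics can be sketched as follows, given one long-term statistic per introduced gap for the gapped and for the original data. The function names and the linear-interpolation percentile are assumptions of the example; the exact definitions are those of Sect. 3.2.2.

```python
import math


def rmse(gapped_stats, original_stats):
    """Root mean square error between gapped and original long-term statistics."""
    squared = [(g - o) ** 2 for g, o in zip(gapped_stats, original_stats)]
    return math.sqrt(sum(squared) / len(squared))


def p95_abs_deviation(gapped_stats, original_stats):
    """95th percentile of the absolute deviations (linear interpolation between ranks)."""
    deviations = sorted(abs(g - o) for g, o in zip(gapped_stats, original_stats))
    position = 0.95 * (len(deviations) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(deviations) - 1)
    weight = position - lower
    return deviations[lower] * (1 - weight) + deviations[upper] * weight


# Hypothetical long-term mean wind speeds (m/s) for five gap starting dates.
original = [10.0, 10.0, 10.0, 10.0, 10.0]
gapped = [10.0, 10.01, 9.98, 10.03, 9.96]
mws_rmse = rmse(gapped, original)
mws_p95 = p95_abs_deviation(gapped, original)
```

For wind direction, the differences would presumably have to be wrapped to the interval from −180° to 180° before applying either function; that step is omitted here.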

Figure 6a, c, and e show the results for a single gap with different gap lengths, introduced as described in Sect. 3.3.1. An almost-linear correlation between the statistics measuring the effect of gaps and the gap length can be recognized. For all sites, a gap of 0 d affects the LTE due to the noise term. It leads to a mean wind speed RMSE of roughly 0.005 m s⁻¹, a mean wind direction RMSE of 0.05°, and a distribution RMSE of 0.02 % to 0.025 %. For gaps longer than 0 d, there are differences between the sites, which increase with increasing gap size:

For the MWS (Fig. 6a), results for the FINO2 and FINO3 sites show a higher mean wind speed RMSE between the gapped and original LTE data (around 0.037 m s⁻¹ for 180 d gaps) than for the IJmuiden site (roughly 0.02 m s⁻¹ for 180 d gaps).
For the MWD (Fig. 6c), results for the FINO3 and IJmuiden sites show higher RMSEs (approximately 0.7° for 180 d gaps) than for the FINO2 site (nearly 0.3° for 180 d gaps).
For the wind speed distribution (Fig. 6e), there are few differences between the results for each site, with IJmuiden showing the lowest distribution RMSE (roughly 0.047 % for 180 d gaps) and FINO3 showing the highest (nearly 0.065 % for 180 d gaps).

These effects on the long-term extrapolation are negligible (< 1 %), considering that the 2-year mean wind speed in all measurement sites is between 9.5 and 10 m s⁻¹ and that the wind direction has a range of 360°. This is true even for 180 d gaps, which implies that roughly 25 % of the measured data are missing.
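The orders of magnitude quoted above can be checked with a few lines of arithmetic. The numbers are the approximate values read from the text; the variable names are chosen for this example only.

```python
# Data availability for a 180 d gap in a 2-year (730 d) measurement.
measurement_days = 2 * 365
gap_days = 180
availability_percent = (1 - gap_days / measurement_days) * 100

# Largest mean wind speed RMSE for 180 d gaps, relative to the 2-year mean wind speed.
mws_rmse = 0.037  # m/s, FINO2 and FINO3
mws_relative_percent = [mws_rmse / mean_ws * 100 for mean_ws in (9.5, 10.0)]

# Largest mean wind direction RMSE for 180 d gaps, relative to the 360 degree range.
mwd_rmse = 0.7  # degrees, FINO3 and IJmuiden
mwd_relative_percent = mwd_rmse / 360 * 100
```

This gives an availability of about 75 %, a relative wind speed deviation of roughly 0.4 %, and a relative wind direction deviation of roughly 0.2 %, all consistent with the statement that the effects stay below 1 %.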

Figure 6b, d, and f show the RMSE of the long-term statistics over all introduced 30 d gap combinations depending on the combined gap length (for more information on the multiple-gap introduction, see Sect. 3.3.2). Similar to the single-gap case, a linear relationship between the effect of gaps and combined gap length can be seen. Furthermore, the differences between the sites are similar for the multiple- and the single-gap case. Nevertheless, when measured using the RMSE between the gapped and original long-term statistics, the effect of multiple gaps is smaller than the effect of the single gap for all sites and gap lengths.

As expected, the P95 linearly increases with gap size and is higher than the RMSE for all sites. Figure 7 shows the following effects of gaps on long-term extrapolations:

For the MWS (Fig. 7a), there is a higher P95 of deviations between the gapped and original LTE data for FINO2 and FINO3 (around 0.07 m s⁻¹ for 180 d gaps) than for IJmuiden (roughly 0.04 m s⁻¹ for 180 d gaps).
For the MWD (Fig. 7c), the FINO3 and IJmuiden sites show a higher P95 of deviations (approximately 1.2° for 180 d gaps) than FINO2 (nearly 0.5° for 180 d gaps).
For the wind speed distribution (Fig. 7e), only slight differences are observed between the sites, with IJmuiden showing the lowest P95 of distribution deviations (roughly 0.075 % for 180 d gaps) and FINO3 showing the highest (approximately 0.1 % for 180 d gaps).

The values of the P95 of the deviations are roughly double the values of the RMSE for all sites, parameters, and gap lengths. The P95 values are still below 1 % for the 180 d gaps for all assessed parameters. As for the RMSE, the P95 deviations are higher for single gaps than for multiple gaps for equal combined durations.

https://wes.copernicus.org/articles/9/2217/2024/wes-9-2217-2024-f08

Figure 8. Long-term mean wind speed (subplots a and b), mean wind direction (subplots c and d), and distribution (subplots e and f) RMSEs between the original and gapped data (continuous lines) and between the original and filled data (dotted lines). RMSEs are calculated over all introduced gaps for each gap duration for single gaps with shifting start dates. Gap filling is done with the KNN MCP method on the left (subplots a, c, and e) and with the sector-wise linear interpolation (SLI; for wind speed) and sector average deviation (SAD; for wind direction) MCP methods with a noise term on the right (subplots b, d, and f). Long-term extrapolations are done with the SLI and SAD methods with a noise term. Results for the FINO2, FINO3, and IJmuiden sites are shown in blue, orange, and green, respectively.

4.3 Impact of gap filling on long-term extrapolations

In a final step, we use single gaps (introduced as described in Sect. 3.3.1) to evaluate the effect of gap filling on long-term extrapolations with the method described in Sect. 3.2.2. Figure 8 summarizes the results of this investigation by showing the variation in the RMSE calculated over all gap starting dates between LTE statistics of the original and gapped data (continuous lines) and between the original and filled data (dotted lines).

Figure 8b, d, and f show the effect of filling the wind speed gaps with the SLI MCP method and the wind direction gaps with the SAD MCP method. It can be seen that filling the gaps with these methods does not affect the long-term extrapolations for any of the sites and analyzed parameters. The noise term used when filling and extrapolating causes the minimal deviations between the gapped and filled LTEs.

Figure 8a, c, and e show an approximately linear correlation between the effect of gaps and the gap length for the KNN-filled gaps for all sites. According to these subplots, filling with the KNN method has a different impact on the long-term extrapolations depending on the site and the evaluated statistic:

For the FINO3 data, filling with the KNN method reduces the RMSE for all metrics.
For the FINO2 data, filling with the KNN method reduces the mean wind speed RMSE but increases it for the mean wind direction and the wind speed distribution.
For the IJmuiden data, filling with the KNN method reduces the mean wind direction RMSE but increases it for the mean wind speed and wind speed distribution.

It must be noted that the sites and statistics for which filling with the KNN method reduces the effect of gaps on the long-term extrapolations are the same as the cases found in Sect. 4.1, for which the KNN method shows a better performance than the SLI and SAD methods with noise terms. We discuss this and other considerations given to the results in Sect. 5.

5 Discussion

When comparing the performance of the MCP methods in Sect. 4.1, we find that they perform differently depending on the site and the metric used. The KNN MCP method excels when estimating each single data point because the selection of K is optimized for this purpose (reduction in the RMSE between measurement and prediction; see Sect. 3.1.3). This agrees with the results shown by Schwegmann et al. (2023). The methods with noise terms perform the worst by this measure, as they introduce an artificial error into each data point. Nevertheless, the addition of the noise term does not affect the mean wind speed and direction values, as the artificial error is averaged to 0.
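The effect of a zero-mean noise term can be illustrated with synthetic numbers. In the sketch below, the predicted series, the noise standard deviation of 1 m/s, and the random seed are arbitrary assumptions and do not correspond to the values used in the study.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def std(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


rng = random.Random(1)
predicted = [8.0 + 0.001 * i for i in range(5000)]  # hypothetical noise-free prediction (m/s)
noise_sigma = 1.0                                    # assumed noise standard deviation (m/s)
noisy = [p + rng.gauss(0.0, noise_sigma) for p in predicted]

mean_shift = mean(noisy) - mean(predicted)           # close to 0: the noise averages out
pointwise_rmse = math.sqrt(mean([(a - b) ** 2 for a, b in zip(noisy, predicted)]))
spread_change = std(noisy) - std(predicted)          # positive: the distribution widens
```

The mean is almost unchanged, the point-by-point error grows by about the noise standard deviation, and the spread of the distribution increases, which is the mechanism by which the noise term can change the distribution error.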

Regarding the wind speed distribution, Hanslian (2014) points out that type 1 MCP methods, such as linear regression, are best suited for predicting wind speed time series but distort distributions. Tables 2 to 4 show a lower distribution error in the original data when the noise term is added than when it is not. Therefore, we conclude that the noise term partly compensates for the distortion of the wind speed distribution induced by the linear regression for the cases tested in the present work. The noise term consists of random samples of a Gaussian distribution added to each wind speed value. Therefore, the distribution distortion is only compensated for if the distribution of the measured data used for testing is more similar to a Gaussian distribution than the distribution of the predicted data. This is the case for the investigated sites, but the contrary is possible for other sites or periods, for which the addition of the noise term would increase the distribution distortion. Therefore, we dismiss the Gaussian noise term as a universal solution for this issue and introduce the KNN method as an alternative. This is an analog MCP method by the criteria of Hanslian (2014). These methods combine type 1 and type 2 features and can also distort distributions, as the analogs found (neighbors in the case of KNN) tend to have values closer to the mean as opposed to further away from the mean (Hanslian, 2014). No clear choice between the KNN method and the linear interpolation with a noise term can be derived from the results in Sect. 4.1 for predicting wind speed distributions. Given the error potentially introduced by the noise term, we recommend using either distribution-based MCP methods, such as matrix methods (see Hanslian, 2014), or analog methods, such as KNN, when predicting wind speed distributions. All methods studied in Sect. 4.1 have similar performances in predicting wind speed and direction averages. We assume that the linear interpolation and sector average deviation MCP methods are sufficient for this purpose, although more complex methods might give slightly better results. If an accurate point-by-point prediction is the aim, the RMSE between the predicted and the testing measured data is to be reduced. Given the results obtained in Sect. 4.1 and in the work of Schwegmann et al. (2023), we consider that training a KNN or other machine learning models such as those shown by Schwegmann et al. (2023) is the best solution.
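The tendency of analog methods to pull predictions toward the mean can be seen in a toy K-nearest-neighbor regression. The sketch uses the reference wind speed as the only feature and a plain average of the neighbors; the features, the value of K, and the data are assumptions of the example and not the configuration used in the study.

```python
def knn_predict(train_reference, train_measured, query, k):
    """Average the measured values of the k training points closest to `query`."""
    nearest = sorted(zip(train_reference, train_measured),
                     key=lambda pair: abs(pair[0] - query))[:k]
    return sum(measured for _, measured in nearest) / len(nearest)


# Hypothetical training pairs (m/s): the measurement is always 1 m/s above the reference.
train_reference = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
train_measured = [5.0, 7.0, 9.0, 11.0, 13.0, 15.0]

high_k1 = knn_predict(train_reference, train_measured, query=14.0, k=1)
high_k3 = knn_predict(train_reference, train_measured, query=14.0, k=3)
```

At the upper edge of the training range, K = 1 reproduces the measured 15 m/s, whereas K = 3 averages over lower neighbors and returns 13 m/s, so the tail of the predicted distribution is compressed.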

It must be noted that the statistics listed in Tables 2 to 4 are an averaged value over the results obtained for each introduced gap. Therefore, the comparison between the performance statistics for one specific gap may differ from the averages shown. The classification of MCP methods based on their performances is only valid for the data, gap introduction procedure, and metrics used in the present work.

In the present study, we apply the metric proposed by Gottschall and Dörenkämper (2021) to describe the effect of gaps on long-term extrapolations with the linear regression MCP method with a noise term. The different results for the long-term extrapolations are due to using different wind sector divisions, regression functions, and target periods compared to Gottschall and Dörenkämper (2021). Nevertheless, the finding of the low and seasonally changing effect of gaps aligns between Gottschall and Dörenkämper (2021) and the present study. We build on the study of Gottschall and Dörenkämper (2021) by increasing the gap length and by introducing multiple gaps. The increase in the effect of gaps with increasing gap size is expected, as the gapped and original data sets increasingly differ. Nevertheless, even 180 d gaps (corresponding to approximately 75 % availability) show a small effect. At FINO2 and FINO3, these gaps cause long-term mean wind speed errors of only about 0.037 m s⁻¹ on average (RMSE) and 0.07 m s⁻¹ for the P95. The 0.037 m s⁻¹ value is roughly 0.38 % of the mean wind speed measured. This added uncertainty is minor compared to other long-term uncertainty sources such as the uncertainty in the wind speed measurement, which can surpass 3 % (Pulo et al., 2021). We show that the effect of gaps is even smaller for multiple gaps with the same combined gap length. We assume that this is because the single gap is longer and more likely to cut out an entire season so that climatic effects specific to that season are ignored.

Figure 5 shows an example of the seasonality of the effect of gaps for FINO3. In this case, the long-term extrapolations are most sensitive to gaps covering the spring and autumn months. We could observe a similar seasonality for the other analyzed sites. The results of Gottschall and Dörenkämper (2021) also show the largest impact of gaps when they cover spring and autumn months for all sites. Nevertheless, the differences between the seasons in that study are slight and therefore non-conclusive because only 30 d gaps are considered. As the sensitivity of long-term extrapolations to the season of the gap is not the object of the present study, we do not investigate this topic further. However, a follow-up study with a more extensive analysis might be of interest to the stakeholders involved in wind resource assessment.

Given the small effect of gaps on long-term extrapolations, we recommend lowering the requirements of data availability of over 80 % or 90 % given in current guidelines (MEASNET, 2016; FGW, 2020) for offshore measurement campaigns. This will reduce the cost of obtaining on-site wind data while not impacting the wind resource assessment significantly. Furthermore, we endorse the method for assessing the effects of gaps on long-term extrapolations that was recommended by Gottschall and Dörenkämper (2021) and used in the present investigation. This can be a valid method not only for further investigations into the effect of gaps but also for estimations of the effect of a real gap on a wind resource assessment scenario.

Even though gaps have a small effect on the long-term extrapolations in the investigated cases, we show how gap filling can change the effect of gaps on the LTE. When filling with the linear regression method, no difference between the filled and gapped long-term extrapolations can be seen. This is because the same MCP method and reference data sets are used for filling and extrapolating. In this case, the training measurement data and the correction function of the MCP method are the same for filling the gaps and for extrapolating the gapped data set. Therefore, the correction function of the long-term extrapolation already contains the information that is obtained by filling the gaps and stays unchanged when the gap-filling data are factored in.
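This argument can be checked numerically for a plain least-squares line without a noise term: points that lie exactly on the fitted line add no residual, so refitting on the filled data returns the same coefficients. The data and names below are hypothetical, and the sector-wise treatment is omitted.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training pairs outside the gap (reference, measured) in m/s.
ref_train = [6.0, 8.0, 9.0, 7.0, 11.0]
meas_train = [7.3, 9.1, 10.6, 8.0, 12.4]
slope, intercept = fit_line(ref_train, meas_train)

# Fill the gap with the same correction function, then refit on the filled data set.
ref_gap = [10.0, 12.0, 5.0]
filled_gap = [slope * r + intercept for r in ref_gap]
slope_filled, intercept_filled = fit_line(ref_train + ref_gap, meas_train + filled_gap)
```

The two fits agree to floating-point precision. With a noise term, the filled points scatter around the line, which would explain the minimal deviations between the gapped and filled extrapolations.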

Gap filling might decrease the bias of the LTE caused by gaps when done with an MCP method other than the extrapolating method. We investigate this using the KNN method for filling and the linear interpolation (for wind speed) and sector average deviation (for wind direction) MCP methods with noise terms for extrapolating. With this setting, gap filling does not always mitigate the effect of gaps on the LTE (see Fig. 8a, c, and e). Nevertheless, the effect of gaps is mitigated for the predicted parameters and sites, for which KNN performs better for gap filling than the linear interpolation and sector average deviation MCP methods with noise terms (see Sect. 4.1). We presume that the gap-filling performance and the reduction in the effect of gaps on the LTE correlate for each predicted statistical parameter and site. This can be investigated further with various gap-filling and long-term extrapolating MCP methods and different data sets. However, neither the performance of a gap-filling method nor the reduction in the effect of gaps on the LTE can be calculated for a real gap in the measured data. Therefore, introducing and analyzing the effect of filling artificial gaps are the only ways to estimate the effect of filling real gaps. Hereby, the period cut out by the real gaps has to be considered (periods with very high or low wind speeds might have the largest effects when cut out). Artificially cutting out periods with a similar wind climate to the period cut out by the real gaps might give insight into how the real gap affects the long-term extrapolation. If redundant measurements are available (for example, data from another height or a nearby deployed floating lidar system), gaps should be filled with those redundancies as a reference instead of with reanalysis data. As less correction is needed for reference data from redundant measurements, we expect gap filling to reduce the uncertainty in long-term extrapolations in these cases.

6 Conclusions

Since data gaps are common in offshore wind measurements, multiple guidelines limit the proportion of missing data allowed. To meet these limits, stakeholders resort to expensive wind measurement campaigns with monitored redundant systems and prolonged measurement periods. One of the goals of the present study was to find out whether the current industry requirements of measurement data availability are justifiable or too conservative for offshore measurement campaigns. To answer this question, we built on the research of Gottschall and Dörenkämper (2021) and investigated the effect of gaps on long-term (20-year) extrapolations in multiple settings. We analyzed the effects of gaps with various lengths and numbers for the same sites as those investigated by Gottschall and Dörenkämper (2021): met mast measurements from FINO2, FINO3, and IJmuiden from the period between 2012 and 2014. Throughout the present study, we used the linear regression MCP method with ERA5 as reference data for long-term extrapolating. The metric we used for evaluating the effect of gaps on the extrapolations is the RMSE between the original and the gapped long-term mean wind speed, mean wind direction, and distribution over all introduced gaps. The study on the effect of gaps on long-term extrapolations yielded the following results:

In alignment with the results of Gottschall and Dörenkämper (2021), we found that gaps have a minor impact on long-term extrapolations. Even for data availability of 75 %, the deviation between the gapped and the original long-term mean wind speed does not surpass 0.075 m s⁻¹ in 95 % of the cases for any of the analyzed sites. We obtained similar results for the long-term mean wind direction and wind speed distribution.
A single gap has a larger effect on the long-term extrapolation than multiple gaps with the same combined length. We assume this is because a single wind data gap is more likely to cut out a climatic event than several shorter gaps spread throughout the time series.

In addition to investigating the effect of gaps on long-term extrapolations, we analyzed the possibility presented by Gottschall and Dörenkämper (2021) of filling the gaps to decrease their effect on the extrapolation. We introduced the KNN, linear interpolation, and sector average deviation MCP methods for filling data gaps and compared their performance. Furthermore, we evaluated the relationship between the performance of a gap-filling MCP method and its reduction in the effect of gaps on a long-term extrapolation. We obtained the following results:

The linear interpolation MCP method distorts wind speed distributions. Adding a Gaussian noise term reduces the distribution distortion for sites with bell-shaped distributions. However, this reduces the accuracy of predicting a wind speed time series value by value.
The KNN MCP method shows good results (compared to the linear regression and sector average deviation methods) when predicting mean wind speed, mean wind direction, wind speed distributions, and wind speed and wind direction time series.
Filling the gaps does not impact the long-term extrapolation if the filling and extrapolating processes are done with the same MCP method and reference data.
The effect of gaps on long-term extrapolations is reduced through filling with the KNN method for the same parameters and locations for which the KNN method outperforms the linear interpolation and sector average deviation MCP methods in terms of gap filling.

According to the present findings, the minimum data availability acceptable for an offshore measurement campaign should be lower than the 90 % currently demanded by most standards. As gaps have a small effect on extrapolations, gap filling does not significantly reduce the effect of gaps. Nevertheless, data from a different measurement height or a nearby deployed device are often available. The data taken nearby have a lower uncertainty and better correlation with the analyzed measurement compared with modeled data such as ERA5. Therefore, we expect that using the nearby data as reference data for the gap-filling MCP method reduces the effects of gaps on the long-term extrapolation, even if this effect is minor.

Data availability

The ERA5 reanalysis data were downloaded from the Copernicus Climate Change Service Climate Data Store (https://doi.org/10.24381/cds.adbb2d47, Hersbach et al., 2023). The met mast data can be accessed for scientific purposes by contacting BSH (for FINO2 and FINO3) and TNO (for IJmuiden).

Author contributions

MGJA analyzed and processed the data and implemented the methods. WW contributed to several parts of the code used and participated in the interpretation of the data and results. JG provided the wind data and supervised the study. MGJA wrote the first draft of the manuscript, and all authors participated in reviewing it. All authors have approved the content and agreed to be held accountable for it.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Wind Energy Science. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

We appreciate the technical and academic advice of Sandra Schwegmann and Martin Dörenkämper. Furthermore, we thank BSH for providing access to the data measured at FINO2 and FINO3, TNO for the data of the IJmuiden met mast, and ECMWF for providing open access to the ERA5 data.

Financial support

This research has been supported by the Bundesministerium für Wirtschaft und Klimaschutz (grant no. 03EE3024).

Review statement

This paper was edited by Jakob Mann and reviewed by two anonymous referees.

References

Arora, J. S.: Chapter 11 – More on Numerical Methods for Unconstrained Optimum Design, in: Introduction to Optimum Design, fourth edition, edited by: Arora, J. S., Academic Press, Boston, 455–509, https://doi.org/10.1016/B978-0-12-800806-5.00011-1, 2017.

Beltran, J., Cosculluela, L., Pueyo, C., and Melero, J.: Comparison of measure-correlate-predict methods in wind resource assessments, European Wind Energy Conference and Exhibition 2010, Warsaw, Poland, 20–23 April 2010, EWEC 2010, 5, https://www.researchgate.net/publication/266242232_Comparison_of_measure-correlate-predict_methods_in_wind_resource_assessments (last access: 1 August 2023), 2010.

Borujeni, M. S., Dideban, A., and Foroud, A. A.: Reconstructing long-term wind speed data based on measure correlate predict method for micro-grid planning, J. Amb. Intel. Hum. Comp., 12, 10183–10195, https://api.semanticscholar.org/CorpusID:234309341 (last access: 21 November 2023), 2021.

Carta, J. A., Velázquez, S., and Cabrera, P.: A review of measure-correlate-predict (MCP) methods used to estimate long-term wind characteristics at a target site, Renew. Sust. Energ. Rev., 27, 362–400, https://doi.org/10.1016/j.rser.2013.07.004, 2013.

Cover, T. and Hart, P.: Nearest neighbor pattern classification, IEEE T. Inform. Theory, 13, 21–27, https://doi.org/10.1109/TIT.1967.1053964, 1967.

Dörenkämper, M., Olsen, B. T., Witha, B., Hahmann, A. N., Davis, N. N., Barcons, J., Ezber, Y., García-Bustamante, E., González-Rouco, J. F., Navarro, J., Sastre-Marugán, M., Sīle, T., Trei, W., Žagar, M., Badger, J., Gottschall, J., Sanz Rodrigo, J., and Mann, J.: The Making of the New European Wind Atlas – Part 2: Production and evaluation, Geosci. Model Dev., 13, 5079–5102, https://doi.org/10.5194/gmd-13-5079-2020, 2020.

FGW: Technical Guidelines for Wind Turbines Part 6 (TG 6) Determination of Wind Potential and Energy Yields, Tech. rep., FGW e. V. Fördergesellschaft Windenergie und andere Dezentrale Energien, https://wind-fgw.de/themes/working-on-the-guidelines/?lang=en (last access: 8 August 2023), 2020.

Gottschall, J. and Dörenkämper, M.: Understanding and mitigating the impact of data gaps on offshore wind resource estimates, Wind Energ. Sci., 6, 505–520, https://doi.org/10.5194/wes-6-505-2021, 2021.

Hanslian, D.: Analysis of wind measurement results, Dissertation thesis, supervisor Kalvová, Jaroslava, Prague, Czech Republic: Charles University, Faculty of Mathematics and Physics, Department of Atmospheric Physics, http://hdl.handle.net/20.500.11956/68952 (last access: 24 November 2024), 2014.

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020.

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., and Thépaut, J.-N.: ERA5 hourly data on single levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.adbb2d47, 2023.

King, C. and Hurley, B.: The SpeedSort, DynaSort and Scatter Wind Correlation Methods, Wind Engineering, 29, 217–242, https://doi.org/10.1260/030952405774354868, 2005.

Landberg, L.: The Wind Profile, chap. 4, pp. 67–93, John Wiley and Sons, Ltd, https://doi.org/10.1002/9781118913451.ch4, 2015.

Leiding, T., Tinz, B., Gates, L., Rosenhagen, G., Herklotz, K., Senet, C., Outzen, O., Lindenthal, A., Neumann, T., Frühman, R., Wilts, F., Bégué, F., Schwenk, P., Stein, D., Bastigkeit, I., Lange, B., Hagemann, S., Müller, S., and Schwabe, J.: Standardisierung und vergleichende Analyse der meteorologischen FINO-Messdaten (FINO123), Technical Report – available online, Final Report – FINOWind Research Project, Deutscher Wetterdienst (DWD), https://www.dwd.de/DE/forschung/projekte/fino_wind/fino_wind_node.html (last access: 2 March 2023), 2012.

MEASNET: Evaluation of Site Specific Wind Conditions, Tech. rep., Measurement Network of Wind Energy Institutes, Madrid, Spain, http://www.measnet.com/wp-content/uploads/2016/05/Measnet_SiteAssessment_V2.0.pdf (last access: 15 September 2023), 2016.

Met Office: Cartopy: a cartographic python library with a Matplotlib interface, Met Office, Exeter, Devon, https://scitools.org.uk/cartopy (last access: 24 November 2024), 2010–2015.

Meyer, P. J. and Gottschall, J.: How do NEWA and ERA5 compare for assessing offshore wind resources and wind farm siting conditions?, J. Phys.-Conf. Ser., 2151, 012009, https://doi.org/10.1088/1742-6596/2151/1/012009, 2022.

OWA: Carbon Trust Offshore Wind Accelerator Roadmap for the Commercial Acceptance of Floating LiDAR Technology, Tech. rep., The Carbon Trust, https://ctprodstorageaccountp.blob.core.windows.net/prod-drupal-files/documents/resource/public/Roadmap%20for%20Commercial%20Acceptance%20of%20Floating%20LiDAR%20REPORT.pdf (last access: 15 September 2023), 2018.

Pappas, C., Papalexiou, S., and Koutsoyiannis, D.: A quick gap filling of missing hydrometeorological data, J. Geophys. Res.-Atmos., 119, 9290–9300, https://doi.org/10.1127/metz/2018/0908, 2014.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E.: Scikit-Learn: Machine Learning in Python, J. Mach. Learn. Res., 12, 2825–2830, https://doi.org/10.48550/arXiv.1201.0490, 2011.

Poveda, J. M., Wouters, D., and Nederland, S.: Wind measurements at meteorological mast IJmuiden, Tech. rep., ECN – Energy Center of the Netherlands, https://publicaties.ecn.nl/PdfFetch.aspx?nr=ECN-E--14-058 (last access: 1 February 2023), 2015.

Pulo, A., Sargin, O., Schmidt, S., Schlez, W., and Stoelinga, M.: Ten noorden van de Waddeneilanden Wind Farm Zone – Wind Resource Assessment, Tech. rep., Netherlands Enterprise Agency, https://offshorewind.rvo.nl/file/download/50c6a6f8-5f01-41a8-ad0a-702101e4aa49/tnw_22020706_guidehouse_wra_update-june-2022.pdf (last access: 3 September 2023), 2021.

Rohrig, K., Berkhout, V., Callies, D., Durstewitz, M., Faulstich, S., Hahn, B., Jung, M., Pauscher, L., Seibel, A., Shan, M., Siefert, M., Steffen, J., Collmann, M., Czichon, S., Dörenkämper, M., Gottschall, J., Lange, B., Ruhle, A., Sayer, F., Stoevesandt, B., and Wenske, J.: Powering the 21st century by wind energy – Options, facts, figures, Appl. Phys. Rev., 6, 031303, https://doi.org/10.1063/1.5089877, 2017.

Rouholahnejad, F., Santos, P., Hung, L.-Y., and Gottschall, J.: Machine learning for predicting offshore vertical wind profiles, J. Phys.-Conf. Ser., 2626, 012023, https://doi.org/10.1088/1742-6596/2626/1/012023, 2023.

Schwegmann, S., Faulhaber, J., Pfaffel, S., Yu, Z., Dörenkämper, M., Kersting, K., and Gottschall, J.: Enabling Virtual Met Masts for wind energy applications through machine learning-methods, Energy and AI, 11, 100209, https://doi.org/10.1016/j.egyai.2022.100209, 2023.

Short summary

Offshore wind measurements are often affected by gaps. We investigated how these gaps affect wind resource assessments and whether filling them reduces their effect. We find that the effect of gaps on the estimated long-term wind resource is lower than expected and that data gap filling does not significantly change the outcome. These results indicate a need to reduce current wind data availability requirements for offshore measurement campaigns.