Supplementary material 1/6

This notebook serves as supplementary material to the manuscript on low-level jets submitted to Wind Energy Science. The supplementary material consists of 6 notebooks:

  1. A very quick overview of the measurement data from the LiDARs.
  2. Overview of the ERA5 data and especially the procedure of aligning it with the observations.
  3. Some error statistics on wind speed from ERA5 and observations.
  4. Main results on low-level jets.
  5. Reconstruction of seasonal cycle using a machine learning package.
  6. Spatial figures from ERA5 (data-intensive).

In this notebook (1/6), we provide a brief introduction to the observation data.

Getting up to speed with the observation data

The observation data come from 7 sites. A LiDAR was installed at each site, and three sites (BWF, HKN and HKZ) were equipped with two LiDARs. The sites are abbreviated as follows:

  • BWF - Borssele Wind Farm (Zuid)
  • EPL - Europlatform
  • HKN - Hollandse Kust Noord
  • HKZ - Hollandse Kust Zuid
  • K13 - K13a oil platform
  • LEG - Lichteiland Goeree
  • MMIJ - Met Mast IJmuiden

A website dedicated to these measurements is available at https://windopzee.net/en/home/. The data are stored in NetCDF format, one file per LiDAR:

In [29]:
ls ../../data/external/ECN/*.nc
../../data/external/ECN/BWFZwinds_lot1.nc  ../../data/external/ECN/HKZAwinds.nc
../../data/external/ECN/BWFZwinds_lot2.nc  ../../data/external/ECN/HKZBwinds.nc
../../data/external/ECN/EPLwinds.nc        ../../data/external/ECN/K13winds.nc
../../data/external/ECN/HKNAwinds.nc       ../../data/external/ECN/LEGwinds.nc
../../data/external/ECN/HKNBwinds.nc       ../../data/external/ECN/MMIJwinds.nc

Inspecting one of the files

The postprocessing of the platform data is described in the appendix of the paper. The resulting output files all have the following format.

In [16]:
import xarray as xr
ds = xr.open_dataset('../../data/external/ECN/BWFZwinds_lot2.nc')
ds
Out[16]:
<xarray.Dataset>
Dimensions:               (height: 10, time: 20996, yyyy-mmm-dd hh:mm: 16)
Dimensions without coordinates: height, time, yyyy-mmm-dd hh:mm
Data variables:
    WS                    (height, time) float64 ...
    WD                    (height, time) float64 ...
    Measurement Levels    (height) int32 ...
    Site Latitude         float64 ...
    Site Longitude        float64 ...
    UTC Time              (yyyy-mmm-dd hh:mm, time) |S1 ...
    Model Reference Time  (time) float64 ...
    File Reference Time   (time) float64 ...
Attributes:
    creation_date:                   09-Oct-2018 14:37:54
    Measurement Location:            Borssele Wind Farm Zone -- Lot 2
    Measurement Period:              2016-02-12 16:20 UTC --2016-07-07 11:30 UTC
    Wind Speed Instrument Info:      LiDAR - Zephir 300
    Wind Direction Instrument Info:  LiDAR - Zephir 300

Some notes/log

The output files were produced using MATLAB, while the further analysis is done in Python. To achieve good compatibility, I wrote a preprocessing function, which addresses some remaining inconsistencies.

In [17]:
def clean(ds):
    """ This function modifies the platform datasets on load to address the following:
        - Convert the units of the time arrays in order to read them properly
        - Change variable names (remove spaces etc.)
        - Discard UTC time and model reference time (not necessary).
        - Set time and height variables as coordinates rather than variables
        - Re-index time array to have uniform spacing (convert gaps to NaN values)
    """
    import xarray as xr
    from pandas import date_range
    
    # Convert units of time arrays (convert capitals to lower case)
    ds['Model Reference Time'].attrs['units'] = ds['Model Reference Time'].attrs['units'].lower()
    ds['File Reference Time'].attrs['units'] = ds['File Reference Time'].attrs['units'].lower()

    # Rename variables. rename() returns a new dataset; the `inplace`
    # argument has been removed from recent xarray versions.
    ds = ds.rename({
        "File Reference Time": "time",
        "Measurement Levels": "height",
        "Site Latitude": "latitude",
        "Site Longitude": "longitude",
        "WS": "wspd",
        "WD": "wdir"})

    # Discard redundant time arrays
    ds = ds.drop(['Model Reference Time', 'UTC Time'])

    # This last function correctly parses the time arrays and makes the time equidistant (filling up with nan)
    ds = xr.decode_cf(ds) #.resample(time='10min').asfreq() # Equivalent function, but slower (I think)
    return ds.reindex(time=date_range(ds.time.values[0], ds.time.values[-1], freq='10min'))
In [18]:
ds = xr.open_dataset('../../data/external/ECN/BWFZwinds_lot1.nc')
clean(ds)
Out[18]:
<xarray.Dataset>
Dimensions:    (height: 10, time: 90250)
Coordinates:
  * time       (time) datetime64[ns] 2015-06-11T18:20:00 2015-06-11T18:30:00 ...
  * height     (height) int32 30 40 60 80 100 120 140 160 180 200
Data variables:
    wspd       (height, time) float64 11.66 11.25 11.31 11.43 11.37 10.31 ...
    wdir       (height, time) float64 41.72 42.42 46.64 43.48 43.48 42.07 ...
    latitude   float64 51.71
    longitude  float64 3.035
Attributes:
    creation_date:                   09-Oct-2018 14:37:53
    Measurement Location:            Borssele Wind Farm Zone -- Lot 1
    Measurement Period:              2015-06-11 18:20 UTC --2017-02-27 11:50 UTC
    Wind Speed Instrument Info:      LiDAR - Zephir 300
    Wind Direction Instrument Info:  LiDAR - Zephir 300
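
The last step of clean() (re-indexing onto an equidistant time axis) can be illustrated on a toy series. The sketch below uses pandas rather than xarray and a synthetic four-point record, so the values are assumptions and not observations. It also checks the alternative mentioned in the code comment, resample + asfreq, which should give the same result as long as the time stamps already lie on the 10-minute grid.

```python
import pandas as pd

# Synthetic 10-minute record with two missing time stamps (00:20 and 00:30)
times = pd.to_datetime([
    "2016-01-01 00:00", "2016-01-01 00:10",
    "2016-01-01 00:40", "2016-01-01 00:50",
])
wspd = pd.Series([8.0, 8.5, 9.1, 9.4], index=times, name="wspd")

# Build the complete, equidistant time axis and reindex onto it;
# time stamps absent from the original record become NaN
full_axis = pd.date_range(times[0], times[-1], freq="10min")
regular = wspd.reindex(full_axis)

# Alternative mentioned in the comment in clean(): resample + asfreq
resampled = wspd.resample("10min").asfreq()

print(regular)
print("identical:", regular.equals(resampled))
```

The gaps stay visible as NaN, which is what makes the missing data show up as blank stripes in the barcode plots further down.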

Load all stations at once

We now load all the datasets into one dictionary. This code is copied to the code base for later re-use (code/datasets/ecn.py)

In [22]:
data_path = '../../data/external/ECN/'

files = {
    'BWF1': data_path+'BWFZwinds_lot1.nc',
    'BWF2': data_path+'BWFZwinds_lot2.nc',
    'EPL': data_path+'EPLwinds.nc',
    'HKNA': data_path+'HKNAwinds.nc',
    'HKNB': data_path+'HKNBwinds.nc',
    'HKZA': data_path+'HKZAwinds.nc',
    'HKZB': data_path+'HKZBwinds.nc',
    'K13': data_path+'K13winds.nc',
    'LEG': data_path+'LEGwinds.nc',
    'MMIJ': data_path+'MMIJwinds.nc'}

# Save the platform data in a dictionary
obs = {}
for name, file in files.items():
    print(name)
    obs[name] = clean(xr.open_dataset(file))
BWF1
BWF2
EPL
HKNA
HKNB
HKZA
HKZB
K13
LEG
MMIJ
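
Before looping over the files, it can help to check that they are all present. The sketch below is one possible way to build the same name-to-file mapping and report missing files; `build_file_table` and `STEMS` are hypothetical names introduced here and are not necessarily what code/datasets/ecn.py contains.

```python
from pathlib import Path

# Short dataset name -> file stem, as used in this notebook
STEMS = {
    "BWF1": "BWFZwinds_lot1",
    "BWF2": "BWFZwinds_lot2",
    "EPL": "EPLwinds",
    "HKNA": "HKNAwinds",
    "HKNB": "HKNBwinds",
    "HKZA": "HKZAwinds",
    "HKZB": "HKZBwinds",
    "K13": "K13winds",
    "LEG": "LEGwinds",
    "MMIJ": "MMIJwinds",
}

def build_file_table(data_path, stems=STEMS):
    """Return ({name: Path}, [names whose file does not exist])."""
    data_path = Path(data_path)
    files = {name: data_path / f"{stem}.nc" for name, stem in stems.items()}
    missing = [name for name, path in files.items() if not path.is_file()]
    return files, missing

files, missing = build_file_table("../../data/external/ECN/")
print(len(files), "files expected;", len(missing), "missing")
```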

Quickly inspecting the data

In [24]:
import matplotlib.pyplot as plt
plt.xkcd()

fig, ax = plt.subplots()
for name, ds in obs.items():
    ax.plot(ds.wspd.mean(dim='time').values, ds.height.values, 'o-', label=name)
    
ax.legend(bbox_to_anchor=(1.05, 1))
ax.set_xlabel('mean wind speed (m/s)')
ax.set_ylabel('altitude (m)')
ax.set_title('Vertical wind profiles for each of the LiDARs')
plt.show()

Notes

  • LEG shows a lot of shear in the upper layer
  • Large variability between different platforms
  • Some platforms observe up to higher altitudes than others.
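
To put a number on "a lot of shear", one option is to differentiate the mean profile with respect to height. The sketch below does this with np.gradient, which accepts the unevenly spaced measurement heights. The profile itself is a synthetic logarithmic profile with assumed roughness length and friction velocity, not one of the observed profiles.

```python
import numpy as np

# Measurement heights as reported for BWF1 (m)
height = np.array([30, 40, 60, 80, 100, 120, 140, 160, 180, 200], dtype=float)

# Synthetic mean profile (assumption): logarithmic wind profile
z0 = 0.0002      # hypothetical roughness length for open sea (m)
u_star = 0.3     # hypothetical friction velocity (m/s)
mean_wspd = u_star / 0.4 * np.log(height / z0)

# Vertical shear dU/dz in 1/s; passing the coordinate array lets
# np.gradient account for the uneven height spacing
shear = np.gradient(mean_wspd, height)

for z, s in zip(height, shear):
    print(f"{z:5.0f} m  {s:.4f} 1/s")
```

For an observed dataset, `mean_wspd` would be replaced by `ds.wspd.mean(dim='time').values` and `height` by `ds.height.values`.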
In [26]:
fig, ax = plt.subplots()
for name, ds in obs.items():
    ax.plot(range(24), ds.wspd.mean(dim='height').groupby('time.hour').mean(dim='time'), label=name)
    
ax.legend(bbox_to_anchor=(1.05, 1))
ax.set_ylabel('height-/time-averaged wind speed (m/s)')
ax.set_xlabel('hour of the day')
ax.set_title('diurnal cycle of wind speed')
plt.show()

Notes

  • K13, LEG and MMIJ are further offshore.
  • Consequently, these have larger wind speeds.
  • Borssele has a stronger diurnal cycle, which suggests a larger influence of the surrounding land areas.
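
The strength of the diurnal cycle can be summarized in a single number per dataset. The sketch below is a pandas version of the grouping by hour used above, applied to a synthetic series (an assumption, not observed data). The range of the hourly means is a hypothetical metric chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic 10-minute series over 30 days (assumption, not observed data):
# mean of 9 m/s with a 1 m/s diurnal oscillation peaking at 15:00
time = pd.date_range("2016-01-01", periods=30 * 144, freq="10min")
hour_frac = time.hour.to_numpy() + time.minute.to_numpy() / 60
wspd = pd.Series(9 + np.cos(2 * np.pi * (hour_frac - 15) / 24), index=time)

# Mean wind speed for each hour of the day (0..23)
diurnal = wspd.groupby(wspd.index.hour).mean()

# Hypothetical strength metric: range of the hourly means
amplitude = diurnal.max() - diurnal.min()
print("hour of maximum:", diurnal.idxmax())
print("amplitude (m/s):", round(amplitude, 2))
```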

Barcode plots

The figure below shows the time/height evolution of wind speed for each dataset. It illustrates the temporal extent and overlap of the datasets, as well as the missing data.

In [9]:
fig, axs = plt.subplots(11, 1, figsize=(12, 16), sharex=True, sharey=True)
axs = axs.flatten()
for i, (name, ds) in enumerate(obs.items()):
    mesh = axs[i].pcolormesh(ds.time, ds.height, ds.wspd, vmin=0, vmax=30)
    axs[i].set_title(name)
    
plt.tight_layout()
axs[-2].xaxis.set_tick_params(labelbottom=True)
cax = fig.add_axes(axs[-1].get_position())
axs[-1].set_axis_off()
plt.colorbar(mesh, cax=cax, orientation='horizontal', label='wind speed (m/s)')
plt.show()

Notes

  • MMIJ is by far the longest measurement record.
  • HKN does not cover a complete season.
  • BWF2 has a very limited time extent.
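
The record length, data availability and overlap that the barcode plot shows visually can also be computed directly. The sketch below does this for two synthetic series on regular 10-minute axes (assumptions, not the observed records); `summarize` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Two synthetic records on regular 10-minute axes (not observed data)
a = pd.Series(8.0, index=pd.date_range("2016-01-01", "2016-01-10", freq="10min"))
b = pd.Series(9.0, index=pd.date_range("2016-01-05", "2016-01-20", freq="10min"))
a.iloc[100:400] = np.nan   # simulate an instrument outage

def summarize(series):
    """Return (record length in days, fraction of valid samples)."""
    length_days = (series.index[-1] - series.index[0]) / pd.Timedelta(days=1)
    return length_days, series.notna().mean()

# Overlap period: latest start to earliest end
overlap_start = max(a.index[0], b.index[0])
overlap_end = min(a.index[-1], b.index[-1])

for name, s in {"a": a, "b": b}.items():
    days, avail = summarize(s)
    print(f"{name}: {days:.1f} days, {100 * avail:.1f}% available")
print("overlap:", overlap_end - overlap_start)
```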

Wind directions

For wind direction, I want a cyclic colormap. The code below is not relevant for understanding the data, but is necessary for the subsequent visualization.

In [27]:
# For wind direction, I want a cyclic colormap
# Use function from great answer: https://stackoverflow.com/a/34557535/6012085
import numpy as np
import matplotlib.colors as col
!pip install hsluv
import hsluv

##### generate custom colormaps
def make_segmented_cmap(): 
    white = '#ffffff'
    black = '#000000'
    red = '#ff0000'
    blue = '#0000ff'
    anglemap = col.LinearSegmentedColormap.from_list(
        'anglemap', [black, red, white, blue, black], N=256, gamma=1)
    return anglemap

def make_anglemap( N = 256, use_hpl = True ):
    h = np.ones(N) # hue
    h[:N//2] = 11.6 # red 
    h[N//2:] = 258.6 # blue
    s = 100 # saturation
    l = np.linspace(0, 100, N//2) # luminosity
    l = np.hstack( (l,l[::-1] ) )

    colorlist = np.zeros((N,3))
    for ii in range(N):
        if use_hpl:
            colorlist[ii,:] = hsluv.hpluv_to_rgb( (h[ii], s, l[ii]) )
        else:
            colorlist[ii,:] = hsluv.hsluv_to_rgb( (h[ii], s, l[ii]) )
    colorlist[colorlist > 1] = 1 # correct numeric errors
    colorlist[colorlist < 0] = 0 
    return col.ListedColormap( colorlist )

N = 256
hpluv_anglemap = make_anglemap( use_hpl = True )
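
A quick sanity check for any cyclic colormap is that both ends of the range map to the same colour, so that 0° and 360° are indistinguishable. The sketch below tests this for the colour sequence of make_segmented_cmap() from the cell above, which only needs matplotlib. The same check should apply to hpluv_anglemap, but that is not verified here.

```python
import numpy as np
import matplotlib.colors as col

# Same colour sequence as make_segmented_cmap() in the cell above
anglemap = col.LinearSegmentedColormap.from_list(
    "anglemap", ["#000000", "#ff0000", "#ffffff", "#0000ff", "#000000"], N=256)

# Scale wind directions (degrees) to the 0..1 range expected by the colormap
directions = np.array([0.0, 90.0, 180.0, 270.0, 360.0])
colors = anglemap(directions / 360.0)   # RGBA array, one row per direction

# A cyclic map returns the same colour at both ends of the range
is_cyclic = np.allclose(colors[0], colors[-1])
print("cyclic:", is_cyclic)
```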

Wind direction bar codes

The plot below is similar to the previous one for wind speed, but now for wind direction. This is not used in the paper, so it is less relevant. In this case, the time axes are not shared between the panels, so it is in fact a 'zoomed in' version of the previous plot.

In [28]:
fig, axs = plt.subplots(11, 1, figsize=(12, 16), sharex=False, sharey=False)
axs = axs.flatten()
for i, (name, ds) in enumerate(obs.items()):
    mesh = axs[i].pcolormesh(ds.time, ds.height, ds.wdir, vmin=0, vmax=360, cmap=hpluv_anglemap)
    axs[i].set_title(name)
    
plt.tight_layout()
axs[-2].xaxis.set_tick_params(labelbottom=True)
cax = fig.add_axes(axs[-1].get_position())
axs[-1].set_axis_off()
plt.colorbar(mesh, cax=cax, orientation='horizontal', label='wind direction')
plt.show()

Notes

  • Collocated datasets show similar time evolution, as expected.
  • Borssele has more north-easterly winds, because other directions have been masked due to wind farm wake effects.
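
Masking a wind direction sector needs some care when the sector wraps through north. The sketch below shows one way to do it with numpy. The sector limits (90° to 300°) are purely hypothetical; the sectors actually masked for Borssele are not given in this notebook, and `in_sector` is a helper introduced here for illustration.

```python
import numpy as np

def in_sector(wdir, start, end):
    """True where wdir (deg) lies in the sector running clockwise from start to end."""
    wdir = np.asarray(wdir, dtype=float) % 360
    start, end = start % 360, end % 360
    if start <= end:
        return (wdir >= start) & (wdir <= end)
    # Sector wraps through north (e.g. 330 -> 60)
    return (wdir >= start) | (wdir <= end)

# Synthetic samples (not observed data)
wdir = np.array([10.0, 45.0, 120.0, 200.0, 350.0])
wspd = np.array([7.0, 8.0, 9.0, 10.0, 11.0])

# Hypothetical waked sector (assumption): 90-300 degrees
waked = in_sector(wdir, 90, 300)
wspd_masked = np.where(waked, np.nan, wspd)
print(wspd_masked)
```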

Save cleaned data in intermediate files

In [12]:
for name, ds in obs.items():
    ds.to_netcdf('../../data/interim/ECNplatforms/'+name+'.nc')

Conclusion

So far so good! Some of the scripts used in this notebook have been refactored to a shared code base in order to load the data more quickly in other notebooks.