Machine-learning-based estimate of the wind speed over complex terrain using the long short-term memory (LSTM) recurrent neural network

Leme Beu, Cássia Maria; Landulfo, Eduardo

doi:https://doi.org/10.5194/wes-9-1431-2024

Articles | Volume 9, issue 6

Research article

27 Jun 2024

Abstract

Accurate estimation of the wind speed profile is crucial for a range of activities such as wind energy and aviation. The power law and the logarithmic-based profiles have been widely used as universal formulas to extrapolate the wind speed profile. However, these traditional methods have limitations in capturing the complexity of the wind flow, mainly over complex terrain. In recent years, machine-learning techniques have emerged as promising tools for estimating wind speed profiles. In this study, we used the long short-term memory (LSTM) recurrent neural network and observational lidar datasets from three different sites over complex terrain to estimate the wind profile up to 230 m. Our results showed that the LSTM outperformed the power law as the distance from the surface increased. The coefficient of determination (R²) was greater than 90 % up to 100 m when the input variables were limited to the 40 m height. The performance of the model improved further when the 60 m wind speed was added to the input dataset. Furthermore, we found that an LSTM model trained at one site with 40 and 60 m observational data also outperformed the power law when applied to the other sites. Our results show that machine-learning techniques, particularly the LSTM, are a promising tool for accurately estimating wind speed profiles over complex terrain, even for short observational campaigns.

Received: 23 Aug 2023 – Discussion started: 09 Oct 2023 – Revised: 13 Mar 2024 – Accepted: 12 May 2024 – Published: 27 Jun 2024

1 Introduction

Machine-learning techniques are increasingly being adopted as powerful tools in the environmental sciences. They have been applied for many purposes, including forecasting meteorological variables and their derivative products (Musyimi et al., 2022; Jiang et al., 2022; Mustakim et al., 2022; Jesemann et al., 2022). However, the use of machine-learning techniques is not restricted to the local or regional scales. Liu et al. (2022), for example, proposed a multi-level circulation pattern classification to identify large-scale weather or climate disaster events. Disaster forecasting and monitoring were also the subject of Soria-Ruiz et al. (2022), who achieved high performance by applying machine-learning algorithms to remote sensing datasets to detect the recurrent floods over the Gulf of Mexico coastline and the central and southeastern parts of Mexico. Among the methods they evaluated, Song and Wang (2020) concluded that neural networks are superior for producing monthly wildfire predictions 1 year in advance, thus providing valuable information for long-range fire planning and management. By adding principal component analysis (PCA), Zhang et al. (2022) improved the accuracy of visibility prediction in Sichuan (China); among the six machine-learning algorithms evaluated, they found that the neural network performed best. Cheng and Tsai (2022) proposed a hybrid methodology based on variable selection and autoregressive distributed lag to forecast pollutant concentrations, which improved the results compared to the full dataset and the dataset without lags. In their study, support vector regression (SVR), a supervised algorithm, performed better than the other four algorithms tested. These are only a few examples of innovative works adopting machine-learning techniques in the environmental sciences.

Wind forecasts underpin wind power prediction, which is essential to support wind energy production in the short term. Although winds have traditionally been forecast with numerical weather prediction models, machine learning has become more widespread, not only to correct the biases derived from the highly variable nature of the winds, but also as a stand-alone prediction model. Wang et al. (2021) showed that their multi-layer cooperative combined forecasting system, which is based on a novel adaptive weighting scheme, overcame the limitations of the current single and combined forecasting methods and provided a more accurate and stable forecast. In their review paper, Bali et al. (2019) analyzed a few studies produced during this century and concluded that the techniques for wind speed forecasting have limitations, such as low efficiency and high computational cost. They proposed the use of long short-term memory (LSTM) to improve wind speed forecasting for power prediction. Tukur et al. (2022) analyzed works produced between 2010 and 2020 and concluded that ensemble and hybrid methods achieve high accuracy because they are better able to model complex functions than linear models. They agreed with Bali et al. (2019) that the LSTM looks promising for forecasting the wind speed, whilst recommending further investigation of the capabilities of hybrid model approaches. Dalton and Bekker (2022) showed the improvement obtained when other meteorological variables are considered in the modeling; their results pointed to the vertical wind and divergence as important predictors of the wind speed. Along the same lines, He et al. (2022) included the 2 m temperature and surface pressure to train their dual-attention mechanism multi-channel convolutional LSTM model with the ERA5 dataset to forecast the 10 m wind speed. Zhou et al. (2023) also used the ERA5 dataset to investigate grid-to-site conversion models, considering altitude, land use and seasonality effects. The deep learning models outperformed the linear interpolation and the regression models in estimating the 10 m wind speed. These works show that considerable effort has gone into wind speed forecasting; however, the methods to estimate its vertical profile are still limited.

According to Pintor et al. (2022), extrapolating the wind speed to higher heights is still a challenge, and of the two most widely used methods (the power law and the logarithmic-based profile) they found that the power law is more accurate for a wide variety of landscapes. The Met Office (United Kingdom) developed the Virtual Met Mast (VMM) tool (Standen et al., 2016) to assess the wind profile; however, this technique requires high-spatial-resolution numerical weather prediction (Schwegmann et al., 2023). Only recently have machine-learning techniques been used to forecast the wind speed profile. Türkan et al. (2016) evaluated seven different machine-learning methods to estimate the 30 m wind speed at Kütahya (Türkiye) and concluded that the SVR produced the most realistic results of the seven. Al-Shaikhi et al. (2022) proposed particle swarm optimization (PSO) with the LSTM method and compared their results with other optimization algorithms for an experiment carried out at Dhahran (Saudi Arabia). Their model needs at least four different levels of observational data as input. Similarly, Nuha et al. (2022) proposed the regularized extreme learning machine (RELM) to extrapolate the wind speed to higher heights. With the same Dhahran dataset, Mohandes and Rehman (2018) used the restricted Boltzmann machine (RBM) method with observations at four different heights as input and showed that their method improved the wind speed forecast. Bodini and Optis (2020 a, b) found that random forests outperform standard wind extrapolation approaches, using a round-robin validation method. They highlighted the benefits of including observational data that capture the diurnal variability of the atmospheric boundary layer, namely the Obukhov length, turbulence kinetic energy and time of day, all measured at a 4 m height. Vassallo et al. (2020) also improved their results by including meteorological variables in the input dataset of their artificial neural network (ANN) model; they advised selecting the input data carefully and emphasized the importance of normalization. Even the VMM data are improved with machine-learning methods (Schwegmann et al., 2023). Bodini and Optis (2020 a, b) conducted their experiments over low-complexity terrain (Great Plains, US) and stressed the need to perform the same kind of analysis over more complex terrain. To the best of our knowledge, most studies on vertical wind speed extrapolation were conducted for low-complexity orographies, except for Vassallo et al. (2020), who analyzed different types of terrain complexity, and Standen et al. (2016) and Schwegmann et al. (2023), who conducted their studies with the VMM tool.

2 Data and methods

2.1 The LSTM recurrent neural network

Recurrent neural networks (RNNs) are a type of artificial neural network in which the output of one time step is used as an input to the subsequent time step, building a memory of the events in a time series. RNNs are specifically designed to work with, learn from and predict sequential data (Medsker and Jain, 1999). Long short-term memory (LSTM) is a type of RNN that is currently considered a state-of-the-art tool for processing sequential and temporal data. The main advantage of the LSTM over other RNNs is that its internal memory allows it to maintain long-term dependencies, avoiding the vanishing- or exploding-gradient problems (Smagulova and James, 2019). This is achieved by introducing a forget gate into the standard recurrent sigma cell of the RNNs. The forget gate decides what information will be discarded (Yu et al., 2019) and makes the LSTM a robust model that compensates for imperfections in the input data (Sherstinsky, 2020). The LSTM cells are mathematically expressed by

\begin{array}{ll} (1) & f_{t} = σ (W_{f h} h_{t - 1} + W_{f x} x_{t} + b_{f}), \\ (2) & i_{t} = σ (W_{i h} h_{t - 1} + W_{i x} x_{t} + b_{i}), \\ (3) & {\tilde{c}}_{t} = \tanh (W_{\tilde{c} h} h_{t - 1} + W_{\tilde{c} x} x_{t} + b_{\tilde{c}}), \\ (4) & c_{t} = f_{t} \cdot c_{t - 1} + i_{t} \cdot {\tilde{c}}_{t}, \\ (5) & o_{t} = σ (W_{o h} h_{t - 1} + W_{o x} x_{t} + b_{o}), \\ (6) & h_{t} = o_{t} \cdot \tanh (c_{t}), \end{array}

where x_t and h_t are the input and the recurrent information at time t; c_t is the cell state of the LSTM; f_t, i_t and o_t are the forget, input and output gates; σ is the logistic sigmoid function; W_f, W_i, $W_{\tilde{c}}$ and W_o are the weights; b is the bias; and the operator “⋅” is the pointwise multiplication of two vectors. Figure 1 illustrates the LSTM components and architecture.
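The cell equations can be read as a single forward step. The following is a minimal NumPy sketch of that step, not the Keras implementation used in this study: the dictionary layout of the weights, the sizes and the random inputs are all hypothetical and chosen only for readability. In the last step the output gate o_t multiplies tanh(c_t), as in the standard LSTM formulation.

```python
import numpy as np


def sigmoid(a):
    """Logistic function used by the forget, input and output gates."""
    return 1.0 / (1.0 + np.exp(-a))


def lstm_step(x_t, h_prev, c_prev, W_h, W_x, b):
    """One LSTM time step following Eqs. (1)-(6).

    W_h, W_x and b are dicts keyed by "f", "i", "c" and "o" (a hypothetical
    layout for illustration, not the internal layout of Keras).
    """
    f_t = sigmoid(W_h["f"] @ h_prev + W_x["f"] @ x_t + b["f"])      # Eq. (1)
    i_t = sigmoid(W_h["i"] @ h_prev + W_x["i"] @ x_t + b["i"])      # Eq. (2)
    c_tilde = np.tanh(W_h["c"] @ h_prev + W_x["c"] @ x_t + b["c"])  # Eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde                              # Eq. (4)
    o_t = sigmoid(W_h["o"] @ h_prev + W_x["o"] @ x_t + b["o"])      # Eq. (5)
    h_t = o_t * np.tanh(c_t)                                        # Eq. (6)
    return h_t, c_t


rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3  # illustrative sizes only
W_h = {k: rng.normal(size=(n_hidden, n_hidden)) for k in "fico"}
W_x = {k: rng.normal(size=(n_hidden, n_in)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}

# Run six time steps of synthetic input through the cell.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(6, n_in)):
    h, c = lstm_step(x_t, h, c, W_h, W_x, b)
```

Because the forget gate f_t scales the previous cell state in Eq. (4), values of f_t close to 1 carry information across many time steps, which is how the cell retains long-term dependencies.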

https://wes.copernicus.org/articles/9/1431/2024/wes-9-1431-2024-f01

Figure 1. LSTM schematic diagram (Yu et al., 2019).

We ran the LSTM using the Keras library (version 2.9) in Python (version 3.8.16) on Colab (the Google Research platform). Missing data were filled by linear interpolation with the pandas interpolate function. Afterwards, the data were normalized with the StandardScaler function from the scikit-learn library (Pedregosa et al., 2011), which removes the mean and scales to unit variance:

\begin{matrix} (7) & z = (x - u) / s, \end{matrix}

where x is observed data, u is the mean, s is the standard deviation and z is the normalized data.
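The two preprocessing steps can be illustrated with a small NumPy stand-in. This sketch does not call pandas or scikit-learn; it reproduces linear gap filling and Eq. (7) on a synthetic series, and the variable names and values are assumptions made for the example.

```python
import numpy as np

# Synthetic 10 min wind speeds (m/s); NaN marks a missing record.
v40 = np.array([3.0, np.nan, 5.0, 6.0, np.nan, np.nan, 9.0])

# Linear interpolation over the gaps (stand-in for the pandas interpolate step).
idx = np.arange(v40.size)
valid = ~np.isnan(v40)
v40_filled = np.interp(idx, idx[valid], v40[valid])

# Eq. (7): z = (x - u) / s, with s the population standard deviation
# (ddof=0), which is the convention StandardScaler follows.
u = v40_filled.mean()
s = v40_filled.std()
z = (v40_filled - u) / s
```

After this step the series has zero mean and unit variance, so input variables with different units (wind speed, direction, TKE) enter the network on a comparable scale.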

We identified the optimal hyperparameters by using the KerasTuner (O'Malley et al., 2019) with the Hyperband algorithm. Table A1 exhibits the tuned hyperparameters for each experiment. We maintained the default configuration of Keras for the other LSTM arguments (Keras, 2023). See Table A2.

2.2 Doppler lidar

We employed the Windcube v2 Doppler lidar, from Leosphere, during the field campaigns at three different sites. For the Windcube v2 technical specifications, see Beu and Landulfo (2022). The information of the field campaigns is listed in Table 1.

Table 1. Information on the field campaigns.

The lidar was set up to measure at 12 levels (40, 60, 80, 100, 120, 140, 160, 180, 200, 230, 260 and 290 m) and to retrieve information every 10 min. The Windcube v2 system automatically discards data when the carrier-to-noise ratio (CNR) is under −23 dB, and we removed data with an availability of less than 80 % over 10 min.

Table A3 shows that the data availability is over 99 % at all three sites up to a height of 160 m. Above 160 m, the availability decreases, reaching 98 % at Site 2 and 94 % at Site 3 at a height of 230 m.

We used the observed data at 40 m to estimate the wind speed at higher heights (from 60 up to 230 m). In addition to the 10 min mean wind speed (v40), we considered the wind direction (dir40), the hour, and the standard deviation of the horizontal (σ_u+σ_v) and vertical (σ_w) wind speed as inputs. From the wind speed standard deviations, we estimated the turbulence kinetic energy (TKE), which is half the sum of the wind speed variances (Stull, 1988) and is expressed by

\begin{matrix} (8) & TKE = \frac{1}{2} (σ_{u}^{2} + σ_{v}^{2} + σ_{w}^{2}) . \end{matrix}
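As a worked instance of Eq. (8), the sketch below computes the TKE from 10 min standard deviations. The numerical values are synthetic and the function name `tke` is hypothetical.

```python
import numpy as np


def tke(sigma_u, sigma_v, sigma_w):
    """Turbulence kinetic energy per unit mass (m^2 s^-2), Eq. (8)."""
    return 0.5 * (sigma_u**2 + sigma_v**2 + sigma_w**2)


# Synthetic 10 min standard deviations (m/s) at 40 m, for illustration only.
sigma_u = np.array([0.8, 1.0])
sigma_v = np.array([0.6, 1.0])
sigma_w = np.array([0.4, 0.5])
tke40 = tke(sigma_u, sigma_v, sigma_w)
```

For the first record, TKE = 0.5 × (0.64 + 0.36 + 0.16) = 0.58 m² s⁻².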

As previously shown, including cyclical variables improves the wind speed forecast (Bodini and Optis, 2020 a, b; Baquero et al., 2022). The diurnal cycle is a strong feature of the sites under study, and we discuss it further below. Since surface observations are not available, the 40 m TKE could indirectly convey information related to temperature and stability, improving the modeling with respect to diurnal variability. This step is referred to as Experiment 1. Afterwards, we also added the 60 m wind speed as an input to forecast the heights above, and this step is referred to as Experiment 2. Following the advice of Bodini and Optis (2020 a, b), we conducted two more experiments (Experiment 3 and Experiment 4), which consisted of applying a model trained at one site to another environment and evaluating its performance. In this way, the model trained for Site 1 was applied to Sites 2 and 3, the model trained for Site 2 was applied to Sites 1 and 3, and the model trained for Site 3 was applied to Sites 1 and 2. Table A4 summarizes the input variables of each experiment.

2.3 The power law

According to Pintor et al. (2022), the power law (PL) is the simplest and generally the most effective way to extrapolate the wind speed. The PL is given by

\begin{matrix} (9) & V = V_{r} {(\frac{z}{z_{r}})}^{α}, \end{matrix}

where V and V_r are the wind speeds at height z and at the reference height z_r, respectively, and α is the wind shear coefficient. The authors state that α < 0.1 corresponds to unstable conditions, 0.1 < α < 0.2 is typical of the neutral profile and α > 0.2 describes a stable atmosphere.
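A short sketch of Eq. (9) follows. The text above does not state how α was obtained, so deriving it from two measurement heights is an assumption made only for this example, and the wind speeds and function names are hypothetical.

```python
import math


def power_law(v_ref, z_ref, z, alpha):
    """Extrapolate the wind speed from height z_ref to height z with Eq. (9)."""
    return v_ref * (z / z_ref) ** alpha


def shear_exponent(v1, z1, v2, z2):
    """Solve Eq. (9) for alpha, given wind speeds at two heights."""
    return math.log(v2 / v1) / math.log(z2 / z1)


# Synthetic 40 m and 60 m wind speeds (m/s), for illustration only.
alpha = shear_exponent(5.0, 40.0, 5.5, 60.0)
v100 = power_law(5.0, 40.0, 100.0, alpha)
```

With these synthetic values α is slightly above 0.2, which falls in the range the authors associate with a stable atmosphere.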

2.4 Evaluation

To evaluate model performance, we chose verification metrics similar to those used by Zhou et al. (2022) and Baquero et al. (2022), because these metrics have been widely applied to wind forecasting with machine-learning methods; see those works for further information.

Coefficient of determination (R²): the R² measures the proportion of the variance of the observed data that is explained by the model, and it is related to the correlation coefficient.
$\begin{matrix} (10) & R^{2} = 1 - \frac{Σ_{i = 1}^{N} {(y_{i} - \hat{y_{i}})}^{2}}{Σ_{i = 1}^{N} {(y_{i} - \overline{y})}^{2}} . \end{matrix}$
Mean squared error (MSE):
$\begin{matrix} (11) & MSE = \frac{1}{N} Σ_{i = 1}^{N} {(y_{i} - \hat{y_{i}})}^{2} . \end{matrix}$
Root mean squared error (RMSE):
$\begin{matrix} (12) & RMSE = \sqrt{\frac{1}{N} Σ_{i = 1}^{N} {(y_{i} - \hat{y_{i}})}^{2}} . \end{matrix}$
Mean absolute error (MAE):
$\begin{matrix} (13) & MAE = \frac{1}{N} Σ_{i = 1}^{N} | y_{i} - \hat{y_{i}} | . \end{matrix}$
Mean absolute percentage error (MAPE):
$\begin{matrix} (14) & MAPE = \frac{100\,\%}{N} Σ_{i = 1}^{N} \frac{| y_{i} - \hat{y_{i}} |}{\max (ϵ, | y_{i} |)} . \end{matrix}$

Here y_i, $\overline{y}$ and $\hat{y_{i}}$ are the actual value, the mean of the observed data and the predicted value. N is the total number of data points, and ϵ is an arbitrarily small but strictly positive number to avoid undefined results when y_i is zero.
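The five metrics can be computed together in a few lines. The sketch below is a NumPy illustration on synthetic values, with a hypothetical function name `evaluate`; it takes the absolute error in the MAPE and uses machine epsilon for ϵ, which is an assumption about the exact value used.

```python
import numpy as np


def evaluate(y, y_hat, eps=np.finfo(float).eps):
    """Return R2, MSE, RMSE, MAE and MAPE (in percent) for paired samples."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    err = y - y_hat
    mse = np.mean(err**2)
    return {
        "R2": 1.0 - np.sum(err**2) / np.sum((y - y.mean()) ** 2),   # Eq. (10)
        "MSE": mse,                                                 # Eq. (11)
        "RMSE": np.sqrt(mse),                                       # Eq. (12)
        "MAE": np.mean(np.abs(err)),                                # Eq. (13)
        "MAPE": 100.0 * np.mean(np.abs(err) / np.maximum(eps, np.abs(y))),  # Eq. (14)
    }


# Synthetic observed and predicted wind speeds (m/s).
scores = evaluate([4.0, 6.0, 8.0, 10.0], [5.0, 6.0, 7.0, 10.0])
```

For this example the residual sum of squares is 2 and the total sum of squares is 20, giving R² = 0.9, with MSE = 0.5, MAE = 0.5 and MAPE = 9.375 %.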

Lastly, we applied the bootstrapping technique (Efron and Tibshirani, 1994) to estimate the error bars for R². For this purpose, we used the bootstrap function from the SciPy library (Virtanen et al., 2020), with a confidence level of 0.95 and a number of resamples equal to 100 times the number of data points.
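To show the idea behind these error bars, the sketch below implements a plain percentile bootstrap for R² in NumPy. It is not the SciPy function itself, whose interval method may differ, and the synthetic data, the function names and the reduced number of resamples are assumptions chosen to keep the example fast.

```python
import numpy as np


def r2(y, y_hat):
    """Coefficient of determination for paired samples."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def bootstrap_r2_ci(y, y_hat, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for R2.

    Observed and predicted values are resampled together, with replacement,
    so that each pair stays intact.
    """
    rng = np.random.default_rng(seed)
    n = y.size
    stats = np.empty(n_resamples)
    for k in range(n_resamples):
        idx = rng.integers(0, n, size=n)
        stats[k] = r2(y[idx], y_hat[idx])
    tail = 100.0 * (1.0 - confidence) / 2.0
    return np.percentile(stats, [tail, 100.0 - tail])


# Synthetic observations and predictions, for illustration only.
rng = np.random.default_rng(1)
y = rng.uniform(2.0, 12.0, size=300)
y_hat = y + rng.normal(scale=0.8, size=300)
low, high = bootstrap_r2_ci(y, y_hat)
```

The interval (low, high) plays the role of the error bar on R²; a larger number of resamples, as in the configuration described above, reduces the Monte Carlo noise in its end points.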

2.5 Observational campaigns

The observational campaigns took place over a 3-year period (Table 1) in the southeastern portion of Brazil (Fig. 2). All three observational sites are within 140 km of the coast and are clearly marked on the map. Despite the proximity between the sites (see the caption of Fig. 2), the types of terrain are completely different, particularly in height and surface roughness (Table 1). Site 1 is inside the Metropolitan Region of São Paulo, which is characterized by a densely mixed urban matrix.

https://wes.copernicus.org/articles/9/1431/2024/wes-9-1431-2024-f02

Figure 2. Sites of the observational campaigns. The distance (yellow line) is 47 km between Sites 1 and 2, 131 km between Sites 2 and 3, and 90 km between Sites 1 and 3. Distances estimated with the Google Earth tool (© Google Earth 2023).

Site 2 is a coastal municipality called Cubatão. Beyond the industrial zone, Cubatão is surrounded by natural parks of the Atlantic Rain Forest (Morellato and Haddad, 2000), residential areas and a high mountain range, called Serra do Mar, on its northern boundary. At this point, Serra do Mar rises sharply to a height of more than 700 m, is 5 km wide and acts as an important barrier to the atmospheric circulation. Vieira and Gramani (2015) provide a technical description of the Cubatão and Serra do Mar features.

Site 3, the Iperó municipality, is more than 130 km away from the coast, as shown in Fig. 2. It is inside a predominantly rural area and about 10 km away from the urban zone of the Sorocaba municipality. Another important characteristic of this site is the Araçoiaba hill to the southeast, which rises more than 300 m, reaching an altitude of up to 900 m. The Araçoiaba hill is inside a Federal Conservation Unit called Ipanema National Forest.

3 Results

The surface strongly affects the atmospheric circulation within the planetary boundary layer (PBL). Thus, we plotted the wind rose for the first observational level (40 m) in an attempt to identify similarities and differences among the three sites. In this study, the wind rose shows the direction from which the wind blows (as typically used in meteorology). The circulation patterns are similar at Sites 1 and 3 (Figs. 3 and 5). Both present a diurnal cycle in which the wind turns through 360°. We see this diurnal cycle in Fig. A1, which illustrates a 30 d time series of wind direction. Most of the time, the wind turns throughout the day, except for short periods identified by the red circles, when the winds remain mainly from the south–southeast and are related to postfrontal events. The sea breeze (southeast wind) is one of the main reasons for the pattern of Fig. 3 at Site 1 (Ribeiro et al., 2018). According to Ribeiro et al. (2018), two main conditions inhibit the sea breeze from reaching the São Paulo Metropolitan Region (SPMR): the prefrontal circulation and cloudiness. Cloudiness decreases the thermal contrast between the sea and the land, and the prefrontal circulation opposes the sea breeze. Thus, excluding those two conditions, the sea breeze often advances over the SPMR throughout the year, which explains the wind rose pattern (Fig. 3). Even at 40 m above the surface, the winds are weak and rarely reach 8 m s⁻¹. However, the low-level jet (LLJ) is a typical feature of the SPMR (Sánchez et al., 2022), and the power and logarithmic laws fail to extrapolate the wind speed profile in the LLJ environment when compared to machine-learning methods (Bodini and Optis, 2020 a, b).

https://wes.copernicus.org/articles/9/1431/2024/wes-9-1431-2024-f03

Figure 3. Observed wind at 40 m at Site 1 (normalized wind rose). The wind speed is indicated by the legend (m s⁻¹).
