the Creative Commons Attribution 4.0 License.
Analysing uncertainties in offshore wind farm power output using measure–correlate–predict methodologies
Michael Denis Mifsud
Tonio Sant
Robert Nicholas Farrugia
This paper investigates the uncertainties resulting from different measure–correlate–predict (MCP) methods to project the power and energy yield from a wind farm. The analysis is based on a case study that utilises short-term data acquired from a lidar wind measurement system deployed at a coastal site in the northern part of the island of Malta and long-term measurements from the island's international airport. The wind speed at the candidate site is measured by means of a lidar system. The predicted power output for a hypothetical offshore wind farm from the various MCP methodologies is compared to the actual power output obtained directly from the input of lidar data to establish which MCP methodology best predicts the power generated.
The power output from the wind farm is predicted by inputting wind speed and direction derived from the different MCP methods into windPRO^{®} (https://www.emd.dk/windpro, last access: 8 May 2020). The predicted power is compared to the power output generated from the actual wind and direction data by using the normalised mean absolute error (NMAE) and the normalised mean-squared error (NMSE). This methodology will establish which combination of MCP methodology and wind farm configuration will have the least prediction error.
The best MCP methodology which combines prediction of wind speed and wind direction, together with the topology of the wind farm, is that using multiple linear regression (MLR). However, the study concludes that the other MCP methodologies cannot be discarded as it is always best to compare different combinations of MCP methodologies for wind speed and wind direction, together with different wake models and wind farm topologies.
The measure–correlate–predict (MCP) methodology introduces uncertainty due to its inherent statistical nature. Recent developments have seen the introduction of new computational regression techniques such as artificial neural networks (ANNs) and machine learning, which include decision trees (DTs) and support vector regression (SVR). In a previous study, light detection and ranging (lidar) data were used to compare the results of the various regression methodologies at different lidar measurement heights (Mifsud et al., 2018), with the reference site being Malta International Airport (MIA), Luqa, and the candidate site being a coastal watch tower at Qalet Marku on the northern part of the island. This study uses the same wind data for the year 2016 to construct the MCP models. However, this time the prediction is carried out for both wind speed and wind direction. Wind speed and direction are then predicted for the period June–December 2015. This is done for the different MCP models. The predicted wind speed and wind direction time series are then fed into a wind farm model implemented in windPRO^{®} version 2.7 to model the overall energy yield, considering wake losses. The power output for various wind farm configurations is obtained for each methodology. As the lidar is sited on the roof of a coastal tower, at a height of 20 m above mean sea level, the wind data measured at a height of 80 m would be equivalent to a wind turbine (WT) hub height of 100 m above the sea surface.
The power output in each case is compared to that obtained when the actual wind data are fed to the wind farm model. Thus, the NMAE, the NMSE and the percentage error in the overall energy yield are compared for the various methodologies and wind farm topologies. This is therefore a study about the uncertainties introduced by the various statistical methods, which are then further complicated by the wind farm layout. It is innovative due to the use of an MCP methodology to predict both the wind speed and the wind direction. The following literature review describes different MCP methodologies, four of which are then used in the prediction of wind speed and wind direction. The wake models are also described. This is followed by a description of the methodology used in the study, together with a description of the hypothetical wind farm used as a basis for this study. Finally, the results are presented and discussed.
The first MCP methods estimated the mean long-term annual wind speed (Carta et al., 2013). MCP methods later made use of simple linear regression (SLR) (Rogers et al., 2005a) to establish a relationship between hourly wind characteristics of the candidate and the reference sites. A multiple linear regression is a regression model that involves more than one regressor variable (Montgomery et al., 2006). The regression is carried out using concurrent wind speed and wind direction data at the reference and the candidate sites. The reference site is normally the closest meteorological station, e.g. airports, and the candidate site is the location chosen for the wind farm. When the model is created, hence establishing a relationship between the wind speed at both sites, the long-term wind data at the reference site can be used to predict the long-term wind speed at the candidate site. More recent models established non-linear-type relationships (Clive, 2004; Carta and Velazquez, 2011) by employing statistical learning (Hastie et al., 2009). Amongst these are algorithms such as artificial neural networks (ANNs) (Bilgili et al., 2007; Monfared et al., 2009) and the more recent machine-learning (ML) techniques, which include support vector regression (SVR) (Oztopal, 2006; Zhao et al., 2010; Scholkopf and Smola, 2002; Alpaydin, 2010) and decision trees (DTs) (James et al., 2015; Alpaydin, 2010).
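The SLR form of the MCP procedure described above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of candidate-site speed against reference-site speed over the concurrent period; the function names and the wind speed values are hypothetical and are not taken from the study.

```python
from statistics import mean

def fit_slr(ref, cand):
    """Least-squares fit cand ~ intercept + slope * ref over the concurrent period."""
    ref_mean, cand_mean = mean(ref), mean(cand)
    sxx = sum((x - ref_mean) ** 2 for x in ref)
    sxy = sum((x - ref_mean) * (y - cand_mean) for x, y in zip(ref, cand))
    slope = sxy / sxx
    intercept = cand_mean - slope * ref_mean
    return intercept, slope

def predict(model, ref_long_term):
    """Apply the fitted relationship to long-term reference-site data."""
    intercept, slope = model
    return [intercept + slope * x for x in ref_long_term]

# Hypothetical concurrent hourly wind speeds in m/s (illustration only).
ref_concurrent = [3.0, 5.0, 7.0, 9.0]
cand_concurrent = [4.0, 6.5, 9.0, 11.5]

# Measure and correlate: build the model from the concurrent period.
model = fit_slr(ref_concurrent, cand_concurrent)

# Predict: feed long-term reference data through the model.
cand_long_term = predict(model, [4.0, 6.0, 8.0])
```

The same three stages apply to the non-linear methods; only the fitted function changes.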
A study (Carta et al., 2013) reviewed many MCP methodologies. These included the method of ratios, first-order linear regression, higher-than-first-order linear methods, non-linear methods and probabilistic methods. The authors were also concerned with the uncertainties associated with MCP methodologies and argued that users of MCP methodologies have little information with which to determine the uncertainty of the methodology. One methodology to measure this uncertainty is to use the full set of data from the concurrent period to train the model and assess its quality.
Another study by Rogers compared four different MCP methodologies (Rogers et al., 2005a). These included a linear regression model, the distributions of ratios of the wind speeds at the two sites, an SVR model and another method based on the ratio of the standard deviations of the two data sets. The authors concluded that SVR gave the best results. In a different study, the same authors (Rogers et al., 2005b) also analysed the uncertainties introduced with the use of MCP techniques. They concluded that linear regression methodologies could seriously underestimate uncertainties due to serial correlation of data. Another study shows that a proper assessment of uncertainty is critical for judging the feasibility and risk of a potential wind farm development, and the authors describe the risk of oversimplifying and assuming uncertainties (Lackner et al., 2012).
A hybrid MCP method (Zhang et al., 2014), which involved adding different weights depending on the distance and elevation of the candidate site to the reference sites, was applied to the input of five MCP methodologies. The methods used consisted of the linear regression, variance ratio, Weibull scale, ANNs and SVR. The results were assessed in terms of metrics such as the mean-squared error and mean absolute error. Other authors (Perea et al., 2011) evaluated three methodologies. One method included a linear regression, which was derived from the bivariate normal joint distribution and the Weibull regression method. The other method was based on conditional probability density functions applied to the joint distributions of the reference and the candidate sites. The results from these two methodologies were in turn compared to SVR. Although the conclusion was that the SVR method predicted all the parameters very accurately, the probability density function based on the Weibull distribution was better in terms of prediction accuracy.
The ability of ANNs to recognise patterns in complex data sets means that they can also be used to correlate and predict wind speed and wind direction (Zhang et al., 2014). A neural network contains an input layer, one or more hidden layers of neurons and an output layer. A learning process updates the weights of the interconnections and biases between the neurons in the various layers. The Levenberg–Marquardt (Principe et al., 2000) algorithm may be used for this purpose. The regression is performed by means of feedforward networks (Alpaydin, 2010) with multilayer perceptrons (MLPs).
Another study (Velazquez et al., 2011) utilised wind speed and direction from various reference stations. These were introduced into the input layer of an ANN. It was concluded that when wind direction was used as an angular magnitude to the input signal, the model gave better results. Estimation errors also decreased as the number of reference stations was increased. The authors concluded that ANNs are superior to other methods for predicting long-term wind data.
Bechrakis et al. (2004) also investigated the use of ANNs for long-term predictions, using wind speed and direction measurements from just one reference station, and compared the results to those of standard MCP algorithms. This resulted in an improved prediction accuracy of 5 % to 12 %. Unfortunately, many models that use various reference stations use only the recorded wind speeds as input. The topologies of the ANNs used have only a single neuron in the input layer, with the output signal being the wind speed at the candidate site (Monfared et al., 2009; Oztopal, 2006; Bilgili et al., 2009).
Data from meteorological stations possessing long measurement periods provide a large number of potential inputs for MCP methods. Apart from wind speed and direction, inputs can also include other climatological variables such as air temperature, relative humidity and atmospheric pressure. Hence, a multivariate MCP methodology may be utilised (Patane et al., 2011). This technique considers all the inputs and extracts the maximum amount of information at the sites. Since some input variables may be intercorrelated, or may not provide information about the target site wind characteristics, the methodology is a two-stage process. Input variables are analysed, and those that contain little or redundant information about the candidate site wind characteristics are discarded, after which a multivariate regression is performed. It was concluded from the results of the tests made that the methodology was more accurate than standard MCP methods, with the quality of the estimation of the long-term wind resource increasing by 19 %.
SVR is the adaptation of support vector machines to the regression problem. This technique was developed by Vapnik (Vapnik, 1995; Vapnik et al., 1998) to solve classification problems. SVR (Alpaydin, 2010) is popular within the renewable energy community since it is a unique way to construct smooth and nonlinear regression approximations (Diaz et al., 2017). The analysis of MCP models using SVR techniques shows that SVR is one of the techniques which best represents the ML state of the art (Diaz et al., 2017). This is not only due to its prediction capability, but also to its property of universal approximation to any continuous function and an efficient and stable algorithm that provides a unique solution to the estimation problem (Diaz et al., 2017). Different hyperparameters were used to study the SVR methodology. Other studies describe how SVR may be adapted to wind speed prediction (Zhao et al., 2010).
Another recent study shows the importance of DTs in improving the regression results for MCP (Diaz et al., 2018). The study applied five different MCP techniques to mean hourly wind speed and direction, together with air density, using the data from 10 weather stations in the Canary Islands. The study showed that the models using SVR and DTs provided better results than ANNs. A DT is a hierarchical data structure which implements the “divide and conquer” rule, and it may also be applied to the regression problem (Hastie et al., 2009; Alpaydin, 2010; James et al., 2015).
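The "divide and conquer" rule can be illustrated with the smallest possible regression tree, a single split (a "stump"). The sketch below is an assumption-laden illustration rather than the DT configuration used in any of the cited studies: it searches for the one threshold on the reference wind speed that minimises the summed squared error and predicts the mean candidate speed on each side. The data and function names are hypothetical, and at least two distinct input values are assumed.

```python
from statistics import mean

def best_split(x, y):
    """Return (threshold, left_mean, right_mean) minimising the summed squared error."""
    pairs = sorted(zip(x, y))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate equal inputs
        left = [p[1] for p in pairs[:k]]
        right = [p[1] for p in pairs[k:]]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, value):
    """Predict with the mean of whichever side of the threshold the input falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if value < threshold else right_mean

# Hypothetical reference and candidate wind speeds in m/s (illustration only).
ref_speed = [2.0, 3.0, 4.0, 8.0, 9.0, 10.0]
cand_speed = [2.5, 3.5, 3.0, 9.0, 10.0, 11.0]

stump = best_split(ref_speed, cand_speed)
```

A full regression tree applies the same search recursively to each side of the split until a stopping rule is met.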
The use of lidar for wind resource assessment (Probst and Cardenas, 2010) shows a distinct advantage of this method over the traditional cup and wind vane measurements. This is demonstrated by studies carried out using different MCP methods such as SLR and ratio analysis. However, no analysis with ANNs, DTs or SVR is carried out. A more recent study (Mifsud et al., 2018), which utilised the same data as this current study, analysed the accuracy of different MCP methodologies and their capability according to lidar measurement height. The study concluded that the MCP accuracy depended on both methodology and measurement height at the candidate site. Other studies using lidar at the same measurement site were also carried out. These analysed the turbulent behaviour of the wind data (Cordina et al., 2017).
The issue of wake losses in a wind farm has been described by several authors and can be minimised by optimising the layout of the wind farm (Manwell et al., 2009). A short literature review of wake models is now presented.
Wake models are classified into four categories (Manwell et al., 2009), which are surface roughness models (Bossanyi et al., 1980), semi-empirical models (Lissaman and Bates, 1977; Vermeulen, 1980), eddy viscosity models (Ainslie, 1985) and Navier–Stokes solutions (Crespo and Hernandez, 1986, 1993). A review of wind turbine wake models (Sanderse, 2009) shows the effects of reduced power production due to lower incident wind speed and the effect on the wind turbine rotors due to increased turbulence. The author presents a number of reasons why numerical simulation is preferred to experimentation; this is mainly due to the use of computational fluid dynamics (CFD). One study presents the mathematical theory behind a simple wake model and that for a multiple wake model (Gonzalez-Longatt et al., 2012), while another study (Churchfield, 2013) describes a hierarchy of wake models ranging from the empirical to large-eddy simulation (LES). Some of the models compared include Ainslie's model (Ainslie, 1985), Frandsen's model (Frandsen, 2005) and Jensen's model (Jensen, 1983). The dynamic wake meandering model is another method which is described (Larsen et al., 2008) and also validated (Larsen et al., 2013) in a study carried out on the Egmond aan Zee offshore wind farm. Another study (Barthelmie et al., 2006) compares wake model simulations for offshore wind farms with the wake profiles measured by sonic detection and ranging (sodar). In this case, the models gave a wide range of predictions, and it was not possible to identify a model with superior projections with respect to the measurements.
In some studies, it is necessary for any wake model used to be straightforward, dependent on relatively few wake measurements and economic in terms of the necessary computing power. Despite their relative simplicity, these models tend to give results which are in reasonable agreement with the available data in the case of a single wake within a small wind farm and a simple meteorological environment. In addition, a comparison of different wake models does not suggest any particular difference in terms of accuracy between the sophisticated and simplified models (Manwell et al., 2009).
The use of wake models can also be illustrated by considering a semi-empirical model (Katić et al., 1986) that is often used for wind farm output predictions. This model attempts to characterise the energy content in the flow field whilst ignoring the details of the exact nature of the flow field, which is assumed to consist of an expanding wake with uniform velocity deficit that decreases with distance downstream (Manwell et al., 2009).
The N.Ø. Jensen wake model (Jensen, 1983) is a simple wake model based on the assumption of a wake with a linear wake cone. The results from this model are comparable to experimental results.
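A linearly expanding wake with a uniform velocity deficit can be sketched as follows. The expression used is the form commonly attributed to Katić et al. (1986), in which the wake radius grows as r0 + k·x and the fractional deficit is (1 − √(1 − Ct)) / (1 + k·x/r0)². The wake decay constant k = 0.04 and the thrust coefficient Ct = 0.8 are illustrative assumptions, not values reported in this study; the rotor diameter and downwind spacing are those of the hypothetical wind farm.

```python
import math

def wake_radius(rotor_radius, distance, decay_constant):
    """Radius of the linearly expanding wake cone at a given downstream distance."""
    return rotor_radius + decay_constant * distance

def velocity_deficit(rotor_radius, distance, decay_constant, thrust_coefficient):
    """Fractional velocity deficit (1 - v/U), uniform across the wake."""
    initial_deficit = 1.0 - math.sqrt(1.0 - thrust_coefficient)
    expansion = 1.0 + decay_constant * distance / rotor_radius
    return initial_deficit / expansion ** 2

ROTOR_DIAMETER_M = 126.0   # rotor diameter of the 5 MW turbine in the text
DECAY_CONSTANT = 0.04      # assumed value, for illustration only
THRUST_COEFFICIENT = 0.8   # assumed value, for illustration only

# Deficit seen one row downwind, at a spacing of eight rotor diameters.
deficit_8d = velocity_deficit(ROTOR_DIAMETER_M / 2, 8 * ROTOR_DIAMETER_M,
                              DECAY_CONSTANT, THRUST_COEFFICIENT)
wake_speed_ratio = 1.0 - deficit_8d  # v/U inside the wake
```

Because the deficit falls with the square of the wake expansion, increasing the downwind spacing reduces wake losses rapidly at first and more slowly thereafter.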
Several metrics may be used to evaluate the accuracy of the models (Rogers et al., 2005a), and it is important to employ more than one metric (Santamaria-Bonfil et al., 2016) to perform the evaluation. The lower the value of the metric, the better the performance of the model. In this case the NMAE and the NMSE were used to quantify the performance of the model. The purpose of using normalised values is to provide results which are independent of wind farm sizes (Madsen et al., 2005).
The NMAE is suitable to describe the errors which are uniformly distributed around the mean, also revealing the average variance between the true value and the predicted value (Hu et al., 2013). The NMAE applies the same weight to the individual errors. The NMSE is a measure of the extent of the dispersion of the errors around the mean and gives a higher weight to larger errors. It assumes that the errors are unbiased and follow a normal distribution (Santamaria-Bonfil et al., 2016). The percentage error of the energy yield gives an estimate of the accuracy of the model for predicting the total energy generated by the wind farm over the period of evaluation. Since each metric has disadvantages that can lead to inaccurate evaluation of the results, it is not recommended to depend only on one measure (Shcherbakov et al., 2013).
MCP methods are based on regression techniques. Regression can be performed by using MLR. However, as mentioned above, several more powerful techniques exist, amongst which are ANNs, SVR and DTs. While MCP methodologies have been developed for wind speed, they cannot be directly used for predicting wind direction (Bosart and Papin, 2017). Nothing has been found in the literature on MCP techniques that explicitly mentions prediction of wind direction at the candidate site. MCP methodologies are normally used to predict the wind speed magnitude at the candidate site, but not the direction. The components of the wind velocity may be negative (if one considers the wind as a vector), whereas the MCP methodology normally considers only the positive value of the wind, i.e. its magnitude. Breaking the wind speed vector into its respective components is a way of using a regression methodology to predict the wind direction. The methodology used creates a regression model using the wind velocity vector components at the reference site to predict the wind vector components at the candidate site (Bosart and Papin, 2017).
The methodology is based upon a simple relationship between the meteorological wind direction θ_{met} and the mathematical wind direction θ_{math} such that
in which the wind speed vector V_{i} can be broken down into its vector components such that
in which case the values of u_{i} and v_{i}, which may be either positive or negative depending on the direction of the wind (the value of θ_{met}), are the wind components in the north (y) and the east (x) directions (axes). The relationship is shown in Fig. 1.
Also,
The regression is carried out between the respective components of the wind velocity in the y and x directions, hence establishing a relationship between the components at both sites. The forecasted wind direction at the candidate site is then obtained from the forecasted wind components using the relationship in Eq. (5):
The value of the angle ${\mathit{\theta}}_{{\mathrm{met}}_{{i}_{\mathrm{p}}}}$ depends on the direction of ${\mathit{u}}_{{i}_{\mathrm{p}}}$ and ${\mathit{v}}_{{i}_{\mathrm{p}}}$ as shown in Fig. 2 and in accordance with the relationships shown in Eq. (6),
and Eq. (7),
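The decomposition and the recovery of the wind direction can be sketched as below. This is a minimal illustration under stated assumptions: the meteorological direction is taken as the direction the wind blows from, measured clockwise from north, so that the mathematical angle of the flow vector is 270° − θ_met; the components are those of the flow vector; and the function names are hypothetical. A single atan2 call is used in place of the quadrant-by-quadrant rules, which is a substitution made here for brevity.

```python
import math

def to_components(speed, theta_met_deg):
    """Split a wind vector into (north, east) components of the flow.

    theta_met_deg is the direction the wind blows FROM, clockwise from north,
    so the flow vector points the opposite way (hence the minus signs).
    """
    theta = math.radians(theta_met_deg)
    north = -speed * math.cos(theta)
    east = -speed * math.sin(theta)
    return north, east

def to_speed_direction(north, east):
    """Recover speed and meteorological direction from the flow components."""
    speed = math.hypot(north, east)
    # atan2 resolves the quadrant from the signs of both components.
    theta_met = math.degrees(math.atan2(-east, -north)) % 360.0
    return speed, theta_met

# A hypothetical 8 m/s north-westerly wind (blowing from 315 degrees).
north_component, east_component = to_components(8.0, 315.0)
speed_back, direction_back = to_speed_direction(north_component, east_component)
```

For a north-westerly wind the flow is directed towards the south-east, so the north component is negative and the east component is positive, which is why the components may take either sign.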
The results are compared by using the NMAE and the NMSE of the residuals, using Eqs. (8) to (12). The residuals e_{i} are the errors between the predicted and the actual output power values from the wind farm,
The formula used to calculate the NMAE is shown in Eq. (9), whereby the errors are normalised by dividing by the average power production over the whole period of evaluation (Madsen et al., 2005):
The NMSE is given by
where
and
The percentage error in overall energy yield is given by Eq. (13), where
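The three metrics can be sketched in a few lines. The NMAE follows the description in the text (mean absolute residual divided by the average actual power over the evaluation period). The NMSE normalisation by the product of the mean actual and mean predicted power is one common definition and is an assumption here, as is the sign convention of the percentage error (positive for over-prediction). The power values are hypothetical.

```python
def nmae(actual, predicted):
    """Mean absolute residual divided by the mean actual power."""
    n = len(actual)
    mean_actual = sum(actual) / n
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / (n * mean_actual)

def nmse(actual, predicted):
    """Mean squared residual divided by (mean actual * mean predicted) power.

    Assumed normalisation; other definitions divide by the variance instead.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    mean_predicted = sum(predicted) / n
    squared = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return squared / (n * mean_actual * mean_predicted)

def energy_error_percent(actual, predicted):
    """Percentage error in total energy; hourly powers sum to energy."""
    return 100.0 * (sum(predicted) - sum(actual)) / sum(actual)

# Hypothetical hourly wind farm power outputs in MW (illustration only).
actual_power = [100.0, 200.0, 300.0]
predicted_power = [110.0, 190.0, 330.0]

nmae_value = nmae(actual_power, predicted_power)
nmse_value = nmse(actual_power, predicted_power)
energy_error = energy_error_percent(actual_power, predicted_power)
```

In this example the over- and under-predictions partly cancel in the energy error but not in the NMAE or NMSE, which is one reason for reporting more than one metric.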
4.1 The reference and candidate sites
The reference site employed in this study is the Meteorological Office at Malta International Airport (MIA), Luqa. The candidate site data were collected by a ZephIR 300 lidar (https://www.zxlidars.com/windlidars/zx300/, last access: 10 May 2020) unit administered by the University of Malta's Institute for Sustainable Energy. The unit was situated on the roof of a coastal watch tower at Qalet Marku, in the northern part of the island of Malta (Mifsud et al., 2018). The relative location of the two sites is shown in Fig. 3, while Fig. 4 shows a satellite image of the location of the coastal watch tower.
Tables 1 and 2 show the properties of the candidate and the reference sites respectively (Cordina et al., 2017; Mifsud et al., 2018). In this case the wind data measured by the lidar at a height of 80 m would be equivalent to a cumulative height of 100 m above sea level, which would be the hub height of the wind turbines in the wind farm. This is because the lidar is situated on the rooftop of a coastal tower at a height of 20 m above sea level, as shown in Table 3.
4.2 The available wind data
The measurement campaign at the candidate site started on 1 July 2015 and ended on 31 December 2016. Hourly wind data were available for this time period from both the reference and candidate sites. The ideal number of data points used to create the MCP models is thus 8784, i.e. the number of hours in 2016. Following analysis and filtration of the wind speed data at the candidate site, 98 % of the data were considered suitable for the creation of the model. The data at the reference site were all considered suitable. Hence, the regression model was created using the 8616 concurrent wind speed and direction values. For the year 2015, 95.6 % of the data were considered valid (the measurement campaign started on 26 June 2015; hence there were 4368 h of wind speed and direction measurements of which 4176 were valid data points).
The MCP analysis was carried out using both wind speed and wind direction. The data from the reference site were used as the independent data set. The models were created using the data for the year 2016, while the reference site wind data for 2015 were used to create the predicted wind speed and wind direction as inputs to the wind farm model.
4.3 The wind farm design in windPRO^{®}
The hypothetical wind farm is located opposite the coastal watch tower of Qalet Marku (35.945892^{∘} N, 14.452498^{∘} E). WindPRO^{®} 2.7 was used to render an image of the wind farm onto an image of the lidar unit taken from the watch tower. This gives an indication as to the extent of the wind farm. This is shown in Fig. 5 while Fig. 6 shows the satellite imagery of the wind farm, showing a 250 MW capacity wind farm. The wind farm faces the northwest direction, which is the prevailing wind direction.
The wind farm is made up of 50 wind turbines. There are 10 wind turbines in a row, having a crosswind spacing of five rotor diameters (5D). The distance between the successive rows of wind turbines, or the downwind spacing, is eight rotor diameters (8D). Thus, considering wind turbines with a rotor diameter, D, of 126 m (for a 5 MW wind turbine), the distance between the turbines in the crosswind direction is 630 m, and the distance between successive rows of wind turbines in the downwind direction is 1008 m. The wind turbine selected for use in windPRO^{®} is the REpower 5 MW wind turbine whose parameters are shown in Table 3.
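The layout arithmetic above can be reproduced with a short sketch. It places the turbines on a regular grid in a local frame (crosswind, downwind) measured from the first turbine; the rotation of the grid towards the north-west is not modelled. How the smaller capacities are obtained is not stated in the text, so building them from whole rows of ten turbines is only an assumption made for illustration.

```python
ROTOR_DIAMETER_M = 126.0
CROSSWIND_SPACING_M = 5 * ROTOR_DIAMETER_M   # 5D = 630 m
DOWNWIND_SPACING_M = 8 * ROTOR_DIAMETER_M    # 8D = 1008 m
TURBINES_PER_ROW = 10
RATED_POWER_MW = 5.0

def layout(rows):
    """Return (crosswind, downwind) offsets in metres for each turbine."""
    return [(col * CROSSWIND_SPACING_M, row * DOWNWIND_SPACING_M)
            for row in range(rows)
            for col in range(TURBINES_PER_ROW)]

# Full wind farm: five rows of ten turbines.
farm = layout(5)
capacity_mw = len(farm) * RATED_POWER_MW

# Assumed smaller configurations, one whole row at a time.
capacities_mw = [len(layout(rows)) * RATED_POWER_MW for rows in range(5, 0, -1)]
```

Under this assumption the five configurations correspond to five, four, three, two and one rows respectively.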
Figure 7 shows the methodology applied in this paper. The study is divided into three steps as follows.

Step 1. The various MCP methodologies are used to compute the MCP model. For wind speed, the models are trained using wind speed and direction data at candidate and reference sites for the year 2016. For the wind direction, the input training data are the wind velocity vector component in the north or east direction at the reference site, and the output of the model is the respective component at the candidate site. The models are summarised in Table 4, which describes the inputs used to train the respective models, for both wind speed and wind direction. It also shows the parameters of the models and the algorithms used to train the model, such as least squares for MLR and the Levenberg–Marquardt algorithm for ANNs.

Step 2. The 2015 wind speed and wind direction are predicted using the models computed in Step 1. The predicted and actual wind speed and wind direction are used to compute the power output from the wind farm. This is done by feeding the wind speed and direction data into the windPRO^{®} model.

Step 3. Compute and compare the normalised mean-squared error (NMSE), the normalised mean absolute error (NMAE) and the percentage error in the overall energy yield.
The combinations of lidar measurement heights and MCP methodologies are shown in Table 5.
Regression models were created for the MCP methodologies using the reference and candidate wind speed and direction for the year 2016. These regression models were created using MLR, ANNs, DTs and SVR. A model was created for both wind speed and direction.
The wind speed and wind direction for 2015 were then predicted with the models by feeding the speed and direction values from the reference site from the year 2015. Thus, a sequence of predicted wind speeds and wind direction time series could be compared to the actual speed and direction measured at the candidate site for the year 2015. The models for the wind speed and the wind direction are independent of each other.
In the case of wind direction, the MCP methodologies are applied as shown in Figs. 8 and 9. Figure 8 shows that two regressions are carried out: one for the magnitude of the wind component in the north direction and one for the wind component in the east direction. Thus, two models are created using the wind speed and direction data of the reference and the candidate sites for 2016. The two models are then used to derive the predicted wind direction for 2015 at the candidate site as shown in Fig. 9, by using the wind components at the reference site for 2015 as inputs to the respective models. The values of the wind speed in the north direction and the east direction are first predicted, and the wind direction at the candidate site for 2015, ${\mathit{\theta}}_{{\mathrm{met}}_{\mathrm{p}}}$, is then derived from the mathematical relationships given in Eqs. (6) and (7).
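The two-model procedure for wind direction can be sketched with a least-squares MLR. The sketch assumes that each candidate-site component is regressed on both reference-site components; the actual model inputs are those listed in Table 4 and may differ. The training data are synthetic, generated from a known linear relation so that the fit can be checked, and all function names are hypothetical.

```python
import math

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(rows, target):
    """Least squares for target ~ b0 + b1*x1 + b2*x2 via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, target)) for i in range(k)]
    return solve(xtx, xty)

def predict_mlr(coeffs, row):
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))

# Synthetic (north, east) flow components at the reference site, in m/s.
ref_train = [(-4.0, 2.0), (-2.0, 5.0), (3.0, 1.0), (1.0, -3.0), (-5.0, -1.0)]
# Synthetic candidate components built from an arbitrary linear relation.
cand_north = [0.5 + 1.2 * n + 0.1 * e for n, e in ref_train]
cand_east = [-0.3 + 0.2 * n + 1.1 * e for n, e in ref_train]

# One regression model per component.
north_model = fit_mlr(ref_train, cand_north)
east_model = fit_mlr(ref_train, cand_east)

# Predict both components for a new reference observation, then the direction
# the wind blows from (clockwise from north), using the flow-vector convention.
ref_new = (-3.0, 4.0)
north_pred = predict_mlr(north_model, ref_new)
east_pred = predict_mlr(east_model, ref_new)
theta_met_pred = math.degrees(math.atan2(-east_pred, -north_pred)) % 360.0
```

Because the two component models are fitted separately, any error in either component propagates into the derived direction, which is most sensitive when both predicted components are close to zero.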
The sequences of wind speed and wind directions (both actual and predicted) were fed into the wind farm model. This was done for different combinations of methodology and wind farm (250, 200, 150, 100 and 50 MW) configurations. The results were compared to determine which combination of MCP methodology and wind farm capacity would give the lowest prediction error. The prediction error for the power output from the wind farm is analysed using the normalised mean-squared error (NMSE), the normalised mean absolute error (NMAE) and the percentage error in the overall energy yield for the period of analysis. The results are shown in the following section.
A summary of the results is shown below where sequences of data for a specific period of 2015 are compared. These sequences are for wind speed, wind direction and power output. All NMSE, NMAE and percentage errors in the overall energy yield are then shown in the following tables.
6.1 Wind speed and wind direction with MCP methodology
6.1.1 Wind speed with MCP methodology
Figures 10 to 13 show the wind speed from the period 23–30 November 2015. The particular period is chosen because of the high availability of wind. The actual wind data are compared with those predicted by the MLR, ANN, DT and SVR methodologies. The predicted wind values closely follow the actual wind values, for all the MCP methodologies applied.
6.1.2 Wind direction with MCP methodology
Figures 14 to 17 show the wind direction from the period 23–30 November 2015. As above, the actual wind direction at the candidate site is compared to that predicted by the MLR, ANN, DT and SVR methodologies. Again, as in the case for wind speed, there is a similarity between the actual and predicted wind direction values, in all cases.
6.2 Wind farm power output with MCP methodology, for a wind farm capacity of 250 MW
Figures 18 to 21 compare the output power from the wind farm, which is derived from the actual wind speed and wind direction, to the power output derived from the predicted wind speed and direction. This comparison is carried out for the MLR, ANN, DT and SVR methodologies. The results for a wind farm capacity of 250 MW are being shown. As in the case for wind speed and direction, the predicted power output closely follows that obtained with the actual wind speed and direction.
A wind data analysis, carried out using windPRO^{®}, is shown in the next section. The results presented are a Weibull distribution for wind speed and the wind rose. These charts are computed from the wind speed and direction which are predicted by using the MLR, ANN, DT and SVR MCP methodologies. Thus, the predicted wind speed and direction are compared with the results computed from the actual wind data.
6.3 The actual wind data for 2015 measured by the lidar system
Figure 22 shows the wind data analysis report from windPRO^{®} for the actual lidar data measured at the 80 m level height (equivalent to a hub height of 100 m). The images show the Weibull distribution for the wind speed and the wind rose. The reports are used to compare the properties of the actual wind measurements and the predicted wind speed and direction.
6.4 Wind speed and direction predicted using the MCP methodologies
Figures 23 to 26 represent the Weibull distribution and the wind rose for the wind speed and direction predicted by the MLR, ANN, DT and SVR MCP methodologies respectively at the hub height of 100 m. There is a similarity between the Weibull plots for the actual wind data and those for the predicted wind speed, for the same measurement period. Meanwhile, the wind direction predicted by the ANN and DT methodologies shows a higher resemblance to that of the actual wind direction than that predicted by the MLR or SVR methodologies. Hence it is expected that the ANN and the DT methodologies would yield the least error in the predicted power output from the wind farm.
The results for the NMAE, the NMSE and the percentage error in the overall energy yield are summarised in Tables 6 to 8. The tables show that the MLR and ANN methodologies have the best performance in NMAE, NMSE and percentage error for energy yield. The results are consistent for all wind farm capacities under consideration. ANNs are better than MLR in the case of NMAE, while MLR is slightly better than ANNs in the case of the 50 MW wind farm capacity. MLR is superior to ANNs in the case of NMSE for all wind farm capacities. However, the differences between the MLR and the ANN methodologies are minimal, and both methodologies show a better performance than the DT or SVR methodologies, especially in the case of the overall energy yield as shown in Table 8. Graphical results are also shown in Figs. 27 to 29.
The ANN methodology also shows the closest similarity to the actual wind speed and wind direction, as seen in Fig. 24. In the case of the overall energy yield, the MLR and ANN methodologies show a significant improvement in percentage error over the DT and SVR methodologies. The ANN methodology is better than the MLR methodology only for the 250 MW wind farm capacity. The MLR methodology gives better results for the 200, 150, 100 and 50 MW wind farm capacities, with a percentage error of 3.65 % at a wind farm capacity of 50 MW, compared with an error of 7.3 % obtained with the ANN methodology.
The metrics therefore show that the best methodology for predicting the output power from the wind farm is MLR, closely followed by ANNs.
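The three error measures used in Tables 6 to 8 can be sketched as follows. This is an illustration under stated assumptions rather than the exact definitions used in the study: the residual is taken as e_{i} = P − P_{act}, it is normalised by the installed wind farm capacity, and the energy yield error is taken as an absolute percentage of the actual yield. The function names and the `capacity` parameter are hypothetical.

```python
def nmae(predicted, actual, capacity):
    """Normalised mean absolute error (assumed normaliser: installed capacity, MW)."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / (n * capacity)


def nmse(predicted, actual, capacity):
    """Normalised mean-squared error: mean of the squared normalised residuals."""
    n = len(actual)
    return sum(((p - a) / capacity) ** 2 for p, a in zip(predicted, actual)) / n


def energy_error_pct(predicted, actual):
    """Absolute percentage error in overall energy yield.

    With a constant time step, the step length cancels, so sums of power suffice.
    """
    e_act = sum(actual)
    return 100.0 * abs(sum(predicted) - e_act) / e_act


# Hypothetical four-step power series (MW) for a 50 MW wind farm.
p_pred = [10.0, 20.0, 30.0, 40.0]
p_act = [12.0, 18.0, 33.0, 37.0]
print(nmae(p_pred, p_act, 50.0), nmse(p_pred, p_act, 50.0), energy_error_pct(p_pred, p_act))
```

In this hypothetical series the point-wise residuals cancel in the sum, so the energy yield error is zero although the NMAE and NMSE are not. This is why the three measures are reported side by side.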
7 Conclusions
The above research combined MCP methodologies for predicting the wind speed with a separate method for predicting the wind direction at a candidate site. Three of the four MCP methodologies used are based on modern statistical learning methodologies. The reference site data were collected at the island of Malta's international airport, while the candidate site data were collected by means of a lidar wind measurement system placed on the rooftop of a coastal building.
The wind direction at the candidate site was predicted with the various MCP methodologies by breaking down the wind velocity vector into its respective north and east direction components. The regression analysis was then carried out on the respective components at the reference and the candidate sites. The wind speed is predicted by using the magnitude of the wind speed at the respective sites for creating the regression model. The projected wind speed and direction time series were applied to a hypothetical wind farm. Thus, the error introduced by the four MCP methods could be measured. This was done by calculating the NMAE, the NMSE and the percentage error in the wind farm's energy yield. The results show that the NMAE, NMSE and percentage error in energy yield depend on the MCP methodology and the wind farm capacity and can be used to establish an optimal MCP methodology.
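The component-based prediction of wind direction described above can be sketched as follows. This is a simplified illustration, not the study's implementation: it assumes the meteorological convention (direction the wind blows from, clockwise from north), the conversion θ_{met} = (270° − θ_{math}) mod 360°, and a simple least-squares line per component in place of the MLR, ANN, DT or SVR models. All function names are hypothetical.

```python
import math


def to_components(speed, direction_met_deg):
    """Split a wind vector into easterly (u) and northerly (v) components."""
    rad = math.radians(direction_met_deg)
    return -speed * math.sin(rad), -speed * math.cos(rad)


def to_met_direction(u, v):
    """Recover the meteorological direction from the components."""
    theta_math = math.degrees(math.atan2(v, u))
    return (270.0 - theta_math) % 360.0


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def predict_direction(ref_train, can_train, ref_new):
    """Regress candidate components on reference components, then rebuild direction.

    Each argument is a list of (speed, meteorological direction) pairs;
    ref_train and can_train must be concurrent.
    """
    ref_uv = [to_components(s, d) for s, d in ref_train]
    can_uv = [to_components(s, d) for s, d in can_train]
    a_u, b_u = fit_line([p[0] for p in ref_uv], [p[0] for p in can_uv])
    a_v, b_v = fit_line([p[1] for p in ref_uv], [p[1] for p in can_uv])
    directions = []
    for s, d in ref_new:
        u, v = to_components(s, d)
        directions.append(to_met_direction(a_u + b_u * u, a_v + b_v * v))
    return directions
```

Regressing on components rather than on the angle itself avoids the discontinuity between 359° and 0°, since u and v vary smoothly as the wind veers through north.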
In this case, the best MCP method was that which used MLR. Although other MCP methodologies gave larger errors, they cannot be totally discarded. It is always best to compare methodologies by analysing residuals and errors and then to choose the best methodology on a case-by-case basis. In this case the results from the ANN methodology were very close to those of the MLR methodology, while the DT and SVR methodologies gave larger errors.
This analysis can only be carried out when actual wind data are available, as the uncertainty is obtained by comparing the wind farm energy computed from predicted wind data with that computed from actual wind data. The above analysis was possible because 18 months of data were available, rather than the 12 months that is usual for a wind resource assessment which uses MCP methodologies.
The above study was limited to using the same MCP methodology for both the wind speed and the direction and to the N.Ø. Jensen methodology for wake losses. The layout chosen was one that ensured a recommended minimum distance between the wind turbines. Different combinations of MCP methodologies for wind speed and direction can be examined.
In this case, an MCP model was created for wind speed, and two more MCP models were created for the wind speed components, which were then used to calculate the wind direction. Another possible method is to calculate the magnitude of the wind speed from the models used to calculate the wind direction. This was done, but the results from the first method were by far superior to those from the latter method. The reason for this still needs to be investigated as part of future work, and the results are not presented in this paper. Having three models also allows the possibility of using different combinations of MCP methodologies, e.g. using MLR for wind speed and ANNs for wind direction. This was also performed for a limited number of combinations and is also the subject of further research.
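The two routes to a wind speed estimate, and the pairing of a speed model with separate component models, can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that the predicted component series are already available from whichever MCP models are being combined, and that the meteorological direction follows from θ_{met} = (270° − θ_{math}) mod 360°.

```python
import math


def speed_from_components(u_pred, v_pred):
    """Alternative route: wind speed as the magnitude of the predicted components."""
    return [math.hypot(u, v) for u, v in zip(u_pred, v_pred)]


def combine(speed_direct, u_pred, v_pred):
    """Route used in the study: pair a directly modelled speed with the
    direction implied by the two component models.

    The speed and the components may come from different MCP methodologies.
    """
    series = []
    for s, u, v in zip(speed_direct, u_pred, v_pred):
        theta_math = math.degrees(math.atan2(v, u))
        series.append((s, (270.0 - theta_math) % 360.0))
    return series
```

Feeding both speed series, with the same direction series, through the same wind farm model would allow the difference between the two routes to be quantified with the error measures used in this study.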
Another area which warrants further study is trying out different wind farm topologies or selecting different wind turbines and different hub heights. It would also be of interest to study the application of different wake methodologies as a possible means of decreasing the uncertainties.
ANN  Artificial neural network 
CFD  Computational fluid dynamics 
DTs  Decision trees 
Lidar  Light detection and ranging 
LES  Large-eddy simulation 
MCP  Measure–correlate–predict 
MIA  Malta International Airport 
MLR  Multiple linear regression 
MLP  Multilayer perceptron 
MSE  Mean-squared error 
NMAE  Normalised mean absolute error 
NMSE  Normalised mean-squared error 
SLR  Simple linear regression 
Sodar  Sonic detection and ranging 
SVR  Support vector regression 
WT  Wind turbine 
V_{i}  Magnitude of wind speed in metres per second 
${e}_{{\mathrm{norm}}_{i}}$  Normalised residual 
e_{eng}  Percentage error in energy yield 
e_{i}  Residual, MW 
${\mathit{u}}_{{i}_{\mathrm{p}}}$  Predicted component of wind speed vector in the easterly direction at the candidate site in metres per second 
${\mathit{u}}_{{i}_{\mathrm{ref}}}$  Component of wind speed vector in the easterly direction at the reference site in metres per second 
u_{i}  Component of wind speed vector in the easterly direction in metres per second 
${\mathit{v}}_{{i}_{\mathrm{can}}}$  Component of wind speed vector in the northerly direction at the candidate site in metres per second 
${\mathit{v}}_{{i}_{\mathrm{p}}}$  Predicted component of wind speed vector in the northerly direction at the candidate site in metres per second 
${\mathit{v}}_{{i}_{\mathrm{ref}}}$  Component of wind speed vector in the northerly direction at the reference site in metres per second 
v_{i}  Component of wind speed vector in the northerly direction in metres per second 
z_{0}  Surface roughness 
V_{i}  Wind speed vector (speed in metres per second and wind direction in degrees) 
${\mathit{\theta}}_{{\mathrm{math}}_{{i}_{\mathrm{p}}}}$  Predicted mathematical wind direction at the candidate site in degrees 
${\mathit{\theta}}_{{\mathrm{met}}_{{i}_{\mathrm{p}}}}$  Predicted meteorological wind direction at the candidate site in degrees 
${\mathit{\theta}}_{{\mathrm{met}}_{\mathrm{can}}}$  Meteorological wind direction at the candidate site in degrees 
${\mathit{\theta}}_{{\mathrm{met}}_{\mathrm{ref}}}$  Meteorological wind direction at the reference site in degrees 
θ_{math}  Mathematical wind direction 
θ_{met}  Meteorological wind direction 
D  Wind turbine diameter, m 
N  Number of data points 
P  Predicted power output from wind farm, MW 
P_{act}  Actual power output from wind farm, MW 
The data used in this research are available upon request.
TS and RNF contributed to the preparation of the manuscript and the research methodology.
The authors declare that they have no conflict of interest.
This article is part of the special issue “Wind Energy Science Conference 2019”. It is a result of the Wind Energy Science Conference 2019, Cork, Ireland, 17–20 June 2019.
Joseph Schiavone from the Meteorological Office at Malta International Airport, Luqa, is acknowledged for providing the data for the Luqa MIA weather station. The authors would like to express their sincere gratitude to Manuel Aquilina, lab officer at the University of Malta, for technical assistance in collecting and organising the data from the Institute for Sustainable Energy's lidar system at Qalet Marku. Thanks also go to Din L'Art Helwa for permitting and facilitating the installation of the lidar unit on the Qalet Marku Tower.
The lidar system was purchased through the European Regional Development Fund (grant no. ERDF 335), partially financed by the European Union. The windPRO^{®} 2.7 software was funded by the project “Setting up of Mechanical Engineering Computer Modelling and Simulation Laboratory”, partially financed by the European Regional Development Fund (ERDF project no. 79) – Investing in Competitiveness for a Better Quality of Life, Malta 2007–2013.
This paper was edited by Zhen Gao and reviewed by two anonymous referees.
Ainslie, J.: Calculating the Flowfield in the Wake of Turbines, J. Wind Eng. Ind. Aerodyn., 27, 216–224, 1985.
Alpaydin, E.: Introduction to Machine Learning, 2nd Edn., Massachusetts Institute of Technology, MIT Press, Cambridge, Massachusetts, chap. 9, 185–207, 2010.
Barthelmie, R., Folkrts, G., Larsen, G., Rados, K., Pryor, S., Frandsen, S., Lange, B., and Schepers, G.: Comparison of Wake Model Sumulations with Offshore Wind Turbine Wake Profiles Measured by Sodar, J. Atmos. Ocean. Technol., 23, 888–901, 2006.
Bechrakis, D., Deane, J., and MCKeogh, E.: Wind Resource Assessment of an Area using ShortTerm Data Correlated to a LongTerm DataSet, Sol. Energ., 76, 724–32, 2004.
Bilgili, M., Sahlin, B., and Yasar, A.: Application of Artificial Neural Networks for the Wind Speed Prediction of Target Station Using Artificial Intelligent Methods, Renew. Energ., 32, 2350–2360, 2007.
Bilgili, M., Sahin, B., and Yaser, A.: Application of Artificial Neural Networks for the Wind Speed Prediction of Target Station using Reference Stations Data, Renew. Energ., 34, 845–848, 2009.
Bosart, L. and Papin, P.: Statistical Summary, ATM 305, available at: https://www.atmos.albany.edu/daes/atmclass/atm305/2017 (last access: 11 May 2020), 2017.
Bossanyi, E., Maclean, C., Whitle, G., Dunn, G., Lipman, N., and Musgrove, P.: The Efficiency of Wind Turbine Clusters, Proceedings of the Third International Symposium on Wind Energy Systems Lyngby, DK, 1980.
Carta, J. and Velazquez, S.: A New Probabilistic Method to Estimate the LongTerm Wind Speed Characteristics at a Potential Wind Energy Conversion Site, Energy, 36, 2671–2685, 2011.
Carta, J., Velazquez, S., and Cabrera, P.: A Review of MeasureCorrelatePredict (MCP) methods used to Estimate LongTerm Wind Characteristics at a Target Site, Renew. Sustain. Energ. Rev., 27, 362–400, 2013.
Churchfield, M.: A Review of Wind Turbine Wake Models and Future Directions, Boulder, Colorado, National Renewable Energy Laboaory, 2013.
Clive, J.: Nonlinearity of MCP with Weibull Distributed Wind Speeds, Wind Eng., 28, 213–24, 2004.
Cordina, C., Farrugia, R., and Sant, T.: Wind Profiling using LiDAR at a Costal Location on the Mediterranean Island of Malta, 9th European Seminar OWEMES, Bari, Italy, 2017.
Crespo, A. and Hernandez, J.: A Numerical Model of Wind Turbine Wakes and Wind Farms, Proceedings of the 1986 European Wind Energy Conference, Rome, 1986.
Crespo, A. and Hernandez, J.: Analytical Correlations for Turbulence Characteristics in the Wakes of Wind Turbines, Proceedings of the 1993 European Community Wind Energy Conference, Lubeck, 1993.
Diaz, S., Carta, J., and Matias, J.: Comparison of Several MeasureCorrelatePredict Models using Support Vector Regression Techniques to estimate wind power densities, A case study, Energ. Conv. Manag., 140, 334–354, 2017.
Diaz, S., Carta, J., and Matias, J.: Performance Assessment of Five MCP Models Proposed for the Estimation of Longterm Wind Turbine Power Outputs at a Target Site Using Three Machine Learning Techniques, Appl. Energ., 209, 455–477, 2018.
Fransden, S.: Turbulence and TurbulenceGenerated Structural Loading in Wind Turbine Clusters, Riso National Laboratory, RISOR1188(EN), Roskilde, Danmark, 2005.
GonzalezLongatt, F., Wall, P., and Terzija, V.: Wake effect in wind farm performance: Steady State and Dynamic Behaviour, Renew. Energ., 39, 329–338, 2012.
Hastie, T., Tibshirani, R., and Friedman, J.: The Elements of Statistical Learning, Data Mining, Infernence and Prediction, 2nd Edn., Springer Series in Statistics, New York, USA, 2009.
Hu, J., Wang, J., and Zeng, G.: A Hybrid Forecasting Approach Applied to Wind Speed Time Series, Renew. Energ., 60, 185–194, https://doi.org/10.1016/j.renene.2013.05.012, 2013.
James, G., Witten, D., Hastie, T., and Tibshirane, R.: An Introduction to Statistical Learning with Applications in R, Springer Texts in Statistics, New York, 2015.
Jensen, N.: A note on Wind Generator Interaction, Riso National Laboratory, RISOM2411, Riso National Laboratory, 4000 Roskilde, Denmark, 1983.
Katić, I., Hojstrup, J., and Jensen, N. O.: A Simple Model for Cluster Efficiency, in: EWEC'86 Proceedings, edited by: Palz, W. and Sesto, E., Vol. 1, 407–410, 1986.
Lackner, M., Rogers, A. L., and Manwell, J.: Uncertainty Analysis in Wind Resource Assessment and Wind Energy Production Estimation, 45th AIAA Aerospace Sciences and Exhibition, Reno, Nevada, American Institue of Aeronautics and Astronautics, Inc., Reno, Nevada, https://doi.org/10.2514/6.20071222, 2012.
Larsen, G., Madsen, H. A., Larsen, T. J., and Troldborg, N.: Wake Modelling and Simulation, Riso National Laboratory for Sustainable Energy, Technical University of Denmark, 2008.
Larsen, T. J., Madsen, H. A., Larsen, G. C., and Hansen, K. S.: Validation of the Dynamic Wake Meander Model for Loads and Power Production in the Egmond ann Zee Wind Farm, Wind Energ., 16, 605–624, 2013.
Lissaman, P. and Bates, E.: Energy Effectiveness of Arrays of Wind Energy Conversion Systems, AeroVironment Report, Pasadena, CA, 1977.
Madsen, H., Pinson, P., Kariniotakis, G., Nielsen, H., and Nielsen, T.: Standardizing the Performance Evaluation of ShortTerm Wind Power Prediction Models, Wind Eng., 29, 475–489, https://doi.org/10.1260/030952405776234599, 2005.
Manwell, J., McGowan, J., and Rogers, A.: Wind Energy Explained, 2nd Edn., John Wiley and Sons Ltd., West Sussex, England, chap. 9, 407–448, 2009.
Mifsud, M., Sant, T., and Farrugia, R.: A Comparison of MeasureCorrelatePredict Methodologies using LiDAR as a Candidate Site Measurement Device for the Mediterranean Island of Malta, Renew. Energ., 127, 947–959, 2018.
Monfared, M., Rastegar, H., and Kojabadi, H.: A New Strategy for Wind Speed Forecasting Using Artificial Intelligent Methods, Renew. Energ., 34, 845–848, 2009.
Montgomery, D., Peck, E., and Vinning, G.: Introduction to Linear Regression Analysis John Wiley and Sons, Inc., Hoboken, New Jersey, chap. 2 and 3, 12–122, 2006.
Oztopal, A.: Artificial Neural Network Approach to Spatial Estimation of Wind Velocity, Energ. Conv. Manag., 47, 395–406, 2006.
Patane, D., Benso, M., Hernandez, C., de la Blanca, F., and Lopez, C.: Long Term Wind Resource Assessment by means of Multivariate CrossCorrelation Analysis, Proceedings of the European Wind Energy Conference and Exhibition, Brussels, Belgium, 2011.
Perea, A., Amezucua, J., and Probst, O.: Validation of Three New MeasureCorrelate Predict Models for the LongTerm Prospection of the Wind Resource, J. Renew. Sustain. Energ., 3, 1–20, 2011.
Principe, J., Euliano, N., and Curt Lefebvre, W.: Neural and Adaptive Systems: Fundamentals Through Simulations, John Wiley and Sons, Inc., New York, chap. 3, 100–172, 2000.
Probst, O. and Cardenas, D.: State of the Art and Trends in Wind Resource Assessment, Energies, 3, 1087–1141, 2010.
Rogers, A., Rogers, J., and Manwell, J.: Comparison of the Performance of four MeasureCorrelatePredict Models for LongTerm Prosepection of the Wind Resource, J. Wind Eng. Ind. Aerodyn., 93, 243–264, https://doi.org/10.1016/j.jweia.2004.12.002, 2005a.
Rogers, A., Rogers, J., and Manwell, J.: Uncertainties in Results of MeasureCorrelatePredict Analyses, Am. Wind Energ. Assoc., Denver Colorado, available at: https://www.researchgate.net/publication/237439775_Uncertainties_in_Results_of_MeasureCorrelatePredict_Analyses (last access: 13 May 2020), 2005b.
Sanderse, B.: Technical Report ECNE09016, Netherlands, available at: https://publications.tno.nl/publication/34628948/7k46Ov/e09016.pdf (last access: 14 May 2020), 2009.
SantamariaBonfil, G., ReyesBallestros, A., and Gershenson, C.: Wind Speed Forecasting for Wind Farms: A Method Based on Support Vector Regression, Renew. Energ., 85, 790–809, https://doi.org/10.1016/j.renene.2015.07.004, 2016.
Scholkopf, B. and Smola, A.: Learning with Kernels – Support Vector Machines, Regularisation, Optimisation and Beyond, Cambridge, Massachusetts, The MIT Press, chap. 1, 1–22, 2002.
Shcherbakov, M., Brebels, A., Shcherbakova, N., Tyukov, A., and Janovsky, T.: A Survey of Forecast Error Measures, World Appl. Sci. J., 24, 171–176, 2013.
Vapnik, V.: The Nature of Statistical Learning Theory, NY, Springer, 123–167, 1995.
Vapnik, V., Golowich, S., and Smola, A.: A Support Vector Method for Function Approximation, Regression Estimation and Signal Processing, Adv. Neural Inf. Proc. Syst., 9, 281–287, 1998.
Velazquez, S., Carta, J., and Matias, J.: Comparision between ANNs and Linear MCP Algorithms in the LongTerm Estimation of the Cost per kW h Produced by a Wind Turbine at a Candidate Site: A Case Study in the Canary Islands, Appl. Energ., 88, 3869–3881, 2011.
Vermeulen, P.: An Experimental Analysis of Wind Turbine Wakes, Preceedings of the Third International Symposium on Wind Energy Systems, 431–450, Lyngby, DK, 1980.
Zhang, J., Chowdhury, S., Messac, A., and Hodge, B.M.: A Hybrid MeasureCorrelatePredict Method for LongTerm Wind Condition Assessment, Energ. Conv. Manag., 87, 697–710, 2014.
Zhao, P., Xia, J., Dai, Y., and He, J.: Wind Speed Prediction Using Support Vector Regression, The 5th IEEE Conference in Industrial Electronics and Applications (ICIEA), Taiwan, IEEE, 2010.