Development of an image processing method for wake meandering studies and its application on data sets from scanning wind lidar and large-eddy simulation

Wake meandering studies require knowledge of the instantaneous wake shape and its evolution. Scanning lidar data are used to identify the wake pattern behind offshore wind turbines but do not immediately reveal the wake shape. The precise detection of the wake shape and centerline helps to build models predicting wake behavior. The conventional Gaussian fit methods are reliable in the near-wake area but lose precision with the distance from the rotor and require good data resolution for an accurate 5 fit. The thresholding methods usually imply a fixed value or manual selection of a threshold, which hinders the wake detection on a large data set. We propose an automatic thresholding method for the wake shape and centerline detection, which is less dependent on the data resolution and can also be applied to the image data. We show that the method performs reasonably well on large-eddy simulation data and apply it to the data set containing lidar measurements of the two wakes. Along with the wake detection method, we use image processing statistics, such as 10 entropy analysis, to filter and classify lidar scans. The image processing method is developed to reduce dependency on the supplementary reference data such as wind speed and direction. We show that the centerline found with the image processing is in a good agreement with the manually detected centerline and the Gaussian fit method. We also discuss a potential application of the method to separate the near and far wakes and to estimate the wake direction.

Wake meandering studies require knowledge of the instantaneous wake shape and its evolution. Scanning lidar data are used to identify the wake pattern behind offshore wind turbines but do not immediately reveal the wake shape. The precise detection of the wake shape and centerline helps to build models predicting wake behavior. The conventional Gaussian fit methods are reliable in the near-wake area but lose precision with the distance from the rotor and require good data resolution for an accurate 5 fit. The thresholding methods usually imply a fixed value or manual selection of a threshold, which hinders the wake detection on a large data set. We propose an automatic thresholding method for the wake shape and centerline detection, which is less dependent on the data resolution and can also be applied to the image data.
We show that the method performs reasonably well on large-eddy simulation data and apply it to the data set containing lidar measurements of the two wakes. Along with the wake detection method, we use image processing statistics, such as entropy analysis, to filter and classify lidar scans. The image processing method is developed to reduce dependency on the supplementary reference data such as wind speed and direction. We show that the centerline found with the image processing is in a good agreement with the manually detected centerline and the Gaussian fit method. We also discuss a potential application of the method to separate the near and far wakes and to estimate the wake direction.

15
A wake is a complex dynamic structure forming behind a wind turbine due to the kinetic energy extraction from the incoming wind flow. The wake region is characterized by the decreased wind speed and the increased turbulence intensity. The relative velocity deficit rapidly decreases to 20% at the downstream distance of five rotor diameters (5D). Further downstream, the recovery to the free flow is considerably slowed down; at the same time, the wake width increases up to 3D according to in situ observations (Aitken et al. (2014)). The typical turbine spacing in the wind farms is usually 8D, although the optimal spacing 20 is estimated to be higher in order to reduce the wake effect on downstream turbines (Meyers and Meneveau (2012) ;Stevens (2016)). Since the generated wind power is proportional to the cube of the wind speed U 3 , the power production gradually decreases if the incoming wind speed drops below the rated wind speed. The increased turbulence intensity negatively affects the turbine fatigue loads (Lee et al. (2012)). Studying the wake behavior is hence crucial to estimating both the actual power production and the overall lifetime of a wind farm. 25 Not only the wake expands, but it is also subjected to the wake meandering -oscillations along the rotor axis caused by the movement of large eddies (Larsen et al. (2007(Larsen et al. ( , 2008). While the near wake remains primarily stable and follows the wind direction, the far wake oscillates randomly in the horizontal plane with an amplitude exceeding 0.5D (Howard et al. (2015);Foti et al. (2016)). The far wake also oscillates in the vertical plane, although the velocity fluctuations there are weaker (España et al. (2011)). As a result, a downstream turbine is exposed to intermittent flow and, consequently, unequal fatigue loads (Muller et al. 30 (2015); Moens et al. (2019)). Additionally, the wake in the Northern hemisphere slightly turns clockwise due to the Coriolis effect (Abkar and Porté-Agel (2016); van der Laan and Sørensen (2017)) adding more complexity to the wake evolution over time. Knowing only the velocity deficit at a certain downstream distance is insufficient since the wake meandering strength is characterized by the standard deviation of the wake center. Therefore, the wake meandering analysis requires the knowledge of the wake centerline to quantify the instantaneous wake effect on the downwind structures. An appropriate detection method 35 should be able to separate the wake from the free flow and estimate the wake centerline and its statistical characteristics. The method application and their capabilities are highly dependent on the data available.
In situ measurements using scanning lidars provide the most relevant data on the wind flow in a particular wind farm (Trujillo et al. (2010(Trujillo et al. ( , 2011; Herges et al. (2017)). Due to the technical restrictions and costs of lidar installation, it is complicated to obtain a three-dimensional scan of the flow around the whole wind farm, although the flow can be reconstructed for a 40 single turbine (Beck and Kühn (2019)). Still, the measurement campaigns span along several months and require data preprocessing to sort out invalid measurements. A controlled experiment can be performed on a wind tunnel for model validation or reproduction of specific flow conditions (Snel et al. (2007); Chamorro and Porté-Agel (2010)). The particle image velocimetry (PIV) provides good spatial and temporal resolution of the measured wind field but still deals with the scaled models and has to account for their limitations. A different approach is running a large-eddy simulation (LES) of a wind turbine or a wind 45 farm. While LES provides a wide range of possibilities to simulate atmospheric conditions and wind farm configuration, its representation of a wake strongly depends on the implemented turbulence closure (Moriarty et al. (2014); Mehta et al. (2014); Martínez-Tossas et al. (2018)) and wind turbine model (Porté-Agel et al. (2011);Martínez-Tossas et al. (2015)). A relatively new development is quantitative study of wind farm wakes from the satellite data (Ahsbahs et al. (2020)). The satellites generally have a lower spatial resolution than scanning lidars and measure wind speed on the horizontal near-surface plane but 50 still provide general information on the flow around wind farms.
Several wake detection methods exist, varying in the complexity and input data requirements (Quon et al. (2020)). Among the variety of wake detection methods, we focus on the thresholding and Gaussian fitting because they are applicable to a 2D lidar scan in a horizontal or inclined plane. The most common wake detection method is fitting a Gaussian distribution to the velocity deficit across the wake at various downstream positions and get estimations of the wake center and width from the 55 fitted function (Fleming et al. (2014); Vollmer et al. (2016); Krishnamurthy et al. (2017)). The method can be applied both to the averaged and instantaneous wake, although the irregular wake shape of the latter complicates the detection. For better accuracy, the fitting requires wind speed data in a fine spatial resolution. A sufficient spatial resolution is achieved by large-A fixed threshold value does not account for wind speed distribution and data quality which may be a common issue for a lidar scan. The fitting methods also depend on the data quality and spatial resolution and may require additional data such as wind direction. We, therefore, refer to image processing techniques that depend less on the data availability and can be applied both to the image or processed wind speed data. We propose an image processing method with an automated threshold estimation, previously developed for the whitecaps detection -Adaptive Thresholding Segmentation method (ATS) (Bakhoday-70 Paskyabi et al. (2016)). We adapt the method for the wake applications and develop new routines to estimate the wake centerline without a priori knowledge of the wind direction.
This study focuses on the technical aspect of the ATS method and discusses its advantages and limitations. The method is applied to the scanning lidar data set containing wakes from two wind turbines and various wake-wake or wake-turbine interactions. The measurement site and lidar setup are described in Sect. 2, and the data preprocessing is reviewed in Sect. 3. In 75 the same section, we preview diagnostic techniques using entropy to evaluate and classify the data. In addition to the lidar data, we also describe wake detection based on LES data as a proof of concept. The application of the image processing method to the wake and centerline detection is detailed in Sect. 4. We then apply the image processing techniques to the lidar data and compare the result with the manual wake detection and the Gaussian fit method in Sect. 5. The findings are summarized in Sect. 6. In the Appendix, we briefly discuss the differences between wake detection from the lidar scan data and the respective 80 grayscale image.

Site description and measurement setup
In this study, we use radial wind speed data recorded with a scanning Doppler wind lidar Leosphere WindCube 100S during the Offshore Boundary-Layer Experiment at FINO1 (OBLEX-F1) campaign. The FINO1 platform is located in the North Sea at 54 • 00' 53.5" N 6 • 35' 15.5" E, 45 km to the north of the German island of Borkum. The installed lidar is oriented towards 85 the alpha ventus wind farm.
The alpha ventus wind farm consists of 12 wind turbines arranged in a rectangular pattern (Fig. 1). The wind turbines AV1−AV6 are of the type Repower 5M with a hub height of 92 m and a rotor diameter of 126 m; AV7−AV12 are of the type AREVA M5000 with a hub height of 91.5 m and a rotor diameter D of 116 m. The row and column distances between the turbines vary within 800−850 m, approximately seven rotor diameters, 7D. The distance between FINO1 and the closest wind 90 turbine, AV4, is 405 m. The closest scanned wind turbine, AV7, is located at 919 m or 7.92D from FINO1. The lidar is installed at 23.5 m above sea level and operates in a Plan Position Indicator (PPI) scanning mode. In this mode, the azimuth of the lidar beam changes between 131.5 • and 179.5 • at an elevation angle of 4.62 • . The lidar scans the southwestern sector of the alpha ventus wind farm and captures wake patterns from two wind turbines, AV7 and AV10. The third wind turbine, AV11, stays outside of the lidar range in most scans, but a part of its wake is visible for the specific wind 95 directions. The wind turbine AV7 is scanned near the hub height at approximately 97 m. The farther wind turbines AV10 and AV11 are scanned above the top of the blade tip at 158 m and 188 m, respectively.
In addition to the lidar data, we use the wind speed and wind direction time series from the FINO1 meteorological mast as a reference. The wind speed and direction are measured at 90 m and 100 m, respectively.

Lidar measurements at alpha ventus
The lidar measurements partially cover the day of September 24, 2016 and capture a variety of wake-wake interactions. The data set used in this study contains the first 20−22 minutes of each hour. The consecutive lidar scans are separated by approximately 45 s -the time required for the lidar to finish one scan. Overall, the data set contains 600 lidar scans, which are split into 24 subsets of 25 scans each. For the simplicity of presentation and referring, we number the lidar scans from 1 to 600 (Table 1).

105
The image processing algorithm accepts the input data as a grayscale image. The wind speed data is adjusted to the range of [0, 1] to imitate the grayscale intensity as where U is the wind speed value measured at a point, U min and U max are minimum and maximum wind speed registered in a particular lidar scan. For the lowest wind speed U = U min (potential wake points), I = 1 denotes the points with the 110 highest intensity. Similarly, the highest wind speed for U = U max (free-flow points), I = 0 indicates the points with the lowest intensity. Figure 2 presents an example lidar scan plotted in polar coordinates and then transformed into Cartesian coordinates. The wake detection is performed on the data stored in the polar coordinates. The resulting data are converted to the Cartesian coordinates for a better presentation.

115
Working with lidar scans, we encounter two types of noise: wind speed fluctuations not caused by the wake and high wind speed due to the measurement error. The lidar measures radial velocity, which can be represented through three directional wind speed components u, v, and w, and the information on the line of sight of the lidar beam, given by the azimuth φ and elevation angle θ: When the wind blows along the lidar's line of sight, the measured radial velocity is essentially the horizontal wind speed. In the case of crosswind -the wind direction is close to perpendicular to the line of sight -the radial velocity tends to zero. The lidar cannot measure the crosswind, and measurements taken during the crosswind event are prone to errors compared to other wind directions.
When plotted against the reference wind direction, the reference wind speed and mean wind speed of a lidar scan show  Occasionally, we also observe weaker spikes in the wind speed value, most of which are localized at the position of a wind 130 turbine AV10, implying a measurement error due to the lidar beam reflection from rotating blades. The reference and mean radial wind speeds remain in the good agreement for the wind directions below 210 • despite containing spikes in the wind speed data. Nevertheless, the outliers cause an intensity skew when the wind speed data are normalized to the range of [0, 1] ( Fig. 4). The intensity distribution peak moves to the right, with the far left tail containing occasional low bumps caused by the spikes (Fig. 4d).

135
In the example, the middle scan (Fig. 4b) has a wind speed spike of 15 m s −1 , while the reference wind speed reaches 5.8 m s −1 . The wind speed measured in the spike region stays below 7 m s −1 . The resulting lidar scan image is less contrast compared to the adjacent lidar scans.
To preserve the uniformity between consequent lidar scans of the same subset, we perform despiking -detection and removal of the spikes. The spikes are detected based on the wind speed value and the difference with the adjacent points. We delete 140 all values higher than 30 m s −1 and check the remaining data for the local maximums. We consider a wind speed of 7 m s −1 enough to designate a local maximum as a spike. When a spike consists of a single or double point, it is replaced by a NaN value and the resulting gap is filled by interpolation to retain the continuous wind field. Larger clusters of high wind speed values are considered noise, gap filling after removal is not performed.
Since the lidar is oriented towards the closest wind turbine, a string of missing values -a wind turbine 'shadow' -is always 145 present in the lidar scans regardless of the wind direction. The shadow rarely crosses wind turbine wakes and did not noticeably affect the performance of the wake detection methods. Hence we do not perform a gap filling to remove the shadow in addition to the despiking.

Information entropy and data classification
We introduce entropy criteria as an alternative to using reference data for quality control. The entropy application ranges from 150 finding a threshold (Pun (1981)) to object classification in an image (e.g., satellite map segmentation by Long and Singh (2013)). Here, we calculate it primarily for the diagnostic purposes and data classification into subsets.
The information entropy is a measure of noise in the data. It can be calculated for the whole data set as well as across the rows or columns of a rectangular matrix containing 2D data. We apply Shannon entropy S (Shannon (1948)) as follows:

155
where P (x i ) is the probability density function (PDF) of the variable x i (here -intensity) to occur in the data. If the entropy tends to zero, it indicates uniform data. High entropy value implies disturbances in the lidar scan due to wakes or noise.
However, too strong disturbances caused by measurement errors are reflected as entropy decrease due to the approach we use.
As shown earlier, the outliers cause intensity skew after normalization to the range of [0, 1] (Fig. 4b). With the maximum values of 100−1000 m s −1 , the intensity skew after normalization becomes even stronger. The valid wind speed measurements become indistinguishable and the lidar scan in question is perceived to be uniform, hence its entropy is lower than for the scans containing mostly valid measurements.
For the directional entropy, we select either wind speed values across the beam range or across the azimuth and calculate a PDF for this sample to pass it to the entropy function. An example directional entropy is presented in Fig. 5. The top and the left parts of the scan in polar coordinates do not contain wakes, hence the entropy calculated for the respective rows and columns 165 is lower than for the wake regions. The entropy calculated across the azimuth (Fig. 5a) is higher for columns containing both wakes instead of one due to higher disturbance rate. An additional entropy increase near the azimuth of 130−140 • can be explained by high noise at the lidar scan border. The entropy calculated across the beam range (Fig. 5c) has a prominent peak for the AV7 wake. The AV10 wake produces two less prominent peaks indicating thinner wake spread along longer distance.
We can utilize changes in the entropy to detect common features in the lidar scans. We calculate the entropy across the azimuth and beam range for all lidar scans before preprocessing and combine them into two plots (Fig. 6). The respective wind turbine positions are marked on the right axis. The lower color bar limit is adjusted for better presentation, while the actual entropy values reach zero for certain lidar scans. The entropy calculated across the beam range highlights several lidar scans with a significant entropy decrease compared to other cases (Fig. 6a). Those scans are also characterized by the measurements corrupted due to the crosswind effect. The spiked data also leads to entropy decrease 175 although not as gradual. The corresponding scans can be seen as occasional vertical stripes in (Fig. 6), e.g., series of scans after #400.
The remaining subsets have a similar entropy distribution across the beam range. A wake from the wind turbine AV7 can be seen as an increase of entropy near the turbine's position. A weaker increase of entropy can be also seen for AV10, for example, in scans #51−175.

180
The entropy calculated across the azimuth ( The low entropy criteria agree well with the crosswind criteria on which scans are likely to contain a high amount of corrupted data. In general, the scans with a high data corruption rate can be easily identified based on the percentage of the data points exceeding a specific wind speed limit. The total number of corrupted scans is five subsets, each containing 25 scans, 190 i.e., about 1/5 of the total number of scans. Classification of the remaining valid scans requires either prior knowledge of the reference wind direction (which may be unavailable if we work with the image data) or visual evaluation of the wake features (which may be complicated for a large data set). Entropy criteria can potentially simplify the classification. Using the entropy and intensity histograms, we classify the subsets into the following groups: 1. Parallel wakes subset, Fig. 7a: The wakes do not interact with each other. Some noise may occur at the lidar scan's 195 border due to the wind direction approaching the value when the crosswind effects start. Since the wakes propagate towards this border, the entropy calculated across the azimuth shows a consistent increase near azimuth of 131 • . The entropy calculated across the beam range shows a distinctive peak due to AV7 wake. The intensity histogram tends to be more symmetrical than in other subsets and has a peak close to the average intensity of 0.5. Parallel wakes are the most common case for this data set.

200
2. Aligned wakes subset, Fig. 7b: The wind blows along the line connecting wind turbines AV7 and AV10 so that the former is subjected to a wake. Compared to the parallel wakes subset, the histogram peak is shifted to the left. The histogram peak may split into two narrow peaks located close to each other when the wakes are not perfectly aligned.
The entropy across both the azimuth and the beam range is more uniform than for the parallel wakes subset due to the absence of the border noise. The overview of the subsets and reference values is presented in Fig. 7 and Table 1 containing wind speed, wind direction 215 and entropy averaged over each subset. A sample histogram averaged for a typical subset from each group is shown in Fig. 8.

LES setup
We have also performed a large-eddy simulation to verify the performance of the image processing algorithm and compare it against the Gaussian method described further in Sect. 4.3. For that purpose, we use the PALM LES code with a built-in actuator disc with rotation (ADR) wind turbine model (Maronga et al. (2020)). The results produced with the model were 220 shown to reproduce the wake shape rather accurately (Vollmer et al. (2015)). Particularly, the model resolves the double peak in the wake deficit distribution near the rotor.
The domain contains 2304×576×192 points and has the horizontal grid spacing of 4 m. The vertical spacing below 600 m is also 4 m, above 600 m the vertical spacing is stretched with a factor of 1.08, capped at maximum 8 m grid cell height. The reference NREL 5MW wind turbine has a hub height of 102 m and a diameter of D r =126 m and is placed in the center of the domain so that the wake length can reach up to 20D r . The surface temperature is 277 K and increases by 1 K per 100 m.
Neither heat flux nor surface heating are activated. During the simulation the turbulence intensity reaches 6.6%.
The LES is used solely to generate idealized wake data. No direct comparison to the lidar data is performed. The automatic thresholding methods aim to split an image into background (in our case -free flow) and foreground (wake).
Despite lidar data having large amount of disturbances in the free flow, the wind speed distribution in a single scan tends to have one peak. A single peak limits the applicability of the common thresholding methods that search for the local minimum of a bimodal histogram (Otsu (1979) to the normalized wind field, a binary matrix W P is constructed from I as follows: The intensity threshold can be converted back to the wind speed threshold U th by: The normalized wind speed data can be represented as an intensity histogram (Fig. 9c). Let H(x) for x ∈ [0, 1] be the cumulative distribution function (CDF) of the intensity data. Then H (x) and H (x) are its first and second derivatives, 245 respectively. With respect to the definition of intensity I in Eq. (1), the wake points are located in the histogram's tail, while the free-flow points form a peak on the left side. The transition region where the peak tends to the tail is a good choice to look for a threshold. We detect the threshold at the point where the CDF slope is close to constant, i.e., the curvature C(x) approaches zero.

250
The curvature graph tail (Fig. 9d) may fluctuate and complicate the detection of the zero curvature. Instead, we look at the first and second derivatives separately. The threshold value is selected as an inflection point at the right side of the second derivative graph (Fig. 9e). A similar point in the first derivative graph can be used as a complementary value to improve accuracy. In the case of the lidar data, the derivative plots may have strong oscillations. Therefore, we fit a polynomial function on the range between intensity I k corresponding to the most prominent local extremum and maximum intensity I max = 1. We The estimated inflection points should lie close to each other. We select the threshold as an average value between first and second derivative inflection points to smooth the detection outcome. Alternatively, for example in the case of a smooth one-peak histogram, only a second derivative inflection point can be used.

Wake detection using image processing techniques
In the thresholded image, it can be expected that the near wake will be the largest continuous structure of all detected shapes due to the highest wake deficit. The presumed wake shape is also most likely to contain a wind turbine within it. Points identified as a wake can be extracted for further analysis, either as a shape or a borderline contour. The wake centerline is then defined as a middle line of the wake shape. To fully automate the centerline detection, we require the wind turbine coordinates, and 265 preferably, the wind direction. The wind direction is particularly useful to distinguish wake shapes from noise, as it allows to exclude shapes detected in the upwind direction as false positives. Nevertheless, the centerline can be also estimated without the prior knowledge of wind direction.
We demonstrate the centerline detection on a simple case of 10-minute averaged LES wake (Fig. 10a). After the threshold is applied, the wake is presented as a continuous structure (Fig. 10b). In the example, the wake expansion is not detected 270 for x/D > 10 due to the wind speed recovery to the free flow near y/D ≈ 2. Nevertheless, the detected shape is suitable for the centerline detection. Assuming the outline of the shape as wake boundaries, we estimate the wake centerline using the following algorithm: 1. We start by drawing a circle of radius 1D around the wind turbine and mark points where the circle crosses the borders of the wake shape. If the circle happens to lie within the wake shape completely, the initial radius should be increased.

275
2. The midpoint of the arc inside the wake presumably indicates the wake direction and is stored as the centerline midpoint.
3. We increase the circle diameter and repeat steps 1−2 until the end of the wake shape is reached (Fig. 10c). In an irregular wake shape, the circular lines may cross the wake boundaries several times. For the initial step, it may be difficult to resolve the ambiguity, if the wind direction is unknown. Considering the near wake to be wide and continuous, we assign the midpoint of the longest arc inside the wake shape as the centerline point. We also assume that the wake does not 280 turn gradually further downstream. Therefore, the segment between the last known and unknown midpoint should turn by a relatively small angle compared to the previous segment. The new midpoint also has to lie within the wake shape.
Generally, this centerline-search method does not require knowledge of the wind direction. However, the aligned wakes subset (Fig. 8b) and, to a certain extent, bimodal subset (Fig. 8c) introduce ambiguity in a wake direction for the downstream wind turbine AV7. The ambiguity can be resolved either by applying the reference wind direction or starting the detection from 285 the upstream wind turbine AV10.
The estimated centerline is further used to calculate the wake direction. The wake from the wind turbine AV11 is usually weak and easily confused with the noise; we do not consider this wake in our analysis. We convert the coordinates of the centerline points for AV7 and AV10 wakes to the Cartesian system and subtract the respective wind turbine positions to get a set of the relative centerline coordinates. We assume a centered data set and add a point (0, 0) corresponding to the relative 290 wind turbine position. The composed data set is fitted with the linear regression, and the fitted line indicates the estimated wake direction.

Wake detection using the Gaussian method
The wake deficit distribution is similar to the Gaussian distribution in the far wake (Ainslie (1988)) and often shows a double Gaussian peak in the near wake (Magnusson (1999)). This feature makes a base for a widely used method to detect wake bound-295 aries and centerline (Vollmer et al. (2016); Krishnamurthy et al. (2017)). The method requires the data in a two-dimensional horizontal plane, which makes it versatile and practical to use for the wake detection. A small plane inclination can be allowed as long as the wake profile is not subjected to the wind shear effects.
A normalized wake deficit distribution is fitted with the Gaussian function: 300 where the amplitude A, mean value µ, and standard deviation σ are the parameters to fit; variable y is a coordinate across the rotor axis.
Assuming equal width of the two peaks, the double wake can be fitted with the symmetric double Gaussian function: where µ 1 and µ 2 denote positions of each peak (Aitken et al. (2014)).

305
Due to the lidar elevation angle, AV10 is scanned near the top tip and does not show a double wake. The scan resolution near AV7 is not always sufficient to resolve a pronounced double wake. Therefore, we fit only a single Gaussian function as in Eq. (7) and start fitting from 0.5D to avoid uncertainties caused by a weak double wake.
For a wake deficit distribution, the fitted single Gaussian function F (y) reaches maximum at y = µ, i.e., the estimated mean µ gives the wake center position across the rotor axis. The wake boundaries are defined through the mean value µ and standard 310 deviation σ as µ ± 2 ln 2σ so that the velocity deficit at wake boundaries is 5% of the velocity deficit at the wake center (Aitken et al. (2014)).
Unlike the centerline detection using image processing, the Gaussian method should be applied to the data extracted along the straight-line. We attempt fitting for the wake deficit distributions up to 15D downstream distance covering the length of most wakes in the lidar data set. 315 We run the Gaussian fit method in an automatic mode. The wake deficit distribution is extracted for a straight line perpendicular to a pre-defined search direction, usually the reference wind direction. The Gaussian fit algorithm thus requires knowledge of the wind direction before the fitting. The algorithm is also dependent on the accuracy of the wind direction measurements and the similarity between reference wind and actual wake direction. To reduce the influence of a possible discrepancy between wind and wake direction, we recalculate the search direction after enough points are accumulated. For example, the fitting starts 320 from at 0.5D with a step of 0.1D along the search direction. After the wake deficit at 2D is fitted, we fit the linear regression to the previously found center points and recalculate the search direction. The process is then repeated for each step along the wake.
It should be noted that a wake usually expands beyond rotor diameter size. Hence, it is important to have more data points in the wake deficit distribution to improve fitting results. At the same time, the extracted line should not be too long to exclude 325 disturbance from the other wakes, if present, or low wind-speed streaks.

Wake deficit threshold
In addition to the Gaussian fit, we apply a threshold based on the wake deficit criteria. The method assumes that the point belongs to the wake if the wind speed at it is less or equal to 95% of the free-flow wind speed (España et al. (2011)). Here we assign the reference wind speed as the free flow speed. The deficit-based threshold is presented as the wind-speed threshold. For

Manual wake detection
To evaluate the performance of the ATS method, we perform a manual segmentation to select an optimal threshold for each 335 lidar scan and use it as a reference. Since the available scans represent different wake-wake and wake-turbine interactions, the criteria for a reasonable threshold varies over the subsets. In order to reduce human error, we use the following qualitative criteria: 1. The shape of the wake should be distinguishable well enough not to be mistaken with noise.
2. The noise should be reduced near wind turbines AV7 and AV10 but is allowed near AV11 since its wake has low 340 importance in this study.
3. The shapes of wakes from AV7 and AV10 should not merge to ease the centerline detection.
We also perform a manual centerline detection. A presumed centerline is drawn over the lidar scan as a line or series of points. For further comparison with wake detection methods, it is converted to the Cartesian coordinates using a plot digitizer.
Unlike the manual detection of the wake shape, the manual centerline detection is more prone to errors, especially in the far-345 wake region, where the wake becomes less distinguishable from the free flow. Due to ambiguity and complexity of the manual detection, we select only few lidar scans to demonstrate the methods' performance.

Results
In this section, we first present a proof of concept ATS detection on less noisy LES data and compare it to the Gaussian method.
For the lidar data, we perform an extensive comparison to the manual wake detection and evaluate the accuracy of the ATS 350 method. We further compare the performance of the ATS and Gaussian methods and discuss the application of the ATS method in the centerline detection.

LES wake detection
Unlike a 10-minute averaged LES wake, the instantaneous wake reveals more complex spatial features to be detected, although its intensity histogram remains rather smooth (Fig. 11a). Figure 11b compares the centerline and wake shape detected by the 355 Gaussian and ATS methods. Overall, the Gaussian method performs well in the range of 2 < x/D < 10. Additional errors may occur in the far wake (x/D > 10), where the wake recovers to the free flow and the wake deficit function becomes too weak to fit accurately.
The ATS method detects a continuous structure in the near wake, while the far wake is represented as series of small disconnected structures (Fig. 11c). Nevertheless, those structures primarily lie within the wake detected by the Gaussian method.

360
The Gaussian centerline passes through the centers of the ATS-detected structures. A good agreement between methods can be explained by the fact that the ATS method searches for regions of high intensity, i.e., low wind speed. At the same time, the Gaussian method approximates a wake center at the point of high wake deficit, which also corresponds to low wind speed. While the centerline extraction using image processing works well with the continuous shapes, correct identification of the downstream disconnected shapes as a part of the wake requires further modification of the algorithm, that is not discussed in 365 this study.
We have used the LES data as a proof-of-concept to show the capability of the ATS method to detect wakes from idealized data. The further sections focus on the wake detection from the lidar data and challenges caused by less uniform free flow.

Comparison of the ATS detection against the manual detection on lidar data
We construct a confusion matrix to assess the performance of the methods for a single lidar scan. The confusion matrix 2×2 370 describes the comparison of the automatic thresholding methods (wake deficit or ATS) against the manual method and contains the following outcomes: -True Positive (TP) -the point is detected as a wake point by both methods.
-True Negative (TN) -the point is detected as a free-flow point by both methods.
-False Positive (FP) -the point is detected as a wake point by the automatic method but is a free-flow point in the manual 375 detection.
-False Negative (FN) -the point is detected as a free-flow point by the automatic method but is a wake point in the manual detection.
If the detection is accurate with the respect to the manual detection, TP and TN values tend to 100%, while FP and FN are close to zero.

380
The bimodal subset can be considered the most convenient for the manual threshold segmentation. It utilizes strict criteria for the manual threshold that the wake shapes should not merge (Fig. 12d). In the example, the ATS method sets the threshold higher compared to the manual detection (Fig. 12e). Hence the far-wake area is slightly reduced. The wake deficit method ( Fig. 12f) produces a similar result. The bimodal subset is the only one, where the performance of the deficit-based method is comparable with the manual detection and ATS method and all methods produce similar results.

385
The aligned wakes subset utilizes the same manual threshold criteria for the wake splitting as bimodal subset (Fig. 13), although it may be harder to fulfill. For some lidar scans, the far wake from the turbine AV10 and the near wake from AV7 cannot be separated, unless the threshold is increased so the far wake is not detected (Fig. 13d). In this case, detecting a general shape of the wakes is the priority. The manual threshold is then more subjective than that of the bimodal subset. The deficitbased method underestimates the threshold more significantly than in the bimodal case and produces a large percentage of false 390 positive detections (Fig. 13f).
The parallel wakes subset is the most challenging both for the manual detection and automatic methods (Fig. 14). The wind direction in the subset is close to 210 • , at which the crosswind effects start (Fig. 3) and noise appears at the border of a lidar scan. Unlike the corrupted scans with a high amount of non-physical wind speed values, the region near the wind turbines AV7 and AV10 contains valid measurements and allows to perform wake detection with relative success. However, the detection 395 accuracy declines due to the border noise; only one wake can be extracted well enough to perform analysis on the wake direction and shape evolution. If the threshold is increased to distinguish wakes and noise, the wake from AV10 remains nearly undetected as can be seen from Fig. 14d. The ATS method returns a lower threshold that improves distinguishing the shape of the AV10 wake but falsely detects noise near as a part of the AV7 wake (Fig. 14e). The deficit-based method significantly underestimates the threshold and generates too many false positives (Fig. 14f).    We plot the thresholds predicted by the wake deficit and ATS methods against the manual segmentation (Fig. 15). The deficit-based threshold is primarily used to detect wakes from LES or PIV data (España et al. (2011)). The lidar data has less uniform free flow and lower resolution. This results in a relatively poor correlation with the manual segmentation -the deficitbased threshold is always underestimated. Therefore, more points, including noise points, might be falsely detected as wake points. The deficit-based thresholds form two clusters in the threshold plot (Fig. 15a). The parallel and aligned wake subsets share similar behavior and return a threshold significantly lower than manual-detected threshold. In the bimodal case, the deficit-based threshold shows better agreement with the manual-detected threshold but not as good as the ATS-based threshold (Fig. 15b). Overall, the deficit-based method consistently underestimates the threshold, implying that the 5% criteria should be changed to improve the detection result. The ATS method performs on par with the manual detection. The strongest discrep-410 ancy between the manual and ATS methods can be seen for the aligned wakes subset, in which the outliers might be partially caused by the ambiguity of the manual detection criteria.
To reduce the influence of ambiguity of the manual detection, we construct a confusion matrix for each subset of 25 consecutive lidar scans instead of single scans. The corrupted scans are excluded from the comparison, since high noise prevented the manual detection for most of the scans. Table 2 summarizes the detection outcomes for each subset. The deficit-based method 415 often underestimates the threshold, therefore the share of true positive detections is nearly always 100%. However, the amount of true negatives indicates a high probability of identifying noise as a wake. Additionally, the percentage of true negative detections strongly fluctuates within the same type of the subset making the fixed threshold method unreliable. While the amount of true positives for the ATS method may drop to 80% for a complex subset, the amount of true negatives consistently stays near 95% -the background flow is mostly detected correctly, which is a significant improvement compared to the deficit-based 420 method. Compared to the manual detection, the ATS method does not always separate wake and noise correctly, particularly for the parallel wakes subset (Fig. 14) and thus requires additional filtering. For the aligned and bimodal subsets, the ATS method is capable to detect the general wake shape rather similar to the manual detection.

Centerline detection
We perform centerline detection using the ATS method as described in Sect. 4.2 and the Gaussian method as described in 425 Sect. 4.3. Both methods are compared against the manual wake detection from the lidar scan image as described in Sect. 4.5.
We compare the found centerlines by fitting the regression lines to the relative coordinates, so that each local coordinate system is centered at a selected wind turbine.
The parallel wakes subset (Fig. 16) contains a short but contrast wake from the wind turbine AV7 and a longer wake from the wind turbine AV10. Due to the fact that the AV10 wake is detected as series of small disconnected structures, the current 430 ATS method detects the centerline only in the near-wake region. The manual and Gaussian detection can be carried further into the far-wake region, but become rather uncertain as the far wake recovers to the free flow or mixes with the border noise.
The aligned wakes subset (Fig. 17) possesses a distinctive feature: the wakes are aligned along the line connecting two wind turbines resulting into the merge of the AV10 far wake and the AV7 near wake. Additionally, the connecting line is parallel to Y -axis in the Cartesian coordinates (Fig. 2), so the centerline tends to X = const when the wakes are perfectly aligned. Hence, 435 the coefficient of determination R 2 either approaches zero or negative and does not indicate the quality of the regression fit.
The bimodal subset (Fig. 18) has the longest wakes in the data set. The wake detection in the far wake (i.e. lidar near range) is hindered by wake merging and the narrowness of the scanned area. For example, the ATS method may underestimate the threshold and detect merging wakes as a single shape. The ATS-based threshold can be adjusted to guarantee the wake splitting. The adjustment is performed automatically by increasing the threshold with an increment of 0.05 until the stopping criteria -440 the wind turbines belong (or are located near) to different wake shapes -is reached.
The Gaussian method, in turn, may detect a wake center incorrectly because of high wake deficit in the neighboring wake.
The detection inaccuracy in the lidar near range is compensated by higher overall number of data points available for fitting, compared to other subsets.
Both methods perform best for the long wakes in a lidar scan with low noise and show a good agreement on the detected 445 centerlines. When the far-wake region cannot be detected properly, the methods still agree in the near wake. For noisy subsets, such as parallel wakes, the performance of all methods strongly relies on the data quality. The longer the near wake can be detected, the higher is the accuracy of the centerline detection. The ATS method requires additional algorithm to identify downstream disconnected structures as a part of the wake.  Figure 19 shows an example of wake detection performed on a lidar scan from the aligned wakes subset. The subset is characterized by the wake merging near AV7. The formed structure proves to be challenging for a Gaussian method, as the centerline point and far-wake width for AV10 are estimated incorrectly. The ATS method detects wakes as a single shape. Unlike the bimodal subset, the merging wakes in the aligned wake subset do not necessarily worsen the performance of the centerline detection method. The centerline is first detected for AV10 wake, 455 from which the wake direction can be estimated. Since the wakes are merged, the centerline detection for AV10 continues in the AV7 wake. The centerline search for AV7 starts at the corresponding turbine location and is performed in the direction of AV10 wake, thus excluding the merge region from the search. Thus the centerline of AV7 wake gets detected twice if no stopping criteria (e.g., the AV10 centerline passes AV7 location) is activated. Both detected centerlines agree in the AV7 wake region and follow the Gaussian centerline rather well. Near-border wake centers of the AV7 wake deviate from the presumed centerline because border noise is erroneously attributed as a part of the wake. Considering the problems the border noise may pose for the wake detection in less clean scans (Fig. 14), excluding the near-border sector of 1−2 • width from the detection should improve the outcome.

Wind and wake direction
The regression line fitted to the ATS-detected centerline also indicates the wake direction. A strong mismatch between reference 465 wind direction and wake direction can be seen for most lidar scans from the data set (Fig. 19b).
Comparing the directions for the whole data set, we observe a clear trend for the wake direction deviating clockwise from the reference wind direction (Fig. 20). The trend continues for the reference wind direction above 210 • for the corrupted lidar scans, where wake detection was still possible despite the erroneous data.
The valid points for the reference wind directions less than 210 • group into two distinct clusters (Fig. 20). The leftmost 470 cluster corresponds to the bimodal subset and lies within the range of wind directions of 140−170 • . Another cluster contains the results for the aligned, transitional, and parallel wakes subsets and covers the range of wind directions of 170−210 • . Fitting a linear regression to each group returns a similar slope but a different intercept value. Although the fitted line slope is not equal to one, the regression fit on the selected range shows a nearly constant offset between wind and wake direction with the bimodal subset having noticeably lower difference, than other subsets.

475
The vertical veer and clockwise rotation of the wake in the Northern hemisphere due to the Coriolis force are known effects causing wake rotation and were confirmed by in situ observations and LES studies of the wind farms (Magnusson and Smedman (1994); Abkar and Porté-Agel (2016); van der Laan and Sørensen (2017)). The wind turbine AV7, closest to the lidar, is scanned nearly at the hub height, while the farther wind turbines, AV10 and AV11, are scanned near the top-tip height (Fig. 1). Due to the elevation and vertical veer, the wind and wake direction discrepancy is the strongest for AV10 and AV11. Nevertheless, we also observe a deflection for the near wake of AV7, although the noticeable effects of the Coriolis force are usually recorded for the downwind distance of 6D or higher. The additional discrepancy can be explained by the yaw misalignment (Bromm et al. (2018)), reference measurements uncertainty (Gaumond et al. (2014)), the lidar installation's imperfection. The discrepancy for the bimodal subset could be possibly reduced because of the longer wakes and, consequently, more precise regression fit.
We do not have additional data to distinguish these factors and leave it for future study.

485
The outliers showing strong differences between wind and wake direction highlight the lidar scans where the wake detection was hindered by noise or strong irregularity of the wake shape. The wind-wake direction plot can be used for diagnostic purposes to select the lidar scans that require additional processing prior to wake detection.

Near and far wake separation with a threshold
Bimodal subsets have a distinctive double peak in the intensity histogram, which results into two local minimums in the second 490 derivative graph. The highest histogram peak corresponds to the free flow. The second peak forms due to a long far wake from AV10 and subsequent merging of the two wakes. Therefore, a bimodal subset provides a unique opportunity to use one of the two local minimums for the threshold estimation. Applying the ATS method for the first (lowest) local minimum returns a threshold to separate the full wake from the free flow. The other local minimum returns an ATS-based threshold that allows to detect only the near wake (Fig. 21a). According to our observations of the bimodal subsets, the ratio between near-wake and full-wake thresholds falls into the range of 1.3−1.5. The threshold multiplier can be applied to the aligned wakes subset but needs to be higher than in the bimodal case to allow a good separation of the near and far wake. For the parallel wakes subset, the near and far wake separation is complicated by noise, and the full-wake detection returns disconnected structures. Overall, we see a potential in using the ATS method to split the wake into parts for the further analysis.
We proposed an automatic thresholding method for the wake detection based on the image processing method for the whitecaps detection on the ocean surface. We also described an automatic method to detect the wake centerline from an irregular wake shape. The results showed that image processing techniques were also a viable solution for wake detection and did not strongly depend on the supplementary measurements, such as reference wind speed and direction. The preprocessing tools developed 505 alongside the wake detection method helped to recognize the lidar scans that are not suitable for the analysis. The same tools allowed to group the valid scans into subsets based on the expected features. Future plans include testing the preprocessing tools on a larger data set to completely remove dependency on the supplementary data and visual analysis for the filtering and classification.
The ATS method generally agreed well with the manual threshold selection. We also compared the ATS method against 510 the Gaussian fit method. Although the Gaussian method performed on the lidar scans not as good as on the LES data, the detections in the near-wake region showed a good agreement between the methods with the respect to the manual detection. At the same time, the accuracy of both Gaussian and ATS methods decreased in the far-wake region or in the case of wake-wake interaction. In the latter case, the ATS method often detected two wakes as a single shape. To perform a centerline detection, we adjusted a threshold to split the wakes. The separation of wake and false positives remained an open problem and left room 515 for improving the ATS method in the future studies.
Video supplement. The video https://doi.org/10.5446/54055 demonstrates wake shape detection results for all lidar scans in the data set. No post-processing is performed after running the ATS algorithm.
Appendix A: Image data processing Conversion of the raw data into the image erases the exact information on distances and wind speed values. Therefore, it is 520 preferable to use original lidar data for the wake detection. If the original data is not available anymore, the ATS method can still be run on the image.
An image has several properties which may affect the algorithm performance compared to the use of the raw wind speed data: 1. Image resolution in dots-per-inch (dpi): the resolution of 72 dpi transforms an original data point into an image pixel as 525 1-to-1 approximately. Higher resolution increases the number of pixels per data point. Lower resolution merges several data points into one pixel.
2. Colormap: the ATS method relies on the image grayscale intensity as an input. A non-grayscale image can be desaturated, but the colormap of the original image then should be sequential rather than perceptually uniform or diverging. For the latter, the conversion to the grayscale gradually reduces the contrast between high and low values making the detection impossible. Additionally, several grayscale colormaps exist. Depending on the colormap, the image intensity histogram may shift to the left or right compared to the raw data. We observed this effect when the 'Greys' colormap of the Python Matplotlib library was used. This colormap emphasizes light tones; as a result, the intensity histogram peak slightly shifts to the right, although the general shape of the peak is preserved (Fig. A1). The colormaps 'binary' or 'gray' from the same library return the result that follows the original data. Raw data Binary colormap Greys colormap Figure A1. Comparison of the intensity distribution in the original (raw) data and image plotted using Python Matplotlib with different grayscale colormaps.
3. The image intensity: as processed by Python, the values are rounded up to second digits and some are assigned to different bins compared to the original data. The histogram and CDF have stronger oscillations than the raw data ( Fig. A1) and require smoothing before the ATS method can be applied.
Running an automated threshold detection on the image raises another question: how much does the image resolution affect the detection accuracy compared to the raw data. We apply the ATS algorithm to raw and image data under different 540 resolutions: 72, 150, and 300 dpi. We observe little influence from the image resolution, except for few LES cases the low resolution of 72 dpi affected the threshold detection. In those cases, the detected threshold is lower than in fine resolution cases, and, therefore, a larger shape is identified as a wake. The image resolution of 150 dpi and above agrees well with the wake detection from the raw data. The general shape of an image intensity histogram does not depend on the image resolution.
The image resolution of 150 dpi or higher is recommended for use, although 72 dpi also produces good detection results.

545
In the case of lidar measurements, the detection from the image data can be performed in two ways: by plotting the original data either in polar or Cartesian coordinates. The detection from the polar coordinates image does not bear a significant difference from the raw data detection. After the conversion to the Cartesian coordinates, the lidar close and far range get distorted, affecting the histogram shape.
The effect is most pronounced when the wind blows towards the lidar. As described in the subsets overview in Sect. 3.1, this 550 wind direction and wake behavior result in the bimodal intensity histogram. The leftmost, high peak, contains points from the free flow, while the second, low, peak accumulates points from the far wake. The second peak gets smoothed when the input data is changed from the normalized wind speed to the grayscale image plotted in Cartesian coordinates. As can be seen from the comparison (Fig. A2), the lower peak corresponds to the data in the lidar's close range. After conversion to the Cartesian coordinates, the close range area shrinks significantly, while the free-flow area on the far lidar range enlarges. The transition 555 between coordinate systems changes the balance between wake and free-flow pixels and virtually increases the share of the latter.