Comment on wes-2020-134

The preprint “Seasonal effects in the long-term correction of short-term wind measurements using reanalysis data” by Alexander Basse et al. deals with Measure-Correlate-Predict (MCP) methods in the context of shorter-than-standard measurement periods (3 months instead of 1 year). The long-term correction of in-situ measurements is an important step of any wind resource assessment for developing a wind farm, so the topic is very relevant for the wind energy industry. In particular, the focus on 3-month measurements could help make better use of preliminary or incomplete measurement campaigns. Among the variety of MCP methods, the authors investigate two: the Variance Ratio and the Linear Regression with residuals. This choice is relevant as they are amongst the most widely used MCP methods. The question is addressed on a theoretical level and using a large set of wind-speed measurement data (18 sites) together with a set of 6 relevant reanalyses as long-term reference data. Despite these qualities, some flaws in the theoretical analysis impair the argumentation and conclusions. The major issue is that the role of the correlation between measurements and references is neither sufficiently considered nor investigated. The measurement data set is also under-exploited, as site differences (arising from various terrains, various correlations to the reference data, or various measurement years) are not taken into account when analysing the results. Sections 4 and 5 are too long and difficult to follow for little conclusion; they could be written much more concisely. The reading is also impaired by grammatical mistakes and awkward phrasings.


§1 Introduction
L25: I don't think that a "combination of observations and numerical models" is a fair definition of reanalysis.

§2 Data
L80 "an entire year": could you specify (in Table 1 or elsewhere) which year is available at each site? Are they all from the same year or from various years? You should also add some information about the average wind speed at each site and the correlation coefficient with each reference data set. All of this is important for understanding the results.
L93: MERRA-2 is also available as instantaneous values (at hourly resolution). Why not use this data set instead of the time-averaged one?

§3 Methodology
L128: you should state more clearly in this paragraph that, in this paper, ST = 3 months and LT = 1 year, while, usually, ST = 1 year and LT = 10-20 years.
You should consider adding a graph or diagram to better explain your notations and the connection between the various series (a. finding the correlation between u_meas and u_ref; b. predicting U_corr,LR/VR from U_ref; c. verifying U_corr against U_meas; ...).

§3.3
This section should contain the actual definitions (formulas) of the scores. In fact, Err_mean and Err_var are never defined in the manuscript (except that Err_mean is given in L250-251, but called "theoretical" there; why?). Moreover, the figures are given in %: % of what? How do you normalize?
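For concreteness, the definitions I would expect are something like the following sketch. The normalization by the long-term measured mean and variance is my own assumption, which is precisely what the authors should state explicitly:

```python
import numpy as np

def err_mean(u_corr, u_meas):
    """Relative error of the long-term mean, in %.
    Normalization by mean(u_meas) is assumed, not stated in the manuscript."""
    return 100.0 * (np.mean(u_corr) - np.mean(u_meas)) / np.mean(u_meas)

def err_var(u_corr, u_meas):
    """Relative error of the long-term variance, in %.
    Normalization by var(u_meas) is assumed, not stated in the manuscript."""
    return 100.0 * (np.var(u_corr) - np.var(u_meas)) / np.var(u_meas)
```

Whether the authors normalize this way, or e.g. by the reference statistics, changes how the scores compare across sites, which is why the formulas matter.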
[major issue] You apply the MCP methods to all sites and periods with all references. Is the correlation between reference and measurements high enough in all cases? Linear MCP methods should not be used if the correlation is low. If some combinations of [site, period, reanalysis] do not meet this criterion, they should be removed from the results (with a sensitivity analysis on the correlation threshold).
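The screening I have in mind could be as simple as the following sketch (the function name and data layout are hypothetical, chosen only for illustration):

```python
import numpy as np

def screen_by_correlation(pairs, thresholds=(0.6, 0.7, 0.8)):
    """For each correlation threshold, keep only the [site, period, reanalysis]
    combinations whose concurrent Pearson correlation reaches it.
    `pairs` maps a (site, period, reanalysis) key to (u_meas, u_ref) arrays."""
    kept = {}
    for thr in thresholds:
        kept[thr] = [key for key, (u_meas, u_ref) in pairs.items()
                     if np.corrcoef(u_meas, u_ref)[0, 1] >= thr]
    return kept
```

Reporting the results for several thresholds would show whether the conclusions are robust to the choice of cut-off.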
All results are shown averaged over all sites. First, it is not always clear how you average over the sites; this should be detailed in the methodology section.
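A minimal way to report the across-site spread alongside the average, sketched here with an assumed 10-90 percentile interval (the choice of interval is mine):

```python
import numpy as np

def mean_with_site_interval(scores):
    """`scores`: one score per site, shape (n_sites,).
    Returns the across-site mean plus a 10-90 percentile interval,
    so figures can show spread instead of a bare average."""
    scores = np.asarray(scores, dtype=float)
    return (scores.mean(),
            np.percentile(scores, 10),
            np.percentile(scores, 90))
```

Plotting such intervals around each mean curve would immediately show whether the differences between reanalyses are significant compared to the site-to-site variability.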
[major issue] Then, what about the variability across sites? You address it only at the very end of the manuscript (and not in a satisfactory way). In the first figures, we have no idea whether the "average" behaviour is representative of all sites, nor whether the differences between references are significant. There should be some intervals associated with the mean curves (see also the previous comment on the possible need to remove some sites).

§4 Theoretical considerations
§4.1
This section confuses the energy density (the energy available in the wind, proportional to u³) [let's call it ED] with the power production, i.e. the output of a wind turbine (not proportional to u³ because of the shape of the power curve) [let's call it PP].
L224 "power in wind P": this is ED, not PP. And since PP (what we want) is not proportional to u³, I do not find this analysis very relevant. Using u³ puts far too much weight on high wind speeds that ultimately produce no power (above cut-out) or only the nominal power.
§4.2
Eq 13: I do not find this decomposition so relevant. It is difficult to interpret because it is a difference, not a sum, of 2 terms.
§4.3
L282-284: I am not sure that this is the right conclusion. What matters is that the ratio of ST variance to LT variance should be similar in the reference and in the measurement. So, if a reference is always under- or over-dispersive, it could be OK (regarding this particular aspect). Also note how this relates to the parameter d_var introduced later in §5.3.
L290-292: You did not explain how the residual distribution is fitted, i.e. how σ_ε relates to the other properties (and I could not access the reference paper). Hence some questions: Is it possible to really distinguish between effects 2 and 3? Isn't effect 3 important and worth investigating? You never mention it again.

§5 Experimental results
§5.1 to §5.4.1
L313: it is unclear why you introduce this particular parameter d_mean, which is not directly related to the previous theoretical considerations. In fact, you could have linked d_mean to Err_mean. Going further from Eq (13), you would see that:
Err_mean = mean(U_corr) − mean(U_meas) = −β1 · mean(U_ref) · d_mean + …
This explains what you see in the experimental section.
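This link is straightforward to verify numerically. In the sketch below (synthetic data; d_mean is assumed to be the relative deviation of the ST reference mean from the LT reference mean, which should be checked against the authors' definition), the "+ …" term is the seasonal bias of the measurement itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic long-term (LT) reference and measurement series,
# with a 3-month short-term (ST) concurrent window.
n_lt, n_st = 8760, 2190
u_ref = rng.weibull(2.0, n_lt) * 8.0
u_meas = 0.9 * u_ref + rng.normal(0.0, 1.0, n_lt)
u_ref_st, u_meas_st = u_ref[:n_st], u_meas[:n_st]

# Ordinary least-squares fit on the concurrent ST window (the LR method).
beta1, beta0 = np.polyfit(u_ref_st, u_meas_st, 1)
u_corr = beta0 + beta1 * u_ref          # prediction over the LT period

# Err_mean and its decomposition via d_mean (assumed definition).
err_mean = np.mean(u_corr) - np.mean(u_meas)
d_mean = (np.mean(u_ref_st) - np.mean(u_ref)) / np.mean(u_ref)
# The "+ ..." term: seasonal bias of the measurement itself.
seasonal_bias = np.mean(u_meas_st) - np.mean(u_meas)
identity = -beta1 * np.mean(u_ref) * d_mean + seasonal_bias

assert abs(err_mean - identity) < 1e-8
```

The identity holds exactly for OLS because the fitted line passes through the ST means; making this explicit in §5.1 would tie d_mean back to the theory of §4.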
L330-341 (§5.2): it feels like you are discovering that β1,VR > β1,LR, while this is central and should be obvious from the start.
L345: again, it is unclear why you introduce this particular parameter d_var, and how it relates to the theory in §4.3.
L381: of course the differences between LR and VR arise from β1. This should be stated much earlier and investigated thoroughly. Considering that β1,LR = r · β1,VR, the correlation coefficient is very important and should be considered well before §5.4.2. You should also try to understand how the differences between seasons and sites originate from the factors of β1:
- σ_meas/σ_ref, i.e. whether the reference is under- or over-dispersive: how does this vary spatially, temporally and among reanalyses?
- r [for LR only]: how does the correlation vary across the sites and across the year? Why is the correlation lower in spring? Is this a general conclusion, or is it linked to one or some particular year(s)? (Are all the measurements from the same year or from very different years? Cf. the Data section.)
L406-431: some literature review that might fit better in the introduction?

§5.4.3
For the "ED" score: why not use mean(u³) directly? And what is the relevance for the industry (cf. §4.1)?
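With the standard definitions (β1,VR = σ_meas/σ_ref for the Variance Ratio method, and β1,LR the OLS slope), the relation between the two slopes and the role of r can be checked on any sample, as this sketch with synthetic data shows:

```python
import numpy as np

rng = np.random.default_rng(1)
u_ref = rng.weibull(2.0, 2190) * 8.0
u_meas = 0.85 * u_ref + rng.normal(0.0, 1.5, u_ref.size)

r = np.corrcoef(u_ref, u_meas)[0, 1]
beta1_vr = np.std(u_meas) / np.std(u_ref)    # Variance Ratio slope
beta1_lr = np.polyfit(u_ref, u_meas, 1)[0]   # OLS (Linear Regression) slope

# With these definitions, beta1_LR = r * beta1_VR, so r alone
# controls the gap between the two methods.
assert abs(beta1_lr - r * beta1_vr) < 1e-8
```

This is why screening by correlation (and reporting r per site, season and reanalysis) belongs near the start of the results, not in §5.4.2.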
The results of the "turbine" score may depend a lot on the choice of turbine (especially its rated wind speed, here 14 m/s, and, to a smaller extent, its cut-in wind speed). Did you conduct any sensitivity analysis? If not, you should try several power curves with different rated wind speeds.

§6 Conclusions
L541: "Short-term wind measurements are recommended to be conducted in periods of representative wind conditions", but the whole point of an MCP method is to correct for the fact that the ST period is not representative of the LT! In practice, how would you know that a ST period is representative enough? Given the very high inter-annual variability in wind speed, even if the ST period is in spring or fall, it does not seem that you could guarantee it would be close to the LT mean, could you?
Have you considered the possibility of non-contiguous 3-month periods? For example if a LIDAR is moved around 3 or 4 sites, changing place every month?
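As a concrete illustration of the sensitivity check suggested for the §5.4.3 "turbine" score: a generic, entirely hypothetical power curve parameterized by its rated wind speed (cubic ramp between cut-in and rated, flat until cut-out, normalized to rated power = 1) already shows how much the score moves with that one parameter.

```python
import numpy as np

def power_curve(u, cut_in=3.0, rated=14.0, cut_out=25.0):
    """Generic (hypothetical) power curve, normalized to rated power = 1.
    Cubic ramp between cut-in and rated, flat until cut-out, zero outside."""
    u = np.asarray(u, dtype=float)
    p = np.clip((u**3 - cut_in**3) / (rated**3 - cut_in**3), 0.0, 1.0)
    p[(u < cut_in) | (u >= cut_out)] = 0.0
    return p

# Sensitivity of the mean normalized power to the rated wind speed.
rng = np.random.default_rng(2)
u = rng.weibull(2.0, 8760) * 8.0
for rated in (10.0, 12.0, 14.0, 16.0):
    print(rated, power_curve(u, rated=rated).mean())
```

Running the long-term correction through a few such curves would show whether the method ranking in §5.4.3 is robust to the turbine choice.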

Technical corrections
Maybe use different notations for residuals in Eqs 1 and 6.