Wind turbine wakes cause significant reductions in power production and increased fatigue damage for downwind turbines, and thus affect the levelized cost of wind energy. Computational Fluid Dynamics (CFD) can be used to quantify the wake characteristics, and Reynolds-averaged Navier-Stokes (RANS) simulation has the most potential for industrial applications because of its relatively low computational cost. However, RANS models all turbulence scales, usually with the linear κ-ε turbulence model, which has significant shortcomings in representing the turbulence characteristics of wind turbine wakes. This results in an underprediction of the wake deficit. Key reasons for these shortcomings are that the eddy viscosity assumption is not valid in the near wake and that the anisotropic Reynolds stresses are not properly modeled. In addition, the direct effects of the turbine forcing are not incorporated in the transport equations.

To address these shortcomings, machine learning can be used to enhance the turbulence model with data-driven corrections. Recent work on fundamental 2D flow cases showed that an algorithm referred to as SpaRTA (Sparse Regression of Turbulent Stress Anisotropy) can discover sparse algebraic turbulence model corrections, which can lead to improved mean-flow fields when trained on high-fidelity data. SpaRTA has two disadvantages, however: it can only cope with a limited input feature set, and its models have difficulty generalizing to multiple flow regions simultaneously (e.g. the free-stream and wake regions).
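The idea of discovering a sparse algebraic correction can be illustrated with a generic L1-regularized regression over a small library of candidate terms. The sketch below is an assumption-laden stand-in and not the SpaRTA procedure itself: the function names, the synthetic data, and the penalty weight `alpha` are all hypothetical, and the four columns merely play the role of candidate functions, of which only two truly contribute to the target.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/2n)*||y - X w||^2 + alpha*||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    # Mean squared value of each candidate column (assumed non-zero).
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Prediction from all candidates except the j-th one.
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            w[j] = soft_threshold(rho / n, alpha) / col_sq[j]
    return w


def make_synthetic_data(n=200, seed=0):
    """Hypothetical data: the target depends on candidates 0 and 2 only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[2] + rng.gauss(0.0, 0.01) for row in X]
    return X, y


X, y = make_synthetic_data()
weights = lasso_coordinate_descent(X, y, alpha=0.1)
print([round(w, 3) for w in weights])
```

The L1 penalty drives the coefficients of irrelevant candidates to exactly zero, which is what makes the resulting expression short enough to read as an algebraic model. The surviving coefficients are shrunk by the penalty, so a second, unpenalized or weakly penalized fit on the selected terms is a common follow-up.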

To help resolve these disadvantages, mutual information, a measure from information theory that quantifies the general dependency between variables, is used to measure a priori the importance of a large number of features to the turbulence model corrections. The most important features are then used to construct the correction models. In addition, to improve the model predictions in the turbine wakes, only the data samples located in the wake regions are used for training, and the free-stream data are discarded. Because these data are discarded, it cannot be guaranteed that the correction models fit the trends in the free-stream. The correction models must therefore be neutralized there by a newly constructed sparse algebraic logistic regression model, which distinguishes the wake from the free-stream region. The data used in this research consist of three time-averaged Large Eddy Simulation (LES) cases with multiple turbines at wind tunnel scale under neutral conditions.
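As an illustration of ranking features by mutual information, the following minimal sketch uses a simple histogram estimator. The estimator choice, the bin count, the feature names `f0` and `f1`, and the synthetic target are all assumptions made for this example; they are not taken from the thesis. The target depends on `f0` only through its square, a dependency that a linear correlation coefficient would largely miss.

```python
import math
import random
from collections import Counter


def mutual_information(x, y, bins=10):
    """Histogram estimate of the mutual information I(X;Y) in nats."""
    def digitize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = digitize(x), digitize(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum p(i,j) * log( p(i,j) / (p(i) * p(j)) ) over occupied bins.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def rank_features(features, target, bins=10):
    """Return feature names sorted from most to least informative."""
    scores = {
        name: mutual_information(vals, target, bins)
        for name, vals in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores


rng = random.Random(0)
f0 = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
f1 = [rng.uniform(-1.0, 1.0) for _ in range(2000)]  # unrelated feature
target = [v ** 2 for v in f0]                       # nonlinear in f0

order, scores = rank_features({"f0": f0, "f1": f1}, target)
print(order, scores)
```

Histogram estimates carry a small positive bias for independent variables, so in practice the scores are best compared against each other rather than against zero.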

This thesis shows that mutual information can detect most of the essential features, which leads to a good match between the model predictions and the corrections derived from high-fidelity data. Discarding the free-stream samples during model training further reduces the error in the wake region, in both the mean-squared and the maximum-squared error of the correction terms. Implementing the constructed algebraic models in CFD yields significant improvements in the mean-flow fields compared to the linear κ-ε turbulence model. Nevertheless, there remains room for improvement and further research: although the mean-flow fields closely match the high-fidelity data in the near wake, a discrepancy remains in the far wake.