Multi Agent Deep Deterministic Policy Gradient for Active Wake Control

Bachelor Thesis (2023)
Author(s)

G. van der Schaaf (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M.M. de Weerdt – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

G. Neustroev – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

P. Pawelczak – Graduation committee member

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2023
Language
English
Graduation Date
29-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Downloads counter
331
Collections
thesis
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In wind farms, turbines are often placed close to each other. Each turbine generates a turbulent wake field that negatively affects the turbines downstream of it, which can cost more than 12% in efficiency. To decrease this loss, we can steer the turbines away from the wind direction. This decreases the power output of the individual turbine but can increase the total power output of the farm. As the size of the farm increases, the number of possible actions increases exponentially, so a numerical solution is not feasible. Reinforcement learning has proven useful for this problem in the past, but a standard single-agent implementation is still very computationally expensive. We evaluate the effectiveness of MADDPG, a multi-agent reinforcement learning algorithm, on the active wake control problem. MADDPG is compared to the numerical solver FLORIS and to the already implemented and proven TD3 (a variation on the single-agent DDPG algorithm), using the eventual output power each method achieves. The results show that MADDPG does improve on the learning performance of TD3, but because MADDPG needs to manage more neural networks, its overhead is larger. MADDPG reaches an optimum solution in fewer training steps, but these steps take significantly more time.
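The two scaling arguments in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the thesis: each turbine's yaw angle is discretised into a fixed number of settings (the helper names and the value 5 are hypothetical), MADDPG is counted in its textbook form with one actor and one critic per agent plus a target copy of each, and TD3 is counted in its textbook form with one actor, two critics, and a target copy of each. The thesis implementation may differ.

```python
def joint_action_count(num_turbines: int, settings_per_turbine: int) -> int:
    """Number of joint yaw configurations for the whole farm.

    Assumes (hypothetically) that every turbine chooses independently
    from the same discrete set of yaw settings.
    """
    return settings_per_turbine ** num_turbines


def maddpg_network_count(num_agents: int) -> int:
    """Networks in textbook MADDPG: per agent one actor and one critic,
    each with a target copy (4 networks per agent)."""
    return 4 * num_agents


# Textbook TD3: one actor and two critics, each with a target copy.
TD3_NETWORK_COUNT = 6

if __name__ == "__main__":
    settings = 5  # hypothetical number of yaw settings per turbine
    for turbines in (2, 4, 8, 16):
        print(
            f"turbines={turbines:2d}  "
            f"joint actions={joint_action_count(turbines, settings)}  "
            f"MADDPG networks={maddpg_network_count(turbines)}  "
            f"TD3 networks={TD3_NETWORK_COUNT}"
        )
```

Under these assumptions the joint action count grows exponentially with the number of turbines, while the number of networks MADDPG maintains grows only linearly. It still exceeds the fixed count for TD3 once the farm has more than one turbine, which is consistent with the larger per-step overhead reported above.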
