Mean Field Multi Agent Reinforcement Learning for Active Wake Control

Bachelor Thesis (2023)
Authors

I. Plămădeală (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Supervisors

Greg Neustroev (TU Delft - Algorithmics)

MM de Weerdt (TU Delft - Algorithmics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Ion Plămădeală
More Info
Publication Year
2023
Language
English
Graduation Date
30-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The wake effect, the turbulence created behind a wind turbine when it extracts energy, negatively impacts the power output of downstream turbines. Active Wake Control can mitigate this effect by rotating some turbines away from the wind. Previous research applied single-agent reinforcement learning to Active Wake Control, showing good results for small-scale layouts, but these approaches do not scale to larger, practical wind farms.
To that end, this study focuses on applying mean-field multi-agent reinforcement learning to Active Wake Control under constant wind conditions. This algorithm restricts each agent's computations to a limited set of neighbouring turbines, which reduces their complexity. In support of this goal, I also study:
1. how to model the rewards to solve the lazy-agent problem, leveraging the nature of Active Wake Control
2. how the view of the agent changes the results
3. how the algorithm compares to a single-agent reinforcement learning algorithm, TD3
The experiments were run in the Floris Wake Simulator, with all turbines sharing the same agent and placed in tunnel layouts at real-life distances (6-7 rotor diameters), under constant wind conditions.
Results show that, with a proper configuration of rewards and view space within tunnel layouts, the mean-field algorithm finds near-optimal configurations for Active Wake Control within a small number of episodes. This is a promising start for applying mean-field multi-agent algorithms to the Active Wake Control problem, and it provides insight into how to model the rewards, which might be applicable to the whole class of algorithms.
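
The neighbourhood idea behind the mean-field approach can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the `radius` parameter, and the example yaw values are all hypothetical, and it assumes turbines sit in a single row (a tunnel layout) where each turbine only looks at the turbines within a fixed number of positions. Each agent then summarises its neighbours by their average action instead of reasoning about every other turbine individually.

```python
from typing import List, Sequence


def neighbour_indices(i: int, n_turbines: int, radius: int) -> List[int]:
    """Indices of turbines within `radius` positions of turbine i in one row.

    Hypothetical helper: the thesis may define an agent's view differently.
    """
    lo = max(0, i - radius)
    hi = min(n_turbines - 1, i + radius)
    return [j for j in range(lo, hi + 1) if j != i]


def mean_neighbour_action(actions: Sequence[float], i: int, radius: int) -> float:
    """Average action (e.g. yaw offset in degrees) of turbine i's neighbours.

    Returns 0.0 when the turbine has no neighbours (an assumption made
    here only to keep the sketch well defined).
    """
    idx = neighbour_indices(i, len(actions), radius)
    if not idx:
        return 0.0
    return sum(actions[j] for j in idx) / len(idx)


# Illustrative yaw offsets for four turbines in a row (made-up values).
yaw_actions = [20.0, 10.0, 0.0, 0.0]
mean_actions = [
    mean_neighbour_action(yaw_actions, i, radius=1)
    for i in range(len(yaw_actions))
]
print(mean_actions)
```

With a fixed `radius`, the work per turbine depends on the neighbourhood size rather than on the total number of turbines, which is the scaling property the abstract refers to.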

Files

Plamadeala_thesis.pdf
(pdf | 0.46 Mb)
License info not available