Mean Field Multi Agent Reinforcement Learning for Active Wake Control

Bachelor Thesis (2023)

Author(s)

I. Plămădeală (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

G. Neustroev – Mentor (TU Delft - Algorithmics)

MM de Weerdt – Mentor (TU Delft - Algorithmics)

P Przemysław – Graduation committee member (TU Delft - Embedded Systems)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Active Wake Control, Mean-Field Multi Agent Reinforcement Learning, Reward function, Observation view

To reference this document use:

https://resolver.tudelft.nl/uuid:5cf1f1a6-e306-4c1c-9a9b-f4c2de104065

Publication Year

2023

Language

English

Graduation Date

30-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The wake effect, the turbulence created behind a wind turbine as it extracts energy from the wind, reduces the power output of downstream turbines. Active Wake Control can mitigate this effect by rotating some turbines away from the wind. Previous research applied single-agent reinforcement learning to Active Wake Control and showed good results for small-scale layouts, but these results do not scale to larger, practical wind farms.
To that end, this study focuses on the application of mean-field multi-agent reinforcement learning to Active Wake Control under constant wind conditions. This algorithm restricts the computations to a small set of neighbouring turbines, which reduces their complexity. In support of this goal, I also study:
1. how to model the rewards to solve the lazy-agent problem, leveraging the nature of Active Wake Control
2. how the view of the agent changes the results
3. how the algorithm compares to a single-agent reinforcement learning algorithm, TD3
The experiments were run in the Floris Wake Simulator, with all turbines sharing the same agent and placed in tunnel layouts at real-life distances (6-7 rotor diameters), under constant wind conditions.
Results show that, with the proper configuration of rewards and view space in the tunnel layouts, the mean-field algorithm finds near-optimal configurations for Active Wake Control within a small number of episodes. This is a promising start for applying mean-field multi-agent algorithms to the Active Wake Control problem, and it provides insight into how to model the rewards, which might be applicable to the whole class of algorithms.
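The neighbourhood idea behind the mean-field approach can be illustrated with a minimal sketch. It is not taken from the thesis: the function name, the choice of the k nearest turbines as the neighbourhood, and the use of yaw angles in degrees as actions are all assumptions made for illustration. The sketch only shows how each turbine's view of the other agents can be summarised by the mean action of a few neighbours instead of the joint action of the whole farm.

```python
import math

def mean_neighbour_actions(positions, actions, k):
    """Return, for each turbine, the mean action of its k nearest neighbours.

    Hypothetical helper: the thesis may define neighbourhoods differently.
    Assumes k >= 1 and at least two turbines.
    """
    means = []
    for i, p in enumerate(positions):
        # Distance from turbine i to every other turbine, paired with its index.
        others = [(math.dist(p, q), j) for j, q in enumerate(positions) if j != i]
        nearest = sorted(others)[:k]
        means.append(sum(actions[j] for _, j in nearest) / len(nearest))
    return means

# Four turbines in a row, spaced 6-7 rotor diameters apart (illustrative values).
positions = [(0.0, 0.0), (6.0, 0.0), (13.0, 0.0), (19.0, 0.0)]
yaw = [20.0, 10.0, 5.0, 0.0]  # assumed yaw offsets in degrees
mean_yaw = mean_neighbour_actions(positions, yaw, k=2)
```

In a mean-field method, each agent's value estimate would then depend on its own action and this neighbourhood mean, so the size of the input stays fixed as the number of turbines grows.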

Files

Plamadeala_thesis.pdf

(pdf | 0.46 Mb)

License info not available