Multi Agent Deep Deterministic Policy Gradient for Active Wake Control

Bachelor Thesis (2023)

Author(s)

G. van der Schaaf (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M.M. de Weerdt – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

G. Neustroev – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

P. Pawelczak – Graduation committee member

Faculty

Electrical Engineering, Mathematics and Computer Science

Reinforcement Learning, Active Wake Control, Multi Agent Reinforcement Learning, Deep Learning, MARL

To reference this document use

https://resolver.tudelft.nl/uuid:0915d6b9-f8a0-4097-bfa9-490cbe5e942b

More Info

Publication Year

2023

Language

English

Graduation Date

29-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Collections

thesis

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In wind farms, turbines are often placed close to each other. Each turbine generates a turbulent wake field that negatively affects the turbines behind it, which can cost more than 12% in efficiency. To decrease this loss, we can steer the turbines away from the wind direction: this decreases the power output of the individual turbine, but can increase the total power output of the farm. As the size of the farm increases, the number of possible actions increases exponentially, so a numerical solution is not feasible. Reinforcement learning has proven useful for this problem in the past, but a standard single-agent implementation is still computationally very expensive. We evaluate the effectiveness of MADDPG, a multi-agent reinforcement learning algorithm, on the active wake control problem. MADDPG is compared to the numerical solver FLORIS and to the already implemented and proven TD3 (a variation on the single-agent DDPG algorithm), based on the final power output each method achieves. The results show that MADDPG does improve on the learning performance of TD3, but because MADDPG needs to manage more neural networks, its overhead is larger. MADDPG reaches an optimal solution in fewer training steps, but these steps take significantly more time.
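
The exponential growth mentioned in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each turbine chooses from a fixed number of discrete steering settings; this discretization, the function name joint_action_count and the value of 10 settings are hypothetical and are not taken from the thesis.

```python
def joint_action_count(num_turbines: int, settings_per_turbine: int) -> int:
    """Count joint configurations when every turbine independently picks
    one of `settings_per_turbine` discrete steering settings.

    The discretization is an illustrative assumption, not the thesis setup.
    """
    if num_turbines < 0 or settings_per_turbine < 1:
        raise ValueError("need num_turbines >= 0 and settings_per_turbine >= 1")
    # Independent choices multiply, so the count is settings ** turbines.
    return settings_per_turbine ** num_turbines


# Hypothetical farm sizes with 10 settings per turbine.
for n in (2, 4, 8, 16):
    print(f"{n} turbines: {joint_action_count(n, 10)} joint actions")
```

Under this assumption, doubling the number of turbines squares the number of joint actions, which is why exhaustively searching all configurations quickly becomes impractical for larger farms.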

Files

CSE3000_Research_Paper_Guus_va... (pdf)

(pdf | 0.282 Mb)

License info not available