A Hybrid Reinforcement Learning and Tree Search Approach for Network Topology Control

Master Thesis (2023)
Author(s)

G.J. Meppelink (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J.L. Cremer – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

A. Rajaei – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2023
Language
English
Graduation Date
22-12-2023
Awarding Institution
Delft University of Technology, Norwegian University of Science and Technology (NTNU)
Programme
Electrical Engineering, European Wind Energy Masters (EWEM), Rotor Design Track
Downloads counter
262
Collections
thesis
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The growing demand for electricity, driven by the widespread adoption of heat pumps, electric vehicles, and industrial electrification, strains power grids and makes a reliable and secure supply harder to guarantee as intermittent renewable energy is integrated. Network topology control offers flexibility by altering grid connections to redirect power flows and mitigate transmission line overloads. This thesis investigates a machine learning (ML) and artificial intelligence (AI) approach to overcome the computational complexity of topology control. The proposed approach combines a curriculum-trained ML agent with Monte Carlo Tree Search (MCTS) to improve the security of power network actions. A curriculum-based ML approach is used to pre-train an agent to propose grid actions. MCTS then guides the simulation of these candidate actions, taking future outcomes into account to identify actions with better long-term performance. The MCTS-verified, simulation-tested actions are fed back into the training algorithm as immediate feedback, which removes the need to wait for scenario completion and thereby improves sample efficiency and reduces training times. An electrical-distance-guided search in the MCTS improves convergence by prioritising actions closer to overflows, which are often found to be the most influential in reducing violations.
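
As a rough illustration of the distance-guided search mentioned in the abstract, the sketch below shows one way a tree search could favour actions near an overload. It is not taken from the thesis: the exponential distance weighting, the PUCT-style selection rule, and all names (`distance_weighted_priors`, `select_action`, `scale`, `c_explore`, the action labels) are assumptions chosen for illustration only.

```python
import math


def distance_weighted_priors(priors, distances, scale=1.0):
    """Blend agent priors with a distance penalty and renormalise.

    priors:    dict action -> probability proposed by a (hypothetical) agent
    distances: dict action -> electrical distance from the action's
               substation to the overloaded line (assumed non-negative)
    scale:     assumed decay length; larger values flatten the bias
    """
    raw = {a: p * math.exp(-distances[a] / scale) for a, p in priors.items()}
    total = sum(raw.values())
    return {a: w / total for a, w in raw.items()}


def select_action(stats, priors, c_explore=1.0):
    """Pick the child with the highest PUCT-style score.

    stats: dict action -> (visit_count, mean_value)
    """
    parent_visits = sum(n for n, _ in stats.values())

    def score(action):
        n, q = stats[action]
        # Exploitation term plus a prior-weighted exploration bonus.
        bonus = c_explore * priors[action] * math.sqrt(parent_visits + 1) / (1 + n)
        return q + bonus

    return max(stats, key=score)


# Two hypothetical actions with equal agent priors; one is electrically
# much closer to the overloaded line than the other.
priors = distance_weighted_priors(
    {"near_sub": 0.5, "far_sub": 0.5},
    {"near_sub": 0.2, "far_sub": 2.0},
)
stats = {"near_sub": (0, 0.0), "far_sub": (0, 0.0)}
chosen = select_action(stats, priors)
```

With no visits recorded yet, the score reduces to the re-weighted prior, so the search expands the action nearer the overload first. As visit counts and simulated values accumulate, the exploration bonus shrinks and the simulated outcomes take over.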

Files

License info not available