Distributed Load Frequency Control via Integrated Model Predictive Control and Reinforcement Learning Under Increasing Levels of Uncertainties

Master Thesis (2025)
Author(s)

N.J. van der Strate (TU Delft - Mechanical Engineering)

Contributor(s)

Bart De Schutter – Graduation committee member (TU Delft - Delft Center for Systems and Control)

Samuel Mallick – Mentor (TU Delft - Team Bart De Schutter)

Faculty
Mechanical Engineering
Publication Year
2025
Language
English
Graduation Date
27-03-2025
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Systems and Control
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The growing penetration of renewable energy sources (RES) in power networks introduces significant challenges for load frequency control (LFC). Uncertainty in power generation makes load balancing difficult, leading to frequency fluctuations that can cause equipment damage and blackouts. Additionally, the large-scale, spatially distributed nature of modern power systems necessitates a multi-agent control approach. Traditional PID-based controllers are ill-equipped to handle the uncertainties introduced by RES, while stochastic and robust model predictive control (MPC) methods, though capable of addressing small uncertainties, are often overly conservative. Similarly, reinforcement learning (RL) offers adaptability but lacks interpretability and explicit constraint handling. This thesis presents a distributed control framework that integrates MPC and RL to address these challenges. Parametric uncertainties are incorporated into the system dynamics to account for the stochasticity introduced by RES. At the core of the approach is a parameterized MPC scheme that approximates the RL value function, enabling the system to learn to avoid constraint violations while optimizing performance by driving state deviations from nominal operating conditions to zero. A distributed Q-learning scheme is used to learn the parametrization; this reduces the need for extensive information sharing, which enhances cybersecurity, and enables learning even with imperfect initial knowledge of the system dynamics. The proposed framework is evaluated in simulations of a three-area power network and compared against stochastic MPC and a deep deterministic policy gradient (DDPG) RL method. Results show that the proposed approach offers a balance between adaptability, performance, and interpretability, and successfully handles constraints. It outperforms sample-based stochastic MPC in cost, computation time, and constraint handling, and outperforms DDPG in performance, constraint handling, and sample efficiency.
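To make the core idea concrete, the sketch below is a heavily simplified, illustrative Python example, not the thesis implementation: a one-step "MPC" cost with a single learnable terminal weight stands in for the horizon-N parameterized MPC, a grid search stands in for the numerical MPC solver, and all dynamics, weights, and learning-rate values are assumptions chosen for illustration.

```python
import numpy as np

# Toy illustration (hypothetical values throughout): a scalar linear
# system x+ = a*x + b*u + noise, regulated toward the origin.
a_true, b_true = 0.9, 0.5   # "real" plant, unknown to the learner
a_hat, b_hat = 0.8, 0.6     # imperfect initial model, as in the thesis setting
theta = np.array([1.0])     # learnable MPC parameter: terminal-cost weight

def q_value(x, u, theta):
    """One-step MPC-style Q-function: stage cost + parameterized terminal cost."""
    x_pred = a_hat * x + b_hat * u          # model-based one-step prediction
    return x**2 + 0.1 * u**2 + theta[0] * x_pred**2

def greedy_input(x, theta, grid=np.linspace(-2.0, 2.0, 201)):
    """Minimize Q over a coarse input grid (stands in for solving the MPC)."""
    return grid[np.argmin([q_value(x, u, theta) for u in grid])]

rng = np.random.default_rng(seed=0)
gamma, alpha = 0.95, 1e-3   # discount factor and learning rate (assumed)
x = 1.0
for _ in range(5000):
    u = greedy_input(x, theta)
    x_next = a_true * x + b_true * u + 0.01 * rng.standard_normal()
    stage_cost = x**2 + 0.1 * u**2
    # Semi-gradient Q-learning: TD error using the greedy value at x_next.
    td_err = (stage_cost
              + gamma * q_value(x_next, greedy_input(x_next, theta), theta)
              - q_value(x, u, theta))
    grad = np.array([(a_hat * x + b_hat * u)**2])   # dQ/dtheta
    theta = theta + alpha * td_err * grad
    x = x_next

print("learned terminal weight:", theta[0])
```

In the setting the abstract describes, the single terminal weight would be replaced by the full parametrization of a horizon-N MPC problem with constraints, the grid search by a numerical optimizer, and the single learner by one agent per control area updating its local parameters with limited information exchange between neighbours.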
