The growing penetration of renewable energy sources (RES) in power networks introduces significant challenges in load frequency control (LFC). Uncertainty in power generation makes load balancing difficult, leading to frequency fluctuations that can cause equipment damage and blackouts. Moreover, the large-scale, spatially distributed nature of modern power systems calls for a multi-agent control approach. Traditional PID-based controllers are ill-equipped to handle the uncertainties introduced by RES, while stochastic and robust model predictive control (MPC) methods, though capable of addressing small uncertainties, are often overly conservative. Reinforcement learning (RL), in turn, offers adaptability but lacks interpretability and explicit constraint handling.

This thesis presents a distributed control framework that integrates MPC and RL to address these challenges. Parametric uncertainties are incorporated into the system dynamics to capture the stochasticity introduced by RES. At the core of the approach is a parameterized MPC scheme that approximates the RL value function, enabling the controller to learn to avoid constraint violations while optimizing performance by driving state deviations from nominal operating conditions to zero. A distributed Q-learning scheme is used to learn the parametrization; it reduces the need for extensive information sharing, which benefits cybersecurity, and enables learning even with imperfect initial knowledge of the system dynamics.

The proposed framework is evaluated in simulations of a three-area power network and compared against stochastic MPC and a deep deterministic policy gradient (DDPG) learning method. Results show that the proposed approach balances adaptability, performance and interpretability, and successfully handles constraints. It outperforms sample-based stochastic MPC in cost, computation time and constraint handling, and outperforms DDPG in performance, constraint handling and sample efficiency.
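To make the core mechanism concrete, the sketch below illustrates, in heavily simplified form, how the parameters of a value-function approximator can be tuned with Q-learning. It is not the thesis implementation: the quadratic surrogate q_theta, the toy two-state dynamics, the candidate-action search, and all names (grad_q_theta, greedy_action) are hypothetical stand-ins. In the actual scheme, each Q-value evaluation requires solving the parameterized MPC problem, and the updates are carried out in a distributed fashion across the network areas.

```python
# Minimal sketch of Q-learning on the parameters of a value-function
# approximator, assuming a toy quadratic surrogate in place of the
# parameterized MPC solve used in the thesis.
import numpy as np

rng = np.random.default_rng(0)
gamma, alpha = 0.95, 1e-3            # discount factor and learning rate

A = 0.9 * np.eye(2)                  # toy state-transition matrix
B = np.array([[0.5], [0.2]])         # toy input matrix

def q_theta(theta, s, a):
    """Surrogate Q-value, quadratic in state and action (hypothetical)."""
    return theta[0] * (s @ s) + theta[1] * (a @ a) + theta[2]

def grad_q_theta(theta, s, a):
    """Gradient of the surrogate Q-value with respect to theta."""
    return np.array([s @ s, a @ a, 1.0])

def greedy_action(theta, s, candidates):
    """Crude minimization over a fixed candidate set; the real scheme
    obtains the greedy action from the MPC solution itself."""
    return min(candidates, key=lambda a: q_theta(theta, s, a))

theta = np.array([1.0, 1.0, 0.0])    # imperfect initial parametrization
candidates = [np.array([u]) for u in np.linspace(-1.0, 1.0, 21)]
s = rng.standard_normal(2)

for _ in range(200):
    a = greedy_action(theta, s, candidates)
    # Additive noise stands in for the RES-induced uncertainty.
    s_next = A @ s + B @ a + 0.05 * rng.standard_normal(2)
    stage_cost = s @ s + a @ a       # penalize deviations from nominal
    # Temporal-difference error using the greedy Q-value at the next state.
    a_next = greedy_action(theta, s_next, candidates)
    td_error = (stage_cost + gamma * q_theta(theta, s_next, a_next)
                - q_theta(theta, s, a))
    # Semi-gradient Q-learning update of the parametrization.
    theta += alpha * td_error * grad_q_theta(theta, s, a)
    s = s_next

print("learned parameters:", theta)
```

In the distributed setting described above, each area would hold its own parameters and perform such updates locally, exchanging only limited information with neighboring areas rather than sharing full system state.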