Optimal energy system scheduling using a constraint-aware reinforcement learning algorithm

Journal Article (2023)
Author(s)

Shengren Hou (TU Delft - Intelligent Electrical Power Grids)

Pedro Vergara Barrios (TU Delft - Intelligent Electrical Power Grids)

Edgar Mauricio Salazar Duque (Eindhoven University of Technology)

Peter Palensky (TU Delft - Intelligent Electrical Power Grids)

Research Group
Intelligent Electrical Power Grids
Copyright
© 2023 H. Shengren, P.P. Vergara Barrios, Edgar Mauricio Salazar Duque, P. Palensky
DOI related publication
https://doi.org/10.1016/j.ijepes.2023.109230
Publication Year
2023
Language
English
Volume number
152
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The massive integration of renewable-based distributed energy resources (DERs) inherently increases the energy system’s complexity, especially when it comes to defining its operational schedule. Deep reinforcement learning (DRL) algorithms have emerged as a promising solution due to their data-driven and model-free features. However, current DRL algorithms fail to enforce rigorous operational constraints (e.g., power balance and ramping up or down constraints), which limits their implementation in real systems. To overcome this, this paper proposes a DRL algorithm, MIP-DQN, capable of strictly enforcing all operational constraints in the action space, ensuring the feasibility of the defined schedule in real-time operation. This is done by leveraging recent optimization advances for deep neural networks (DNNs) that allow their representation as a mixed-integer programming (MIP) formulation, which in turn allows any action space constraints to be considered. Comprehensive numerical simulations show that the proposed algorithm outperforms existing state-of-the-art DRL algorithms, obtaining a lower error when compared with the optimal global solution (upper boundary) obtained after solving a mathematical programming formulation with perfect forecast information, while strictly enforcing all operational constraints (even in unseen test days).
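To make the idea of representing a DNN as a MIP more concrete, the sketch below checks the widely used big-M encoding of a single ReLU unit. The abstract does not state which encoding MIP-DQN uses, so this encoding is an assumption made for illustration, and the function names `relu_bigm_feasible` and `feasible_outputs` and the bounds `lower` and `upper` are hypothetical. Once every ReLU is written with linear constraints and one binary variable, operational constraints on the action (e.g., power balance) could be added as further linear constraints and handed to a MIP solver.

```python
def relu_bigm_feasible(z, y, b, lower, upper, tol=1e-9):
    """Check the big-M constraints tying a ReLU output y to its input z.

    Assumes known bounds lower <= z <= upper with lower < 0 < upper.
    b is the binary variable: b = 1 means the unit is active (y = z),
    b = 0 means it is inactive (y = 0).
    """
    return (
        b in (0, 1)
        and y >= -tol                          # y >= 0
        and y >= z - tol                       # y >= z
        and y <= z - lower * (1 - b) + tol     # y <= z when b = 1
        and y <= upper * b + tol               # y <= 0 when b = 0
    )


def feasible_outputs(z, lower, upper, candidates):
    """Return the candidate outputs that satisfy the constraints for some b."""
    return sorted(
        {
            y
            for y in candidates
            for b in (0, 1)
            if relu_bigm_feasible(z, y, b, lower, upper)
        }
    )


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.5, 3.0]
    # For each pre-activation, only max(z, 0) should remain feasible.
    print(feasible_outputs(1.5, -2.0, 3.0, grid))
    print(feasible_outputs(-1.0, -2.0, 3.0, grid))
```

For any pre-activation within the bounds, the only output that satisfies the constraints is max(z, 0), which is why the linear constraints plus one binary variable reproduce the ReLU exactly rather than approximately.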