A Mix-Integer Programming Based Deep Reinforcement Learning Framework for Optimal Dispatch of Energy Storage System in Distribution Networks
H. Hou (TU Delft - Intelligent Electrical Power Grids)
Edgar Mauricio Salazar (Eindhoven University of Technology)
Peter Palensky (TU Delft - Electrical Sustainable Energy)
Qixin Chen (Tsinghua University)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The optimal dispatch of energy storage systems (ESSs) in distribution networks poses significant challenges, primarily due to uncertainties in dynamic pricing, fluctuating demand, and the inherent variability of renewable energy sources. By exploiting the generalization capabilities of deep neural networks (DNNs), deep reinforcement learning (DRL) algorithms can learn good-quality control models that adapt to the stochastic nature of distribution networks. Nevertheless, the practical deployment of DRL algorithms is often hampered by their limited capacity to satisfy operational constraints in real time, a crucial requirement for ensuring the reliability and feasibility of control actions during online operation. This paper introduces a framework, named mixed-integer programming based deep reinforcement learning (MIP-DRL), to overcome these limitations. The proposed MIP-DRL framework rigorously enforces operational constraints for the optimal dispatch of ESSs during online execution. The framework trains a Q-function represented by DNNs, which is subsequently reformulated as a mixed-integer programming (MIP) problem. This combination allows operational constraints to be integrated directly into the decision-making process. The effectiveness of the proposed MIP-DRL framework is validated through numerical simulations, which demonstrate that it enforces all operational constraints, achieves high-quality dispatch decisions, and shows an advantage over existing DRL algorithms.
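The idea of representing a trained Q-function as a MIP and maximizing it under operational constraints can be illustrated with a minimal sketch. It is not the paper's formulation. It assumes a hypothetical one-hidden-layer ReLU Q-network with hand-picked weights, a scalar state (the state of charge, SOC), a scalar action (charging power), and a simple SOC-bound constraint; all names and numbers are illustrative. Each ReLU unit gets one binary variable, as in the common big-M encoding of ReLU networks. Because the example is tiny, the binaries are enumerated in plain Python instead of being passed to a MIP solver.

```python
from itertools import product

# Hypothetical Q-network (illustrative weights, not from the paper):
# Q(s, a) = C + sum_j V[j] * relu(WS[j] * s + WA[j] * a + B[j])
WS = [0.5, -0.3, 0.8]   # state weights
WA = [1.0, -1.0, 0.5]   # action weights
B = [0.0, 0.2, -0.1]    # hidden biases
V = [1.0, 0.5, -2.0]    # output weights
C = 0.0                 # output bias


def q_value(s, a):
    """Direct forward pass of the toy ReLU Q-network."""
    return C + sum(v * max(0.0, ws * s + wa * a + b)
                   for ws, wa, b, v in zip(WS, WA, B, V))


def feasible_actions(soc, p_max=1.0, soc_min=0.2, soc_max=0.9, gain=0.5):
    """Assumed operational constraint: next SOC = soc + gain * a must stay
    in [soc_min, soc_max] and |a| <= p_max. Returns the action interval."""
    lo = max(-p_max, (soc_min - soc) / gain)
    hi = min(p_max, (soc_max - soc) / gain)
    return lo, hi


def best_constrained_action(s, lo, hi):
    """Maximize Q(s, a) over a in [lo, hi] by enumerating the binary
    activation pattern d_j of every ReLU (d_j = 1: unit active).
    With the binaries fixed, the big-M constraints reduce to linear
    bounds on a, and Q is linear in a, so an interval endpoint is optimal."""
    best_q, best_a = None, None
    for pattern in product((0, 1), repeat=len(WS)):
        a_lo, a_hi = lo, hi
        slope, intercept = 0.0, C
        feasible = True
        for d, ws, wa, b, v in zip(pattern, WS, WA, B, V):
            k = ws * s + b  # pre-activation is z = wa * a + k
            if wa == 0.0:
                # z does not depend on a; the pattern must match sign(k)
                if (d and k < 0.0) or (not d and k > 0.0):
                    feasible = False
                    break
            else:
                bound = -k / wa  # value of a where z crosses zero
                if (wa > 0.0) == bool(d):
                    a_lo = max(a_lo, bound)   # constraint reads a >= bound
                else:
                    a_hi = min(a_hi, bound)   # constraint reads a <= bound
            if d:  # active unit contributes v * z to Q
                slope += v * wa
                intercept += v * k
        if not feasible or a_lo > a_hi:
            continue
        for a in (a_lo, a_hi):
            q = slope * a + intercept
            if best_q is None or q > best_q:
                best_q, best_a = q, a
    return best_a, best_q


soc = 0.8
lo, hi = feasible_actions(soc)
action, q_best = best_constrained_action(soc, lo, hi)
```

Here `action` is guaranteed to lie inside `[lo, hi]`, since the constraint is part of the optimization problem and is not checked after the fact. For realistic network sizes, enumeration is impractical, and the same binary-variable encoding would instead be handed to a MIP solver.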