Configuration of the Actor and Critic Network of the Deep Reinforcement Learning controller for Multi-Energy Storage System
Paula Páramo-Balsa (University of Seville)
Francisco Gonzalez-Longatt (University of South-Eastern Norway)
Martha Nohemi Acosta Montalvo (University of South-Eastern Norway)
Jose L. Torres (TU Delft - Intelligent Electrical Power Grids)
Peter Palensky (TU Delft - Intelligent Electrical Power Grids)
Francisco Sánchez (Loughborough University)
Juan Manuel Roldan-Fernandez (University of Seville)
Manuel Burgos-Payán (University of Seville)
Abstract
The computational burden and the time required to train a deep reinforcement learning (DRL) agent can be appreciable, especially in the case of a DRL controller used for frequency control of a multi-electrical energy storage system (MEESS). This paper presents an assessment of four training configurations of the actor and critic networks to determine which configuration produces the lowest computational time, considering the specific case of frequency control of an MEESS. The training configuration cases are defined over two processing units, CPU and GPU, and are evaluated under serial and parallel computing using the MATLAB® 2020b Parallel Computing Toolbox. The agent used for this assessment is the Deep Deterministic Policy Gradient (DDPG) agent. The environment represents the dynamic model that provides enhanced frequency response to the power system by controlling the state of charge of the energy storage systems. Simulation results demonstrate that the configuration that most reduces the computational time is training both the actor and critic networks on the CPU using parallel computing.
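As a minimal sketch (not the authors' code), the configuration choices described above can be expressed in MATLAB's Reinforcement Learning Toolbox by setting the UseDevice option of each network representation to 'cpu' or 'gpu' and toggling UseParallel in the training options; the environment and network layouts below are simple placeholders, not the MEESS frequency-control model.

```matlab
% Sketch: CPU/GPU placement of actor and critic and serial/parallel DDPG training.
% Placeholder environment, not the MEESS frequency-control dynamic model.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Critic Q(s,a): state and action paths joined by an addition layer
statePath  = [featureInputLayer(obsInfo.Dimension(1),'Name','state')
              fullyConnectedLayer(64,'Name','fcState')];
actionPath = [featureInputLayer(actInfo.Dimension(1),'Name','action')
              fullyConnectedLayer(64,'Name','fcAction')];
commonPath = [additionLayer(2,'Name','add')
              reluLayer('Name','relu')
              fullyConnectedLayer(1,'Name','qValue')];
criticNet = layerGraph(statePath);
criticNet = addLayers(criticNet, actionPath);
criticNet = addLayers(criticNet, commonPath);
criticNet = connectLayers(criticNet,'fcState','add/in1');
criticNet = connectLayers(criticNet,'fcAction','add/in2');

% Actor: deterministic policy network
actorNet = [featureInputLayer(obsInfo.Dimension(1),'Name','state')
            fullyConnectedLayer(64,'Name','fc1')
            reluLayer('Name','relu1')
            fullyConnectedLayer(actInfo.Dimension(1),'Name','action')];

% Processing unit per network: set 'UseDevice' to 'cpu' or 'gpu'
criticOpts = rlRepresentationOptions('LearnRate',1e-3,'UseDevice','cpu');
actorOpts  = rlRepresentationOptions('LearnRate',1e-4,'UseDevice','cpu');

critic = rlQValueRepresentation(criticNet,obsInfo,actInfo, ...
    'Observation',{'state'},'Action',{'action'},criticOpts);
actor  = rlDeterministicActorRepresentation(actorNet,obsInfo,actInfo, ...
    'Observation',{'state'},'Action',{'action'},actorOpts);

agent = rlDDPGAgent(actor,critic,rlDDPGAgentOptions('SampleTime',0.1));

% Serial vs. parallel training: UseParallel requires the Parallel Computing Toolbox
trainOpts = rlTrainingOptions('MaxEpisodes',200,'UseParallel',true);
trainOpts.ParallelizationOptions.Mode = 'async';

trainingStats = train(agent,env,trainOpts);
```

Under this sketch, the combinations of the UseDevice setting and the UseParallel flag reproduce the kind of CPU/GPU and serial/parallel training cases compared in the paper; the learning rates, layer sizes, and episode count are illustrative values only.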