Configuration of the Actor and Critic Network of the Deep Reinforcement Learning controller for Multi-Energy Storage System
Paula Páramo-Balsa (University of Seville)
Francisco Gonzalez-Longatt (University of South-Eastern Norway)
Martha Nohemi Acosta Montalvo (University of South-Eastern Norway)
Jose L. Torres (TU Delft - Intelligent Electrical Power Grids)
Peter Palensky (TU Delft - Intelligent Electrical Power Grids)
Francisco Sánchez (Loughborough University)
Juan Manuel Roldan-Fernandez (University of Seville)
Manuel Burgos-Payán (University of Seville)
Abstract
The computational burden and the time required to train a deep reinforcement learning (DRL) agent can be appreciable, especially in the case of a DRL controller used for frequency control of a multi-electrical energy storage system (MEESS). This paper presents an assessment of four training configurations of the actor and critic networks to determine which configuration produces the lowest computational time, considering the specific case of frequency control of an MEESS. The training configuration cases are defined over two processing units, CPU and GPU, and are evaluated under serial and parallel computing using the MATLAB® 2020b Parallel Computing Toolbox. The agent used for this assessment is the Deep Deterministic Policy Gradient (DDPG) agent. The environment represents the dynamic model that provides enhanced frequency response to the power system by controlling the state of charge of the energy storage systems. Simulation results demonstrate that the configuration that most reduces the computational time is training both the actor and critic networks on the CPU using parallel computing.
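As a minimal sketch (not the authors' code), the configuration choices described above can be expressed in MATLAB's Reinforcement Learning Toolbox by setting the UseDevice option of each network representation to 'cpu' or 'gpu' and toggling UseParallel in the training options; the environment and network layouts below are simple placeholders, not the MEESS frequency-control model.

```matlab
% Sketch: CPU/GPU placement of actor and critic and serial/parallel DDPG training.
% Placeholder environment, not the MEESS frequency-control dynamic model.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Critic Q(s,a): state and action paths joined by an addition layer
statePath  = [featureInputLayer(obsInfo.Dimension(1),'Name','state')
              fullyConnectedLayer(64,'Name','fcState')];
actionPath = [featureInputLayer(actInfo.Dimension(1),'Name','action')
              fullyConnectedLayer(64,'Name','fcAction')];
commonPath = [additionLayer(2,'Name','add')
              reluLayer('Name','relu')
              fullyConnectedLayer(1,'Name','qValue')];
criticNet = layerGraph(statePath);
criticNet = addLayers(criticNet, actionPath);
criticNet = addLayers(criticNet, commonPath);
criticNet = connectLayers(criticNet,'fcState','add/in1');
criticNet = connectLayers(criticNet,'fcAction','add/in2');

% Actor: deterministic policy network
actorNet = [featureInputLayer(obsInfo.Dimension(1),'Name','state')
            fullyConnectedLayer(64,'Name','fc1')
            reluLayer('Name','relu1')
            fullyConnectedLayer(actInfo.Dimension(1),'Name','action')];

% Processing unit per network: set 'UseDevice' to 'cpu' or 'gpu'
criticOpts = rlRepresentationOptions('LearnRate',1e-3,'UseDevice','cpu');
actorOpts  = rlRepresentationOptions('LearnRate',1e-4,'UseDevice','cpu');

critic = rlQValueRepresentation(criticNet,obsInfo,actInfo, ...
    'Observation',{'state'},'Action',{'action'},criticOpts);
actor  = rlDeterministicActorRepresentation(actorNet,obsInfo,actInfo, ...
    'Observation',{'state'},'Action',{'action'},actorOpts);

agent = rlDDPGAgent(actor,critic,rlDDPGAgentOptions('SampleTime',0.1));

% Serial vs. parallel training: UseParallel requires the Parallel Computing Toolbox
trainOpts = rlTrainingOptions('MaxEpisodes',200,'UseParallel',true);
trainOpts.ParallelizationOptions.Mode = 'async';

trainingStats = train(agent,env,trainOpts);
```

Under this sketch, the combinations of the UseDevice setting and the UseParallel flag reproduce the kind of CPU/GPU and serial/parallel training cases compared in the paper; the learning rates, layer sizes, and episode count are illustrative values only.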