Safe Reinforcement Learning for V2G-Enabled Electric Vehicle Aggregators
Ruben Eland (Student TU Delft)
S. Orfanoudakis (TU Delft - Electrical Engineering, Mathematics and Computer Science)
P.P. Vergara Barrios (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Abstract
The increasing penetration of Electric Vehicles (EVs) and renewable energy sources is placing significant stress on existing power grid infrastructure. This work investigates the application of vehicle-to-grid (V2G)-enabled smart charging in workplace environments from the perspective of EV aggregators, using real-world charging data from Dutch business parking lots. To address the limitations of conventional deep Reinforcement Learning (RL) methods in enforcing operational constraints, we propose a Safe RL method using the Constrained Variational Policy Optimization (CVPO) algorithm, specifically designed to reduce constraint violations and enhance reliability. Empirical results show that CVPO outperforms classic RL baselines and rule-based policies, closely approximating the performance of an optimal offline benchmark while exhibiting strong generalization to unseen scenarios.