As the number of Electric Vehicles (EVs) and renewable energy sources (RES) increases rapidly, power grids struggle to adapt. In the coming years, power system flexibility is urgently required to use the limited capacity of the existing infrastructure efficiently. Over the long term, flexibility will remain essential to account for the variable and uncertain electricity production from RES. EVs have large batteries that are often only partly used for daily travel, particularly in densely populated areas. Smart charging and Vehicle-to-Grid (V2G) can harness the flexibility of EVs to support grid balancing and congestion management.
This thesis investigates the smart charging and V2G potential for EV aggregators, with a focus on workplace charging. State-of-the-art Reinforcement Learning (RL) techniques are applied to a case study involving a business parking lot. The objective is to maximize the profits of the EV aggregator while satisfying EV user constraints and the transformer power limit. The modeled EV behavior is based on data from real EV measurements in the Netherlands. The real-time charging optimization problem is characterized by high uncertainty, and RL is widely considered a promising approach for solving such problems. However, the latest Deep RL algorithms often struggle to guarantee constraint-satisfying behavior. Safe RL, an emerging subfield, aims to reduce constraint violations in the learned behavior, thus making algorithms ‘safer’.
This thesis applies recent Safe RL algorithms and compares their performance to Deep RL baselines and conventional As-Fast-As-Possible (AFAP) charging. The proposed method, Constrained Variational Policy Optimization (CVPO), achieved performance comparable to the offline optimum computed with the Gurobi solver in simulation scenarios where sufficient transformer capacity was available and overloads could not occur. The learned behavior generalized well to unseen levels of charger occupancy. However, in scenarios with more inflexible loads and a lower transformer power limit, the risk of transformer overloading made the problem more tightly constrained, and CVPO’s performance declined.
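To make the AFAP baseline concrete, the following is a minimal sketch of one AFAP charging step. It is an illustration under stated assumptions, not the implementation used in this thesis: the names `EV`, `afap_step`, `energy_needed_kwh`, `max_charge_kw`, and the 15-minute time step are hypothetical, EVs are served in list order, and charging is capped at the remaining transformer headroom (a plain AFAP strategy may ignore that limit).

```python
from dataclasses import dataclass


@dataclass
class EV:
    energy_needed_kwh: float  # energy still required before departure (assumed field)
    max_charge_kw: float      # charger/vehicle power limit (assumed field)


def afap_step(evs, transformer_limit_kw, inflexible_load_kw, dt_h=0.25):
    """Return the charging power (kW) assigned to each EV for one time step."""
    # Capacity left for charging after the inflexible load is served.
    headroom = max(0.0, transformer_limit_kw - inflexible_load_kw)
    powers = []
    for ev in evs:
        # Charge at full power, but never more than the EV still needs
        # in this step or than the transformer can still supply.
        p = min(ev.max_charge_kw, ev.energy_needed_kwh / dt_h, headroom)
        p = max(0.0, p)
        powers.append(p)
        headroom -= p
        ev.energy_needed_kwh -= p * dt_h
    return powers


if __name__ == "__main__":
    fleet = [EV(energy_needed_kwh=10.0, max_charge_kw=11.0),
             EV(energy_needed_kwh=2.0, max_charge_kw=11.0)]
    print(afap_step(fleet, transformer_limit_kw=20.0, inflexible_load_kw=5.0))
```

Because AFAP takes no account of prices or future arrivals, it serves as a simple reference point against which learned smart charging policies can be compared.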