EV Charging Strategies through Power Setpoint Tracking: A Reinforcement Learning Approach

Abstract

The transportation sector continues to decarbonize as Electric Vehicles (EVs) replace a growing number of gasoline and diesel cars every year. However, integrating large numbers of EVs introduces complexities in energy distribution and grid stability. Charge Point Operators (CPOs), positioned at the intersection of EVs and the grid, play a critical role in managing these complexities. They ensure that the charging infrastructure meets the needs of both EV users and the grid, which highlights the importance of smart charging strategies.

In this thesis, a smart charging approach is proposed from the point of view of a CPO. The proposed approach aims to optimize the charging schedules for EVs parked at a commercial building's parking lot. The objective of the optimization problem is to minimize the Power Setpoint Tracking (PST) error, that is, the deviation between the energy the CPO contracted in the day-ahead market and the aggregated consumption of the charging stations on the following day. This optimization involves complex sequential decision-making, where the uncertain nature of EV arrivals and departures demands a fast and adaptive solution. Thus, this thesis proposes a Markov Decision Process (MDP) formulation and solves it using the Deep Deterministic Policy Gradient (DDPG) algorithm to minimize the PST error by scheduling the charging of EVs. DDPG is chosen for its ability to efficiently handle complex problems with continuous state and action spaces, which suits the uncertainties inherent to EV arrivals and the charging process. Additionally, EV arrival and departure patterns at a commercial building's parking lot are usually consistent, which further supports DDPG as a strong alternative.
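
To make the tracking objective concrete, the following is a minimal sketch of how a PST error could be computed over one day. The abstract does not state the exact error metric, so the sum of squared deviations used here is an assumption, and the function name `pst_error` and its arguments are hypothetical.

```python
from typing import Sequence


def pst_error(
    setpoints_kw: Sequence[float],
    charger_power_kw: Sequence[Sequence[float]],
) -> float:
    """Sum of squared gaps between the setpoint and the aggregated
    charger power at each time step.

    Assumption: a squared-error metric; the thesis may define the
    PST error differently.
    """
    if len(setpoints_kw) != len(charger_power_kw):
        raise ValueError("one row of charger powers is needed per time step")

    total = 0.0
    for setpoint, powers in zip(setpoints_kw, charger_power_kw):
        aggregated = sum(powers)  # total draw of all chargers at this step
        total += (setpoint - aggregated) ** 2
    return total


# Two time steps, two chargers: the first step tracks the setpoint exactly,
# the second falls 2 kW short.
example_error = pst_error([10.0, 8.0], [[4.0, 6.0], [3.0, 3.0]])
```

In an MDP formulation of this kind, the negative per-step squared deviation is a natural candidate for the reward signal, although the reward actually used in the thesis is not given in the abstract.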

Comparing the proposed DDPG approach against benchmarks, namely the uncontrolled "charge as fast as possible" (CAFAP) algorithm and the optimal solution obtained through a Mixed Integer Non-Linear Programming (MINLP) formulation, shows DDPG's superior performance on several metrics. Specifically, it outperforms the CAFAP algorithm by reducing the PST error by 34% on average for a parking lot with 10 chargers over a 12-hour charging day. This highlights DDPG's efficacy in optimizing EV charging schedules compared with the CAFAP algorithm. Moreover, the DDPG model can be trained offline with historical data and deployed online once trained. This allows for rapid, dynamic rescheduling of charging in real-world operations and offers a speed advantage over the theoretically optimal solution, which requires prior knowledge of the arrival and departure times and the State of Charge (SoC) of the EVs. All experiments validating these findings were conducted within EV2Gym, a Gym environment specifically designed to simulate EV charging scenarios.
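
As an illustration of the uncontrolled baseline, the sketch below builds a CAFAP load profile in which every EV draws its maximum power from arrival until its energy need is met or it departs. This is not the EV2Gym implementation; the dictionary keys, the function names, and the relative-reduction formula are all assumptions made for this example.

```python
from typing import Dict, List


def cafap_profile(evs: List[Dict[str, float]], n_steps: int, dt_h: float = 1.0) -> List[float]:
    """Aggregated charging power (kW) per time step under CAFAP.

    Each EV is a dict with hypothetical keys: 'arrival' and 'departure'
    (time-step indices), 'energy_needed_kwh', and 'max_power_kw'.
    """
    profile = [0.0] * n_steps
    for ev in evs:
        remaining_kwh = ev["energy_needed_kwh"]
        last_step = min(int(ev["departure"]), n_steps)
        for t in range(int(ev["arrival"]), last_step):
            if remaining_kwh <= 0:
                break  # battery full, EV stops drawing power
            # Charge at full power, capped by what is still needed this step.
            power_kw = min(ev["max_power_kw"], remaining_kwh / dt_h)
            profile[t] += power_kw
            remaining_kwh -= power_kw * dt_h
    return profile


def relative_reduction(baseline_error: float, candidate_error: float) -> float:
    """Fraction by which the candidate lowers the baseline error (assumed metric)."""
    return (baseline_error - candidate_error) / baseline_error


evs = [
    {"arrival": 0, "departure": 3, "energy_needed_kwh": 15.0, "max_power_kw": 10.0},
    {"arrival": 1, "departure": 4, "energy_needed_kwh": 5.0, "max_power_kw": 10.0},
]
profile = cafap_profile(evs, n_steps=4)
```

Because CAFAP front-loads all charging, its profile ignores the setpoint entirely, which is the behavior a learned scheduling policy is meant to improve on.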

Lastly, this thesis contributes to the field by demonstrating how RL, through the use of DDPG, can optimize PST for EV charging in a commercial building's parking lot. By offering a detailed comparison with other algorithms and showcasing the scalability and adaptability of DDPG, the research provides valuable insights for CPOs and stakeholders in the energy sector.