A Reinforcement Learning-based framework for optimizing mobile charging pod operations
Mohd Aiman Khan (KTH Royal Institute of Technology)
Wilco Burghout (KTH Royal Institute of Technology)
Erik Jenelius (KTH Royal Institute of Technology)
Oded Cats (TU Delft - Transport and Planning)
Matej Cebecauer (KTH Royal Institute of Technology)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The rise of autonomous electric vehicles (AEVs) presents new challenges and opportunities for efficient and flexible charging infrastructure. This study proposes a reinforcement learning (RL) based framework for optimizing the control and operation of mobile autonomous charging pods (MAPs), which sustain AEV operations through dynamic charging. We formulate a time- and energy-aware Markov Decision Process (MDP) that maximizes the energy delivered and the number of AEVs serviced while minimizing the energy consumed, thereby increasing efficiency. We integrate this framework with SUMO to enable realistic MAP-AEV interactions. A Proximal Policy Optimization (PPO) algorithm was used to train an agent on this MDP and identify optimal control strategies for initiating and terminating charging and for balancing the network. The results show that the PPO agent can service around 175 AEVs with an efficiency of 91.5%, a 25% improvement over baseline greedy heuristics. Moreover, the battery capacities of AEVs can be reduced by up to 26% without compromising performance. The simulation results demonstrate the potential of the proposed method to provide flexible, scalable charging for future transport systems.
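To make the MDP objective concrete, the abstract's trade-off (maximize energy delivered and AEVs serviced, minimize energy consumed) can be sketched as a per-step scalar reward. The weights, field names, and efficiency definition below are illustrative assumptions, not the paper's actual formulation:

```python
from dataclasses import dataclass

@dataclass
class StepOutcome:
    """Hypothetical per-step outcome for one MAP (names are assumptions)."""
    energy_delivered_kwh: float   # energy transferred to AEVs this step
    aevs_serviced: int            # AEVs fully serviced this step
    energy_consumed_kwh: float    # energy the MAP spent driving and charging

# Hypothetical weights trading off the three objectives.
W_DELIVERED = 1.0
W_SERVICED = 5.0
W_CONSUMED = 0.5

def reward(o: StepOutcome) -> float:
    """Reward energy delivered and vehicles serviced; penalize energy spent."""
    return (W_DELIVERED * o.energy_delivered_kwh
            + W_SERVICED * o.aevs_serviced
            - W_CONSUMED * o.energy_consumed_kwh)

def efficiency(delivered_kwh: float, consumed_kwh: float) -> float:
    """Fraction of total energy that reached AEVs (cf. the 91.5% figure)."""
    total = delivered_kwh + consumed_kwh
    return delivered_kwh / total if total > 0 else 0.0
```

A PPO agent (e.g. via a standard RL library) would maximize the discounted sum of such rewards, with the SUMO simulation supplying the step outcomes.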
Files
File under embargo until 12-09-2026