Reward Definitions in Reinforcement Learning for Traffic Light Control
C.P. Jansen (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M. Suau de Castro – Mentor (TU Delft - Interactive Intelligence)
Frans A. Oliehoek – Graduation committee member (TU Delft - Interactive Intelligence)
M.A. Migut – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Traffic congestion is a large-scale problem that affects many people. Using Reinforcement Learning to find a traffic light control policy can ease congestion and decrease travel time for vehicles. This paper looks specifically at the effect of training agents with different reward functions. We highlight that the learnability of a reward function and its alignment with the agent's final goal are the most important factors when designing a reward definition. Finally, we propose a reward function to use for future