Q-value reuse between state abstractions for traffic light control

Bachelor Thesis (2020)
Author(s)

E.F.M. Kuhn (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jinke He – Mentor (TU Delft - Interactive Intelligence)

R.A.N. Starre – Mentor (TU Delft - Interactive Intelligence)

F.A. Oliehoek – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2020 Emanuel Kuhn
Publication Year
2020
Language
English
Graduation Date
22-06-2020
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Previous research in reinforcement learning for traffic light control has used various state abstractions: some use feature vectors, while others use matrices of car positions. This paper first compares a simple feature vector, consisting of only the queue size per incoming lane, to a matrix of car positions. It then investigates whether knowledge can be transferred from a simple agent using the feature vector abstraction to a more complex agent that uses the position matrix abstraction. We find that training cannot be sped up by first training an agent with the feature vector abstraction and then reusing its Q-function to train an agent with the position matrix abstraction: the simple agent does not take considerably fewer samples to converge, and the total time needed to first train the simple agent and then transfer exceeds the time needed to train the complex agent from scratch.
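To make the transfer setup concrete, the following is a minimal Python sketch of one plausible reading of it. All names, the matrix shape, and the tabular Q representation are illustrative assumptions, not code from the thesis: the coarse abstraction counts queued cars per incoming lane, the detailed abstraction is a binary occupancy matrix, and the complex agent's Q-table is seeded by looking up each detailed state's coarse counterpart in the pre-trained simple Q-function.

    import numpy as np

    # Hypothetical sketch; names, shapes, and representations are assumptions.

    def queue_feature_vector(lane_queues):
        """Simple abstraction: queue size per incoming lane, e.g. (3, 0, 5, 1)."""
        return tuple(lane_queues)

    def position_matrix(car_positions, shape=(4, 16)):
        """Complex abstraction: binary matrix marking occupied road cells."""
        m = np.zeros(shape, dtype=np.int8)
        for lane, cell in car_positions:
            m[lane, cell] = 1
        return m

    def queues_from_matrix(matrix):
        """Map a position matrix down to the coarse queue-size abstraction."""
        return tuple(int(row.sum()) for row in matrix)

    def init_q_from_simple(q_simple, matrix_states, actions):
        """Q-value reuse: initialize the complex agent's Q-table from the
        pre-trained simple Q-function via the coarse counterpart of each
        detailed state. Unseen coarse states default to 0.0."""
        q_complex = {}
        for s in matrix_states:
            coarse = queues_from_matrix(s)
            q_complex[s.tobytes()] = {a: q_simple.get(coarse, {}).get(a, 0.0)
                                      for a in actions}
        return q_complex

Under this reading, the thesis's negative result says that the warm start provided by init_q_from_simple does not pay for itself: the cost of training q_simple plus the transfer exceeds training the position-matrix agent from scratch.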
