Automatic learning of cyclist's compliance for speed advice at intersections - a reinforcement learning-based approach

Conference Paper (2019)

Author(s)

Azita Dabiri (Transport and Planning, TU Delft - Mechanical Engineering)

Andreas Hegyi (Transport and Planning)

Serge Hoogendoorn (TU Delft - Civil Engineering & Geosciences)

Research Group

Team Bart De Schutter

DOI related publication

https://doi.org/10.1109/ITSC.2019.8916847 (final published version)

To reference this document use

https://resolver.tudelft.nl/uuid:d7739dcf-4ea2-489b-b386-2e064433b186

Publication Year

2019

Language

English

Article number

8916847

Pages (from-to)

2375-2380

ISBN (electronic)

9781538670248

Event

22nd IEEE International Conference on Intelligent Transportation Systems, ITSC 2019 (2019-10-27 - 2019-10-30), Auckland, New Zealand

Abstract

Although algorithms exist that give speed advice to cyclists approaching traffic lights with uncertain timing, they all need to know, and therefore assume, the cyclist's response to the advice in order to optimize it. To relax this assumption, this paper proposes an algorithm that combines reinforcement learning and planning: it learns the cyclist's reaction to the advice and uses this information to plan the best next advice on the fly. Rather than the single search procedure that is conventional in existing architectures, the algorithm uses two sample-based search procedures. This makes it possible to obtain an accurate local approximation of the action-value function despite the short computation time available in each decision epoch. The algorithm is tested in a simulation case study, which confirms both the impact of a proper initialisation of the action-value function and the importance of using two search procedures.
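
The general idea of learning a response model and then planning with it can be pictured with a minimal sketch. This is an illustration only, not the algorithm from the paper: it uses a single Monte Carlo evaluation rather than the two search procedures the abstract describes, and every name in it (ComplianceModel, estimate_action_values, best_advice), the stop penalty, the travel-time cost, and the uniform green-time window are assumptions made for the example.

```python
import random
from collections import defaultdict


class ComplianceModel:
    """Hypothetical empirical record of the speed ridden after each advice."""

    def __init__(self):
        # advised speed (m/s) -> list of speeds actually achieved (m/s)
        self.observed = defaultdict(list)

    def update(self, advice, achieved_speed):
        self.observed[advice].append(achieved_speed)

    def sample_speed(self, advice, rng):
        speeds = self.observed.get(advice)
        # Assumption: with no data yet, the cyclist is taken to comply fully.
        return rng.choice(speeds) if speeds else advice


def estimate_action_values(model, candidates, distance, green_window,
                           n_samples=200, stop_penalty=5.0, seed=0):
    """Sample-based estimate of the value of each candidate advice."""
    rng = random.Random(seed)
    q = {}
    for advice in candidates:
        total = 0.0
        for _ in range(n_samples):
            speed = model.sample_speed(advice, rng)
            arrival = distance / speed            # time to reach the stop line (s)
            green = rng.uniform(*green_window)    # uncertain switching time (s)
            if arrival < green:
                # Arrived on red: wait for green and pay an assumed stop penalty.
                total -= green + stop_penalty
            else:
                total -= arrival
        q[advice] = total / n_samples
    return q


def best_advice(q):
    """Return the advice with the highest estimated value."""
    return max(q, key=q.get)
```

In this sketch, calling update after each observed response changes what sample_speed returns, so the advice chosen by best_advice shifts as the model learns how far the cyclist's actual speed deviates from the advised one.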