Learning cycling styles using experimental trajectory data with Inverse Reinforcement Learning


Abstract

Cycling is an increasingly attractive transportation mode, thanks to its health and environmental benefits. Personalized travel assistance services can help make cycling more appealing by providing speed or route advice that can reduce travel time and increase safety while taking into account the personal preferences of cyclists. Due to its ability to learn agents' reward functions, Inverse Reinforcement Learning is a suitable algorithm for learning cycling preferences from data.
This thesis aims to describe cycling styles as a set of cycling preferences encoded as a reward function composed of a weighted sum of features. The weights associated with the features composing the reward function represent the importance given to each cycling preference and express the trade-off between different goals of a cyclist. Continuous-time Inverse Reinforcement Learning extracts the weights from empirical cyclists' trajectories collected during an experiment performed in Delft. During the experiment, cyclists were asked to cycle according to three different cycling styles: cautious, normal and aggressive. Differences between the weight sets extracted for each cycling style were analyzed by means of the Kruskal-Wallis statistical test and the K-Means clustering algorithm, and the averaged weights for each cycling style were used to simulate a set of test trajectories.
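The weighted-sum reward structure described above can be sketched as follows; the feature names and weight values here are purely illustrative assumptions, not taken from the thesis:

```python
def reward(weights, features):
    """Linear reward: r(s) = sum_i w_i * phi_i(s).

    The weights encode the trade-off between cycling preferences.
    """
    assert len(weights) == len(features)
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical features of a trajectory point: comfort (low acceleration),
# progress (speed) and safety margin (distance to the path edge).
features = [0.9, 0.4, 0.7]

# A "cautious" style might weight comfort and safety heavily, while an
# "aggressive" style might weight progress; all numbers are illustrative.
cautious_w = [1.0, 0.2, 1.5]
aggressive_w = [0.2, 1.8, 0.3]

print(reward(cautious_w, features))    # emphasizes comfort and safety
print(reward(aggressive_w, features))  # emphasizes progress
```

Inverse Reinforcement Learning recovers weight vectors like `cautious_w` from observed trajectories rather than specifying them by hand; the per-style weight sets can then be compared statistically, as done in the thesis.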
Simulations show that the reward function identified for a given cycling style reproduces test trajectories of that same style more closely than the reward functions identified for the other cycling styles. The statistical analysis shows that the weights of the cautious and aggressive cycling styles differ significantly and form separate clusters.