Optimizing driving entity switching of semi-automated vehicles under automation degradation

Bachelor Thesis (2021)

Author(s)

C. Bakos (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Y. Li – Mentor (TU Delft - Algorithmics)

Matthijs Spaan – Mentor (TU Delft - Algorithmics)

A. van Deursen – Graduation committee member (TU Delft - Software Technology)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Semi-autonomous vehicles, Reinforcement learning, Markov Decision Processes, Control authority, Intelligent transportation systems, Safety

To reference this document use:

https://resolver.tudelft.nl/uuid:b2a0aa8e-fffe-473e-97de-57e2174cdf90

Publication Year

2021

Language

English

Graduation Date

01-07-2021

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The transition to automated vehicles is gradual. Until full automation is available, something must mediate which driving entity, the human or the autonomous driving system (ADS), should be in control under the given circumstances. This research investigates switching between manual and automated driving in semi-autonomous vehicles when the ADS becomes unfit to drive. To this end, a simple simulated environment was created and a Markov decision process (MDP) model was formulated that accounts for sensor failures and for the vehicle leaving the operational design domain (ODD). A Deep Q-Network (DQN), a deep reinforcement learning (RL) algorithm, was trained and evaluated against a hand-curated, decision-tree-based baseline. The DQN-based policy did not reach the performance of the baseline. The conclusion drawn is that DQN with an intuition-based reward function cannot learn an optimal policy for this multi-objective decision problem.
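To make the setup in the abstract more concrete, the following is a minimal sketch of a control-switching MDP with a rule-based baseline and a learned policy. It is not the thesis environment or its method: the state layout, the reward values, the 10% degradation probability, and all function names (`step`, `baseline_policy`, `train_q`, `greedy_policy`) are assumptions made for illustration, and plain tabular Q-learning stands in for DQN so that the example stays small and dependency-free.

```python
import random

# Hypothetical toy model; none of these names or numbers come from the thesis.
# A state is (controller, ads_fit): who drives, and whether the ADS is fit to drive.
ACTIONS = ("keep", "switch")
STATES = [(c, fit) for c in ("human", "ads") for fit in (True, False)]


def step(state, action, rng):
    """Apply an action, return (next_state, reward) under assumed dynamics."""
    controller, ads_fit = state
    if action == "switch":
        controller = "human" if controller == "ads" else "ads"
    reward = 0.0
    if controller == "ads" and not ads_fit:
        reward -= 10.0  # assumed safety penalty: an unfit ADS is driving
    if controller == "human":
        reward -= 1.0   # assumed comfort cost of manual driving
    if action == "switch":
        reward -= 0.5   # assumed cost of handing over control
    # Assumed degradation model: ADS fitness flips with 10% probability.
    if rng.random() < 0.1:
        ads_fit = not ads_fit
    return (controller, ads_fit), reward


def baseline_policy(state):
    """Hand-written rule: let the ADS drive exactly when it is fit."""
    controller, ads_fit = state
    if controller == "ads" and not ads_fit:
        return "switch"
    if controller == "human" and ads_fit:
        return "switch"
    return "keep"


def train_q(steps=50_000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (a stand-in for DQN)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = ("human", True)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

Comparing `greedy_policy(train_q())` with `baseline_policy` state by state gives a small-scale analogue of the evaluation described above. In a toy this small the learned and hand-written policies may well coincide, so it illustrates the problem framing only and says nothing about the reported DQN result.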

Files

Optimizing_driving_entity_swit... (pdf)

(pdf | 1.54 Mb)

License info not available