Safe Navigation in Dense Traffic Scenarios using Reinforcement Learning as Global Guidance for a Model Predictive Controller

Master Thesis (2020)
Author(s)

A. Agarwal (TU Delft - Mechanical Engineering)

Contributor(s)

J. Alonso-Mora – Mentor (TU Delft - Learning & Autonomous Control)

B. Brito – Mentor (TU Delft - Learning & Autonomous Control)

Faculty
Mechanical Engineering
Copyright
© 2020 Achin Agarwal
Publication Year
2020
Language
English
Graduation Date
14-12-2020
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Vehicle Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The successful integration of autonomous vehicles (AVs) in human environments depends heavily on their ability to navigate safely and in a timely manner through dense traffic. Such conditions involve a diverse range of human behaviors, from cooperative drivers (willing to yield) to non-cooperative drivers (unwilling to yield), which must be identified without any explicit inter-vehicle communication. To maneuver through such conditions, an AV must not only compute a collision-free trajectory but also account for the effects of its actions on the surrounding agents in order to negotiate the navigation maneuver safely. Existing motion planning techniques fail in these environments because they suffer from one or more of the following drawbacks: they suffer from "the curse of dimensionality" as the number of agents grows (e.g., optimization-based methods); they do not account for interaction effects among the agents; or they provide no collision avoidance or trajectory feasibility guarantees (e.g., learning-based methods). In this work, we propose a novel navigation framework combining the strengths of learning-based and optimization-based algorithms. More specifically, we employ a Soft Actor-Critic agent to learn a continuous guidance policy that provides global guidance to an optimization-based planner, which generates feasible and collision-free trajectories. We evaluate our method in a highly interactive simulation environment, comparing it with two baseline approaches, a learning-based method and an optimization-based method, and present performance results demonstrating that our method significantly reduces the number of collisions and increases the success rate while causing fewer deadlocks. We also show that our method generalises to other traffic scenarios (e.g., an unprotected left turn).
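To make the structure of the framework concrete, below is a minimal sketch of the closed loop the abstract describes: a learned guidance policy proposes a subgoal, and a local planner turns it into a feasible command in receding-horizon fashion. All names here (`guidance_policy`, `mpc_step`) are hypothetical, the policy is a placeholder for the trained Soft Actor-Critic actor, and the planner is a heavily simplified stand-in for the thesis' constrained, collision-avoiding MPC; this is illustrative only, not the author's implementation.

```python
"""Minimal sketch of an RL-guided MPC loop (hypothetical names throughout)."""
import numpy as np

DT = 0.1       # discretisation step [s]
HORIZON = 10   # planning horizon length (number of steps)
V_MAX = 2.0    # assumed velocity bound used to cap the command [m/s]


def guidance_policy(state: np.ndarray) -> np.ndarray:
    """Placeholder for the SAC actor: in the thesis it would map the ego
    state plus observations of surrounding vehicles to a subgoal. Here it
    simply pushes the vehicle 3 m forward along x for demonstration."""
    return state[:2] + np.array([3.0, 0.0])


def mpc_step(pos: np.ndarray, subgoal: np.ndarray) -> np.ndarray:
    """Heavily simplified stand-in for the MPC: steer a 2-D single
    integrator toward the subgoal at a speed that would reach it within
    the horizon, capped at V_MAX. The real planner would additionally
    enforce vehicle dynamics and collision-avoidance constraints."""
    direction = subgoal - pos
    dist = np.linalg.norm(direction)
    if dist < 1e-6:
        return np.zeros(2)
    speed = min(V_MAX, dist / (HORIZON * DT))
    return speed * direction / dist


# Closed-loop rollout: the learned guidance sets the target at every step,
# the local planner executes it (receding-horizon style).
state = np.zeros(2)
for _ in range(50):
    subgoal = guidance_policy(state)   # global guidance (learned)
    u = mpc_step(state, subgoal)       # local, feasible command
    state = state + DT * u             # single-integrator dynamics
print("final position:", state)
```

The split mirrors the design choice in the abstract: the learned policy handles the long-horizon, interaction-aware decision (where to go next), while the optimization-based planner retains responsibility for feasibility and collision avoidance on the short horizon.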

Files

Thesis_Achin_final.pdf
(pdf | 2.77 MB)
License info not available