The Effects of Entropy Regularization and Lyapunov Stability Constraint on Multi-Agent Reinforcement Learning for Autonomous Driving

Master Thesis (2022)
Author(s)

M. Madi (TU Delft - Mechanical Engineering)

Contributor(s)

Wei Pan – Mentor (TU Delft - Robot Dynamics)

Faculty
Mechanical Engineering
Copyright
© 2022 Mohamed Madi
Publication Year
2022
Language
English
Graduation Date
31-08-2022
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

High-level decision making in Autonomous Driving (AD) is a challenging task due to the presence of multiple actors and complex driving interactions. Multi-Agent Reinforcement Learning (MARL) has been proposed to learn multiple driving policies concurrently to solve AD tasks. In the literature, multi-agent algorithms have been shown to outperform both single-agent and rule-based algorithms, and several techniques, such as parameter sharing and local reward design, have been employed to facilitate convergence in policy learning. Functional safety in AD has further been addressed with techniques such as unsafe action masking. However, the literature lacks a study of the effects of entropy regularization, and of policies learned with a closed-loop stability guarantee, on AD tasks in MARL. This thesis addresses these gaps by applying entropy regularization and a Lyapunov stability constrained policy objective to Autonomous Driving in MARL. Specifically, it is demonstrated on the lane-keeping task with two agents that entropy regularization improves training stability. It is also shown that, for stochastic multi-agent algorithms on the lane-keeping task under low measurement noise perturbation, a Lyapunov-constrained policy objective performs better in terms of average episode return, success rate, and collision rate than a policy objective without a Lyapunov constraint. However, an algorithm with a stochastic actor performs worse than one with a deterministic actor in terms of stability and lane-center proximity on the lane-keeping task.
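To give a rough sense of the two ingredients studied, the minimal sketch below combines a soft-actor-critic-style entropy-regularized policy loss with a Lagrangian penalty on a Lyapunov decrease condition. This is not the thesis' actual implementation: the function name, the coefficients alpha and lam, the margin eps, and the candidate Lyapunov values are illustrative assumptions only.

    # Hypothetical sketch: entropy-regularized policy loss with a Lyapunov
    # decrease penalty (Lagrangian relaxation). All names are assumptions.
    import torch

    def policy_loss(log_prob, q_value, lyap_next, lyap_now,
                    alpha=0.2, lam=1.0, eps=1e-3):
        """All inputs are per-sample tensors of shape (batch,).

        log_prob  : log pi(a|s) of the sampled action
        q_value   : critic estimate Q(s, a)
        lyap_next : candidate Lyapunov value L(s') at the next state
        lyap_now  : candidate Lyapunov value L(s) at the current state
        alpha     : entropy temperature (entropy regularization weight)
        lam       : Lagrange multiplier for the Lyapunov constraint
        eps       : required decrease margin of the Lyapunov candidate
        """
        # Entropy-regularized objective: maximize Q + alpha * entropy,
        # i.e. minimize alpha * log_prob - Q.
        sac_term = alpha * log_prob - q_value
        # Lyapunov decrease condition L(s') - L(s) <= -eps,
        # penalized only when violated.
        lyap_violation = torch.relu(lyap_next - lyap_now + eps)
        return (sac_term + lam * lyap_violation).mean()

In practice alpha and lam would be tuned or adapted during training; setting lam to zero recovers the unconstrained entropy-regularized objective that the constrained variant is compared against.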

Files

Thesis_1.pdf
(pdf | 0.845 MB)
License info not available