The Effects of Entropy Regularization and Lyapunov Stability Constraint on Multi-Agent Reinforcement Learning for Autonomous Driving

Author: Madi, Mohamed (TU Delft Mechanical, Maritime and Materials Engineering)
Contributor: Pan, W. (mentor)
Degree granting institution: Delft University of Technology
Programme: Mechanical Engineering
Date: 2022-08-31

Abstract: High-level decision making in Autonomous Driving (AD) is a challenging task due to the presence of multiple actors and complex driving interactions. Multi-Agent Reinforcement Learning (MARL) has been proposed to learn multiple driving policies concurrently to solve AD tasks. In the literature, multi-agent algorithms have been shown to outperform single-agent algorithms and rule-based algorithms. Several techniques, such as parameter sharing and local reward design, have also been employed to facilitate convergence in policy learning. Further, functional safety in AD has been addressed with techniques such as unsafe action masking. However, the literature lacks studies of the effects of entropy regularization, and of policies learned with a closed-loop stability guarantee, on AD tasks in MARL. This thesis addresses these gaps by studying entropy regularization and Lyapunov-stability-constrained policy objectives applied to AD in MARL. Specifically, it is demonstrated on the lane-keeping task with 2 agents that entropy regularization improves training stability. It is also shown that, for stochastic multi-agent algorithms on the lane-keeping task under low measurement noise perturbation, a Lyapunov-constrained policy objective achieves better average episode returns, success rate, and collision rate than a policy objective without a Lyapunov constraint.
However, an algorithm with a stochastic actor performs worse than one with a deterministic actor in terms of stability and lane-center proximity on the lane-keeping task.

Subject: Multi-agent reinforcement learning; Autonomous driving; Entropy regularization; Stability
To reference this document use: http://resolver.tudelft.nl/uuid:743d8142-9d59-4c97-a8da-2c0d734c8ebc
Part of collection: Student theses
Document type: master thesis
Rights: © 2022 Mohamed Madi
Files: Thesis_1.pdf (PDF, 864.83 KB)