Cooperative Planning and Control for Connected and Automated Vehicles' On-ramp Merging in Mixed Traffic Through Value Decomposition-based Multiagent Deep Reinforcement Learning

Master Thesis (2024)

Author(s)

Y. Zhang (TU Delft - Civil Engineering & Geosciences)

Contributor(s)

H. Farah – Mentor (TU Delft - Traffic Systems Engineering)

M. Rinaldi – Graduation committee member (TU Delft - Traffic Systems Engineering)

B. Shyrokau – Graduation committee member (TU Delft - Intelligent Vehicles)

C. Evans – Graduation committee member (TU Delft - Traffic Systems Engineering)

Y. Dong – Graduation committee member (TU Delft - Traffic Systems Engineering)

Faculty

Civil Engineering & Geosciences

To reference this document use:

https://resolver.tudelft.nl/uuid:6c59ac02-e110-434b-9e93-3999c23dbcf7

Publication Year

2024

Language

English

Graduation Date

22-11-2024

Awarding Institution

Delft University of Technology

Programme

Transport, Infrastructure and Logistics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Connected and Automated Vehicles (CAVs) have the potential to revolutionize transportation systems, but their integration with human-driven vehicles (HDVs) in mixed traffic environments presents significant challenges, particularly in complex scenarios such as on-ramp merging. This paper addresses on-ramp merging for CAVs in mixed traffic and proposes a novel approach called QMIX-QLambdaM. We formulate the problem as a cooperative Multi-Agent Reinforcement Learning (MARL) task under Centralized Training with Decentralized Execution (CTDE), capable of handling dynamic scenarios with both CAVs and HDVs. QMIX-QLambdaM enhances the QMIX algorithm by incorporating Q(λ) returns for improved value estimation and an action masking mechanism for safer action selection. Our comprehensive experiments demonstrate that QMIX-QLambdaM consistently outperforms state-of-the-art algorithms, including QMIX, MAA2C, and COMA, across various performance metrics related to traffic efficiency and safety. The proposed method exhibits superior adaptability across different traffic densities, maintaining high performance in terms of safety, efficiency, and overall rewards. Furthermore, case studies illustrate QMIX-QLambdaM’s ability to generate effective strategic control for both main-lane and merging-lane vehicles, showcasing smoother driving behavior and better collision avoidance compared to baseline methods. The learning curve comparison also reveals QMIX-QLambdaM’s advantage in credit assignment compared to other CTDE baselines for the formulated problem. The code is available at https://github.com/ayton-zhang/MARL_qmix_merging.
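To make the two ingredients named in the abstract more concrete, the following is a minimal Python sketch of (a) greedy action selection restricted by an action mask and (b) a generic λ-return computed by backward recursion. It is an illustration under stated assumptions, not the thesis implementation: the function names `masked_greedy_action` and `lambda_returns`, the default values of `gamma` and `lam`, and the exact form of the recursion are all hypothetical, and the Q(λ) variant and masking rules actually used in QMIX-QLambdaM may differ.

```python
from typing import List, Sequence


def masked_greedy_action(q_values: Sequence[float], mask: Sequence[bool]) -> int:
    """Return the index of the highest-valued action among those the mask allows.

    mask[i] is True when action i is considered safe (an assumed convention).
    """
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("the mask excludes every action")
    return max(allowed, key=lambda i: q_values[i])


def lambda_returns(
    rewards: Sequence[float],
    next_values: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor
    lam: float = 0.8,     # assumed trace parameter
) -> List[float]:
    """Generic lambda-return via the backward recursion

        G_t = r_t + gamma * ((1 - lam) * V_{t+1} + lam * G_{t+1}),

    where next_values[t] is the bootstrap estimate for the state reached
    after step t (pass 0.0 for a terminal state). In a Q-learning setting
    this estimate would typically be a max over target Q-values.
    """
    returns = [0.0] * len(rewards)
    g = next_values[-1]  # beyond the last step, fall back to the bootstrap value
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * g)
        returns[t] = g
    return returns
```

With `lam=0.0` the recursion reduces to one-step bootstrapped targets, and with `lam=1.0` it reduces to the discounted sum of observed rewards plus the final bootstrap value; intermediate values blend the two.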

Files

Yuteng_thesis.pdf

(pdf | 4.1 MB)

License info not available