Multi-agent reinforcement learning via distributed MPC as a function approximator

Journal Article (2024)
Authors

S.H. Mallick (TU Delft - Team Bart De Schutter)

Filippo Airaldi (TU Delft - Team Azita Dabiri)

Azita Dabiri (TU Delft - Team Azita Dabiri)

B. de Schutter (TU Delft - Delft Center for Systems and Control)

Research Group
Team Azita Dabiri
Publication Year
2024
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Volume number
167
DOI:
https://doi.org/10.1016/j.automatica.2024.111803
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This paper presents a novel approach to multi-agent reinforcement learning (RL) for linear systems with convex polytopic constraints. Existing work on RL has demonstrated the use of model predictive control (MPC) as a function approximator for the policy and value functions. The present work is the first to extend this idea to the multi-agent setting. We propose the use of a distributed MPC scheme as a function approximator, with a structure that allows for distributed learning and deployment. We then show that Q-learning updates can be performed in a distributed manner without introducing nonstationarity, by reconstructing a centralized learning update. The effectiveness of the approach is demonstrated on a numerical example.
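To make the abstract's building block concrete, the sketch below illustrates the general idea the paper extends: the optimal cost of a parameterized MPC problem serves as the Q-function approximator, and its cost weights are adjusted by Q-learning. This is a minimal single-agent illustration, not the paper's distributed scheme; the system matrices, horizon, input bound (a stand-in for a polytopic constraint set), the choice of learnable parameters (stage-cost weights q, r), and the finite-difference gradient are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative 1-D linear system x+ = A x + B u; all values are placeholders.
A = np.array([[1.0]])
B = np.array([[0.5]])
N = 5          # MPC prediction horizon
U_BOUND = 1.0  # box constraint |u| <= 1, a simple stand-in for a polytope

def mpc_q_value(theta, s, a):
    """Q_theta(s, a): optimal cost of a horizon-N MPC problem whose stage-cost
    weights theta = (q, r) are the learnable parameters, with the first
    input fixed to the action a (the usual MPC-as-Q-function construction)."""
    q, r = theta

    def cost(u_tail):
        u = np.concatenate(([a], u_tail))
        x, total = np.array([s]), 0.0
        for k in range(N):
            total += q * float(x @ x) + r * u[k] ** 2
            x = A @ x + B[:, 0] * u[k]
        return total + q * float(x @ x)  # terminal cost

    res = minimize(cost, np.zeros(N - 1),
                   bounds=[(-U_BOUND, U_BOUND)] * (N - 1))
    return res.fun

def q_learning_step(theta, s, a, stage_cost, s_next,
                    alpha=1e-3, gamma=0.9, eps=1e-4):
    """One semi-gradient TD(0) update of theta. dQ/dtheta is approximated by
    finite differences to keep the sketch dependency-free; sensitivity-based
    gradients of the MPC solution are typical in this literature."""
    a_grid = np.linspace(-U_BOUND, U_BOUND, 21)   # coarse greedy action search
    q_next = min(mpc_q_value(theta, s_next, an) for an in a_grid)
    q_sa = mpc_q_value(theta, s, a)
    delta = stage_cost + gamma * q_next - q_sa    # TD error (cost convention)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        bumped = theta.copy()
        bumped[i] += eps
        grad[i] = (mpc_q_value(bumped, s, a) - q_sa) / eps
    return theta + alpha * delta * grad           # gradient step on 0.5*delta**2

# Example usage with arbitrary numbers:
theta = np.array([1.0, 0.1])  # initial (q, r) guess
theta = q_learning_step(theta, s=0.5, a=0.2, stage_cost=0.3, s_next=0.6)
```

In the multi-agent setting the paper addresses, each agent would instead hold a local parameterized MPC problem within a distributed MPC scheme, and the abstract's key claim is that the agents' local Q-learning updates can be combined to reconstruct the centralized update, avoiding the nonstationarity that fully independent learners would introduce.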

Files

1-s2.0-S0005109824002978-main.... (pdf | 1 MB)
Embargo expired on 22-12-2024
License info not available