Robust Event-Driven Interactions in Cooperative Multi-agent Learning

Abstract

We present an approach to safely reduce the communication required between agents in a Multi-Agent Reinforcement Learning system by exploiting the inherent robustness of the underlying Markov Decision Process. We compute robustness surrogate functions offline that give each agent a conservative indication of how far its state measurements can deviate before it must update the other agents in the system with new measurements. This yields fully distributed decision functions, enabling each agent to decide locally when it is necessary to communicate its state variables. We derive bounds on the optimality of the resulting system, measured by the discounted sum of rewards obtained, and show that these bounds are a function of the design parameters. We further extend the results to the case where the robustness surrogate functions are learned from data, and present experimental results demonstrating a significant reduction in communication events between agents.
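
The sketch below illustrates the kind of distributed decision function described above: an agent rebroadcasts its state only when the deviation from its last communicated state exceeds a conservative threshold given by a robustness surrogate function. This is a minimal, hypothetical illustration, not the paper's implementation; the class, the Euclidean deviation measure, and the constant surrogate bound in the usage example are assumptions made for clarity.

```python
import numpy as np


class EventTriggeredAgent:
    """Illustrative agent that broadcasts its state only when the deviation
    from the last communicated state exceeds a robustness surrogate bound.
    Names and the deviation measure are assumptions, not the paper's API."""

    def __init__(self, surrogate, initial_state):
        # `surrogate` maps the last broadcast state to a conservative
        # deviation bound (the robustness surrogate value, assumed to be
        # computed offline or learned from data).
        self.surrogate = surrogate
        self.last_broadcast = np.asarray(initial_state, dtype=float)

    def should_communicate(self, current_state):
        # Trigger an update when the measured deviation exceeds the
        # conservative bound associated with the last broadcast state.
        deviation = np.linalg.norm(
            np.asarray(current_state, dtype=float) - self.last_broadcast
        )
        return deviation > self.surrogate(self.last_broadcast)

    def step(self, current_state):
        if self.should_communicate(current_state):
            self.last_broadcast = np.asarray(current_state, dtype=float)
            return True   # broadcast the new measurement to other agents
        return False      # keep acting on previously shared information


# Usage example with a hypothetical constant surrogate bound of 0.5.
agent = EventTriggeredAgent(surrogate=lambda s: 0.5, initial_state=[0.0, 0.0])
print(agent.step([0.1, 0.2]))  # False: deviation ~0.22, within the bound
print(agent.step([0.6, 0.3]))  # True: deviation ~0.67 exceeds the bound
```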