Decentralized Reinforcement Learning of robot behaviors

Journal Article (2018)
Author(s)

David L. Leottau (Universidad de Santiago de Chile)

Javier Ruiz-del-Solar (Universidad de Santiago de Chile)

R. Babuska (TU Delft - Learning & Autonomous Control, Czech Technical University)

Research Group
Learning & Autonomous Control
DOI
https://doi.org/10.1016/j.artint.2017.12.001
Publication Year
2018
Language
English
Volume number
256
Pages (from-to)
130-159
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A multi-agent methodology is proposed for Decentralized Reinforcement Learning (DRL) of individual behaviors in problems with multi-dimensional action spaces. Under this methodology, sub-tasks are learned in parallel by individual agents working toward a common goal. In addition to the methodology itself, three specific multi-agent DRL approaches are considered: DRL-Independent, DRL Cooperative-Adaptive (DRL-CA), and DRL-Lenient. These approaches are validated and analyzed through an extensive empirical study of four different problems: 3D Mountain Car, SCARA Real-Time Trajectory Generation, Ball-Dribbling with humanoid soccer robots, and Ball-Pushing with differential-drive robots. The experimental validation provides evidence that DRL implementations achieve better performance and faster learning than their centralized counterparts while using fewer computational resources. The DRL-Lenient and DRL-CA algorithms attain the best final performance on all four problems, outperforming their DRL-Independent counterparts. Furthermore, the benefits of DRL-Lenient and DRL-CA become more noticeable as problem complexity increases and the centralized scheme becomes intractable given the available computational resources and training time.
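The core idea behind the decentralized scheme can be illustrated with a toy sketch (not taken from the paper): instead of one centralized learner over the joint action space, each action dimension gets its own independent tabular Q-learner, all observing the shared state and the same global reward, in the spirit of the DRL-Independent approach. The 2D grid task, the agent class, and all parameter values below are illustrative assumptions, not the paper's benchmarks.

```python
import random

class QAgent:
    """Independent tabular Q-learner controlling ONE action dimension."""
    def __init__(self, actions, alpha=0.5, gamma=0.95, eps=0.1):
        self.q = {}                      # (state, action) -> value
        self.actions = actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, s):
        if random.random() < self.eps:   # epsilon-greedy exploration
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((s, a), 0.0))

    def update(self, s, a, r, s2):
        best = max(self.q.get((s2, b), 0.0) for b in self.actions)
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (r + self.gamma * best - old)

def run_episode(agents, start=(0, 0), goal=(4, 4), steps=50):
    """Toy 2D grid task: each agent moves the state along one axis."""
    s = start
    for _ in range(steps):
        acts = tuple(ag.act(s) for ag in agents)   # joint action from independent choices
        s2 = tuple(min(4, max(0, si + ai)) for si, ai in zip(s, acts))
        r = 1.0 if s2 == goal else -0.01           # one global reward for all agents
        for ag, a in zip(agents, acts):
            ag.update(s, a, r, s2)                 # each agent learns over its own actions only
        s = s2
        if s == goal:
            break
    return s

random.seed(0)
agents = [QAgent([-1, 0, 1]) for _ in range(2)]    # one learner per action dimension
for _ in range(300):
    run_episode(agents)
```

Note the resource argument from the abstract in miniature: each agent's table grows with 3 actions per state, so two agents store 2 x 3 entries per state, whereas a centralized learner over the joint action space would need 3^2 = 9 entries per state; the gap widens with more action dimensions.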

Files

MAS_DRL.pdf
(pdf | 6.87 MB)
Embargo expired on 22-12-2019