Difference Rewards Policy Gradients

Castellini, Jacopo; Oliehoek, F.A.; Devlin, Sam; Savani, Rahul

Title

Difference Rewards Policy Gradients

Author

Castellini, Jacopo (University of Liverpool)
Oliehoek, F.A. (TU Delft Interactive Intelligence)
Devlin, Sam (Microsoft Research Cambridge)
Savani, Rahul (University of Liverpool)

Date

2021

Abstract

Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning. A key challenge, however, that is not addressed by many of these methods is multi-agent credit assignment: assessing an agent’s contribution to the overall performance, which is crucial for learning good policies. We propose a novel algorithm called Dr.Reinforce that explicitly tackles this by combining difference rewards with policy gradients to allow for learning decentralized policies when the reward function is known. By differencing the reward function directly, Dr.Reinforce avoids difficulties associated with learning the Q-function as done by Counterfactual Multiagent Policy Gradients (COMA), a state-of-the-art difference rewards method. For applications where the reward function is unknown, we show the effectiveness of a version of Dr.Reinforce that learns a reward network that is used to estimate the difference rewards.
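
As a rough illustration of the credit-assignment idea in the abstract, the Python sketch below computes a difference reward for each agent from a known team reward. Everything in it is an assumption made for illustration and is not taken from the paper: the toy reward `team_reward`, the uniform policies, the stateless setting, and the choice of baseline (the expected team reward when one agent's action is resampled from its own policy, which is one common form of difference reward).

```python
def team_reward(joint_action):
    """Hypothetical known team reward: number of distinct targets covered."""
    return float(len(set(joint_action)))


def difference_reward(reward_fn, joint_action, agent, policy_probs):
    """Team reward minus the expected team reward obtained when `agent`'s
    action is replaced by one drawn from its own policy (assumed baseline)."""
    baseline = 0.0
    for alt_action, prob in enumerate(policy_probs):
        counterfactual = list(joint_action)
        counterfactual[agent] = alt_action  # swap in the alternative action
        baseline += prob * reward_fn(tuple(counterfactual))
    return reward_fn(tuple(joint_action)) - baseline


# Three agents, two targets; agents 0 and 1 both pick target 0, agent 2 picks target 1.
joint_action = (0, 0, 1)
uniform = [0.5, 0.5]  # hypothetical policy: each agent picks either target equally often
diffs = [difference_reward(team_reward, joint_action, i, uniform) for i in range(3)]
print(diffs)  # [0.0, 0.0, 0.5]
```

In this toy case only agent 2 receives a positive signal, because it is the only agent whose action changes the team reward. In a policy gradient method, such a per-agent signal would stand in for the shared team reward when weighting each agent's log-probability gradient.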

Subject

Multi-Agent Reinforcement Learning
Policy Gradients
Difference Rewards
Multi-Agent Credit Assignment
Reward Learning

To reference this document use:

http://resolver.tudelft.nl/uuid:2721bb51-58c3-47f1-a6d2-c18c24bc1f60

Publisher

International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC

ISBN

9781450383073

Source

Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems

Event

20th International Conference on Autonomous Agents and Multiagent Systems, 2021-05-03 → 2021-05-07, Virtual/online event due to COVID-19

Series

AAMAS '21, 2523-5699

Part of collection

Institutional Repository

Document type

conference paper

Files

PDF

p1475.pdf

1.54 MB
