Tarun Gupta

Conference paper (1)

1 record found

UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning

Conference paper (2021) - Tarun Gupta (author), Anuj Mahajan (author), Bei Peng (author), J.W. Böhmer (author), Shimon Whiteson (author)

VDN and QMIX are two popular value-based algorithms for cooperative MARL that learn a centralized action value function as a monotonic mixing of per-agent utilities. While this enables easy decentralization of the learned policy, the restricted joint action value function can pre ...