Smart Team Play: Utility of Population-Based Training for Cooperative AI in Overcooked
J.M. Moreira-Kanaley (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Robert Loftin – Mentor (TU Delft - Interactive Intelligence)
F.A. Oliehoek – Mentor (TU Delft - Interactive Intelligence)
Sicco Verwer – Graduation committee member (TU Delft - Cyber Security)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
In an ad-hoc teamwork environment, artificial intelligence agents have the potential to take on supportive roles and complete tasks in collaboration with human players. This paper investigates the use of population-based training (PBT) for reinforcement learning agents in the multi-player game Overcooked. It also examines whether adding highly mutated agents, which introduce noise into the initial population, can improve the final performance of PBT. To answer these questions, the learning curve of a selected PBT agent is evaluated first, and its final performance when paired with a human proxy is then examined across different layouts of the game. The results indicate that PBT and other self-play agents tend to underperform drastically compared with the human proxy and with agents trained on human data. Furthermore, while incorporating the mutated agents increased sample efficiency in layouts with a low risk of collisions, it had a negligible effect on the final performance of PBT with the human proxy.
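As a rough illustration of the two ideas named in the abstract, the sketch below shows a generic PBT exploit-and-explore step and an initial population seeded with a few strongly perturbed members. It is not the thesis implementation: the function names `init_population` and `pbt_step`, the dictionary layout, the choice of learning rate as the mutated hyperparameter, and the perturbation ranges are all assumptions made for illustration.

```python
import copy
import random


def init_population(size, base_lr, n_mutated=0, rng=None):
    """Create `size` agents; the last `n_mutated` start with a strongly perturbed lr."""
    rng = rng or random.Random(0)
    population = []
    for i in range(size):
        lr = base_lr
        if i >= size - n_mutated:
            # Assumed form of a "highly mutated" agent: scale lr by up to 10x either way.
            lr = base_lr * 10 ** rng.uniform(-1.0, 1.0)
        population.append({"lr": lr, "weights": [0.0], "score": 0.0})
    return population


def pbt_step(population, frac=0.25, rng=None):
    """Exploit: the worst agents copy a top agent. Explore: perturb the copied lr."""
    rng = rng or random.Random(0)
    ranked = sorted(population, key=lambda a: a["score"], reverse=True)
    k = max(1, int(len(ranked) * frac))
    top, bottom = ranked[:k], ranked[-k:]
    for agent in bottom:
        parent = rng.choice(top)
        agent["weights"] = copy.deepcopy(parent["weights"])  # exploit
        agent["lr"] = parent["lr"] * rng.choice([0.8, 1.2])  # explore
    return population
```

In a full training loop, each agent would be trained and scored between calls to `pbt_step`; here the scores are simply whatever the caller has written into each agent's `"score"` field.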