Improvements in Imitation Learning for Overcooked

Bachelor Thesis (2023)
Authors

D.P. Niemantsverdriet (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Supervisors

Robert Loftin (TU Delft - Interactive Intelligence)

F.A. Oliehoek (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2023 Duuk Niemantsverdriet
Publication Year
2023
Language
English
Graduation Date
28-06-2023
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Related content

GitHub repository which contains all the code that I worked with.

https://github.com/DuukPN/overcooked_ai
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Arguably the main goal of artificial intelligence is to create agents that can collaborate with humans to achieve a shared goal. It has been shown that agents that assume their partner to be optimal can converge to protocols that humans do not understand. Taking human suboptimality into account is therefore imperative for performing well in a coordination task. One way to achieve this is imitation learning, in which an agent is trained on recorded data from a human playing optimally. I created several agents using different implementations of behavioral cloning, each of which reduces the dataset to state-action pairs and trains a neural network on those pairs. To evaluate their performance, I used an environment that poses a coordination challenge based on the popular game Overcooked. Neither expanding nor reducing the feature space that the agents are trained on yielded a significant improvement in their performance. In fact, expanding the feature space to include some historical data made the agent less generalizable: it performed especially poorly when paired with agents that use unfamiliar strategies. These limitations were mostly imposed by the available dataset, which was not large enough to support more features and whose gameplay was of too low quality to produce agents that perform exceptionally well.
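
As a rough illustration of the behavioral cloning setup described in the abstract, the sketch below flattens recorded trajectories into state-action pairs, optionally appends earlier states as "historical" features, and fits a policy by minimizing cross-entropy. It is not the code from the thesis repository: the function names (`flatten_demos`, `add_history`, `train_bc`, `act`) are hypothetical, the single-layer softmax policy is a deliberately small stand-in for the neural network, and the toy states are invented for the example.

```python
import math
import random


def flatten_demos(trajectories):
    """Reduce recorded trajectories to a flat list of (state, action) pairs."""
    return [(s, a) for traj in trajectories for (s, a) in traj]


def add_history(trajectories, k):
    """Append the k previous states to each state (zero-padded at episode start)."""
    expanded = []
    for traj in trajectories:
        dim = len(traj[0][0])
        new_traj = []
        for t, (s, a) in enumerate(traj):
            feats = list(s)
            for back in range(1, k + 1):
                prev = traj[t - back][0] if t - back >= 0 else [0.0] * dim
                feats += list(prev)
            new_traj.append((feats, a))
        expanded.append(new_traj)
    return expanded


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def logits_for(policy, s):
    W, b = policy
    return [sum(w * x for w, x in zip(row, s)) + bias for row, bias in zip(W, b)]


def train_bc(pairs, n_actions, epochs=200, lr=0.5, seed=0):
    """Fit a softmax policy to (state, action) pairs with plain SGD.

    Assumption: a linear layer replaces the neural network for brevity.
    """
    rng = random.Random(seed)
    dim = len(pairs[0][0])
    W = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(n_actions)]
    b = [0.0] * n_actions
    for _ in range(epochs):
        for s, a in pairs:
            probs = softmax(logits_for((W, b), s))
            for j in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit j is p_j - 1[j == a].
                g = probs[j] - (1.0 if j == a else 0.0)
                for i in range(dim):
                    W[j][i] -= lr * g * s[i]
                b[j] -= lr * g
    return W, b


def act(policy, s):
    """Pick the action the cloned policy considers most likely."""
    logits = logits_for(policy, s)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy demonstration data: one trajectory of (state, action) steps.
demos = [[([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 0)]]
policy = train_bc(flatten_demos(demos), n_actions=2)
policy_with_history = train_bc(flatten_demos(add_history(demos, k=1)), n_actions=2)
```

Note that `add_history` multiplies the input dimension by `k + 1` while the number of training pairs stays the same, which is one way to see why a small dataset may not support the expanded feature space.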

Files

Research_Project.pdf
(pdf | 0.634 Mb)
License info not available