Title: Modelling Agents with Variational Autoencoders in Multi-Agent Sequential Decision Making
Author: Lenferink, Luc (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributor: Oliehoek, F.A. (mentor); Congeduti, E. (mentor)
Degree granting institution: Delft University of Technology
Programme: Computer Science | Artificial Intelligence Technology
Date: 2023-01-25

Abstract: The ability to model other agents can be of great value in multi-agent sequential decision making problems and has become more accessible due to the introduction of deep learning into reinforcement learning. In this study, the aim is to investigate the usefulness of modelling other agents using variational autoencoder based models in partially observable settings. Previous studies that model other agents using (variational) autoencoders have shown promising results. In these studies, a single protagonist agent learns representations of other agents to then use them as additional components of its observation space which is, as such, augmented with those representations. It is, however, not always entirely clear what is being modelled and what would be the best feature of the other agent to represent. Moreover, in these works, a comparison between the used variational autoencoder based models and a baseline classifier trained to solve the same classification task is missing. This study investigates which features can best be used for the augmentation of the observations of deep reinforcement learning agents and if these features can be represented by variational autoencoder based models. Subsequently, it compares these models with a baseline classifier that solves the same classification problem to find out which model yields the best results when used for augmenting observations.
Overall, the results suggest that it is beneficial to augment the observations of deep reinforcement learning agents with features related to other agents learned in a pre-training phase. Another interesting result is that the baseline classifier achieves similar or better performance compared to the variational autoencoder based model. Further research needs to be conducted to confirm the soundness of these findings.

Subject: Agent Modelling; Variational Autoencoder (VAE); Multi-Agent Sequential Decision Making; Reinforcement Learning (RL); Partial Observability
To reference this document use: http://resolver.tudelft.nl/uuid:627aa817-cbb8-4c41-9b8c-3532476bfaee
Part of collection: Student theses
Document type: master thesis
Rights: © 2023 Luc Lenferink
Files: PDF Thesis_Luc_Lenferink_17012023.pdf (3.67 MB)