Understanding the Effects of Discrete Representations in Model-Based Reinforcement Learning

An analysis of the effects of categorical latent space world models in the MinAtar environment


Abstract

While model-free reinforcement learning (MFRL) approaches have proven effective at solving a diverse range of environments, recent developments in model-based reinforcement learning (MBRL) have shown that its greater sample efficiency and generalization abilities can be leveraged to solve highly complex tasks with fewer resources and environment interactions. The introduction of discrete latent states through categorical distributions allowed DreamerV2, an MBRL approach, to surpass the state-of-the-art MFRL Rainbow algorithm on the Arcade Learning Environment. Despite the successes of this approach, it is not yet understood why discretization improves performance. This paper investigates how discretizing the latent space through categorical distributions affects planning performance in a deterministic environment. Further investigations examine the model's generalization abilities and the impact of the latent space's shape on performance. The models are trained in an offline setting, using a dataset of experiences instead of direct interaction with the environment. Results show that the discrete world model underperforms a continuous latent space model while being significantly harder to train. Further experiments indicate that the number of categorical distributions strongly influences performance, and that in the considered setting the discrete world model generalizes better than the continuous baseline, but only at the cost of sacrificing small gains in important metrics.
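
For context, the discrete latents discussed above are typically implemented as a vector of categorical distributions sampled with straight-through gradients, the mechanism DreamerV2 introduced to keep sampling differentiable. The following is a minimal sketch of that sampling step, assuming a PyTorch setting; the function name and tensor shapes are illustrative (32 distributions with 32 classes each mirrors the DreamerV2 configuration) rather than taken from any released code.

```python
import torch
import torch.nn.functional as F

def straight_through_categorical(logits):
    """Sample one-hot categorical latents with straight-through gradients.

    logits: tensor of shape (batch, num_distributions, num_classes).
    Returns one-hot samples of the same shape; the forward pass uses the
    discrete sample, while the backward pass flows through the softmax
    probabilities (straight-through estimator).
    """
    probs = F.softmax(logits, dim=-1)
    dist = torch.distributions.OneHotCategorical(probs=probs)
    sample = dist.sample()
    # Adding probs and subtracting its detached copy leaves the forward
    # value unchanged but routes gradients through the probabilities.
    return sample + probs - probs.detach()

# Example: a latent made of 32 categorical distributions, 32 classes each.
logits = torch.randn(8, 32, 32, requires_grad=True)
latent = straight_through_categorical(logits)  # one-hot per distribution
latent.sum().backward()                        # gradients reach the logits
```

The shape of this latent (the number of distributions and classes per distribution) is exactly the "shape of the latent space" whose impact on performance the paper investigates.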