Active inference is a neuroscientific theory stating that all living systems (e.g. the human brain) minimize a quantity termed the free energy. By minimizing this free energy, living systems maintain an accurate representation of the world in their internal model (learning), are provided with an optimal way of acting on the world (action selection), and can predict incoming sensory data (perception). Since these properties are also sought after in artificial intelligence systems, active inference has become an interesting topic from an engineering point of view as well. Active inference can be applied within either a continuous or a discrete state-space framework. However, research on discrete state-space active inference has neglected to extend its applicability to nonstationary environments. This work aims to fill that gap. More specifically, the goal of this research is to evaluate the performance of state-of-the-art discrete state-space active inference agents in nonstationary environments, and to assess whether forgetting part of the agent's previous experiences can increase its performance. The type of nonstationarity used in this work is cyclostationarity, which manifests only in the transition process of the active inference task. The specific task solved is one of planning and navigation in a gridworld; performance is therefore quantified by the number of steps the agent needs to take in order to reach its goal. Three methods of forgetting are implemented and compared, inspired by techniques from reinforcement learning, deep learning, and time series analysis, respectively.
These are: (1) the use of a constant forget rate, (2) the use of the updating mechanism of a long short-term memory (LSTM) cell applied to the updating of the generative model in active inference, and (3) the use of a memory window that stores experience only from a certain trial onwards and forgets earlier experience by means of a rolling summation of the concentration parameters. The results show that forgetting with a memory window can significantly improve performance, provided that the agent can reach the goal state from the initial state within one trial. When this is not the case, the memory window does not influence performance, positively or negatively. Both the LSTM-based implementation of forgetting and the use of a constant forget rate have consistently been shown to decrease performance, and thus should not be implemented in active inference.
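The three forgetting schemes can be illustrated on the Dirichlet concentration parameters that a discrete-state active inference agent accumulates for its transition model. The sketch below is a minimal, hypothetical illustration, not the thesis implementation: the gating form in scheme (2), the forget rate in scheme (1), and the window length in scheme (3) are all assumed for the example.

```python
import numpy as np

# Hypothetical trial-wise transition counts for a 3-state model.
rng = np.random.default_rng(0)
n_states, n_trials = 3, 10
prior = np.ones((n_states, n_states))            # flat Dirichlet prior
counts = [rng.integers(0, 5, (n_states, n_states)) for _ in range(n_trials)]

# (1) Constant forget rate: decay accumulated evidence before each update.
forget_rate = 0.8                                 # assumed value
a_forget = prior.copy()
for c in counts:
    a_forget = forget_rate * a_forget + c

# (2) LSTM-style gated update: a forget gate weights old against new
# evidence, loosely analogous to the LSTM cell-state update
# (the gating function here is an illustrative assumption).
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a_lstm = prior.copy()
for c in counts:
    f = sigmoid(a_lstm - c)                       # hypothetical forget gate
    a_lstm = f * a_lstm + (1.0 - f) * c           # input gate = 1 - f

# (3) Memory window: rolling summation over only the last `window` trials,
# discarding all counts from before the window.
window = 3                                        # assumed window length
a_window = prior + sum(counts[-window:])
```

In each case the resulting array plays the role of the concentration parameters of the Dirichlet posterior over transition probabilities; the schemes differ only in how much of the older evidence survives into that posterior.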