
J.W. Böhmer

27 records found

Over the past decade, model-based reinforcement learning (MBRL) has become a leading approach for solving complex decision-making problems. A prominent algorithm in this domain is MuZero, which integrates Monte Carlo Tree Search (MCTS) with deep neural networks and a latent world ...
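The selection rule at the heart of MuZero-style MCTS can be illustrated with a simplified pUCT score; the constant, values, and function name below are illustrative, not taken from the thesis:

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.25):
    """Simplified pUCT score used by AlphaZero/MuZero-style planners:
    exploitation term Q plus a prior-weighted exploration bonus that
    shrinks as a child is visited more often."""
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u

# a well-visited, high-value child vs an unvisited child with a strong prior
s1 = puct_score(q=0.6, prior=0.2, parent_visits=50, child_visits=10)
s2 = puct_score(q=0.0, prior=0.7, parent_visits=50, child_visits=0)
# the unexplored high-prior child gets the larger score, steering
# the search toward actions the policy network considers promising
```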

Analyzing Plasticity Through Utility Scores

Comparing Continual Learning Algorithms via Utility Score Distributions

One of the central problems in continual learning is the loss of plasticity: the model's inability to learn new tasks. Several approaches have previously been proposed, such as Continual Backpropagation (CBP). This algorithm uses utility scores, which represent how usefu ...
Continual Backpropagation (CBP) has recently been proposed as an effective method for mitigating loss of plasticity in neural networks trained in continual learning (CL) settings. While extensive experiments have been conducted to demonstrate the algorithm's ability to mitigate l ...
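The utility-score mechanism these abstracts refer to can be sketched as follows. This is a minimal illustration, not CBP's exact update: the utility definition here is a toy proxy (activation magnitude times outgoing weight magnitude), and all names and sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def cbp_reset_step(W_in, W_out, activations, replacement_rate=0.01):
    """One simplified CBP maintenance step: score each hidden unit by a
    toy utility and reinitialize the lowest-utility fraction."""
    # utility per hidden unit: |mean activation| * outgoing weight mass
    utility = np.abs(activations).mean(axis=0) * np.abs(W_out).sum(axis=1)
    n_reset = max(1, int(replacement_rate * len(utility)))
    low = np.argsort(utility)[:n_reset]          # least useful units
    W_in[:, low] = rng.normal(0, 0.1, size=(W_in.shape[0], len(low)))
    W_out[low, :] = 0.0                          # new units start inert
    return low

# toy network: 4 inputs -> 8 hidden -> 2 outputs, batch of 32
W_in = rng.normal(0, 0.1, size=(4, 8))
W_out = rng.normal(0, 0.1, size=(8, 2))
h = np.maximum(0, rng.normal(size=(32, 4)) @ W_in)   # ReLU activations
reset = cbp_reset_step(W_in, W_out, h, replacement_rate=0.25)
print(reset)  # indices of the two reinitialized hidden units
```

Zeroing the outgoing weights of reset units means they do not disturb the network's current function until gradient descent recruits them.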

Layerwise Perspective into Continual Backpropagation

Replacing the First Layer is All You Need

Continual learning faces a problem known as plasticity loss, where models gradually lose the ability to adapt to new tasks. We investigate Continual Backpropagation (CBP) – a method that tackles plasticity loss by constantly resetting a small fraction of low-utility neurons. We ...
Deep learning systems are typically trained in static environments and fail to adapt when faced with a continuous stream of new tasks. Continual learning addresses this by allowing neural networks to learn sequentially without forgetting prior knowledge. However, such models ofte ...

Maintaining Plasticity for Deep Continual Learning

Activation Function-Adapted Parameter Resetting Approaches

Standard deep learning tools, in particular feed-forward artificial neural networks and the backpropagation algorithm, fail to adapt to sequential learning scenarios, where the model is continuously presented with new training data. Many algorithms that aim to solve this probl ...
AlphaZero and its successors employ learned value and policy functions to enable more efficient and effective planning at deployment. A standard assumption is that the agent will be deployed in the same environment where these estimators were trained; changes to the environment w ...
This thesis introduces a novel sparsity-regularized transformer to be used as a world model in model-based reinforcement learning, specifically targeting environments with sparse interactions. Sparse-interactive environments are a class of environments where the state can be deco ...
The application of multi-robot systems has gained popularity in recent years. Multi-robot systems show great potential for scaling up robotic applications in surveillance, monitoring, and exploration. Although single robots can already be used to automate search and rescue, and ...
The research in this thesis falls within the realm of optimization under uncertainty, a crucial area in computer science and mathematics with broad applications in power systems, finance, machine learning, healthcare, and more. This thesis presents three main contributions across ...
In reinforcement learning, the ability to generalize to unseen situations is pivotal to an agent's success. This thesis introduces two novel methods that aim to enhance an agent's ability to generalize. Both methods rely on the idea that the diversity of a re ...
Recent advancements in differential simulators offer a promising approach to enhancing the sim2real transfer of reinforcement learning (RL) agents by enabling the computation of gradients of the simulator’s dynamics with respect to its parameters. However, the application of thes ...
Over the last decade, there have been significant advances in model-based deep reinforcement learning. One of the most successful such algorithms is AlphaZero, which combines Monte Carlo Tree Search with deep learning. AlphaZero and its successors commonly describe a unified frame ...

Reward Based Program Synthesis for Minecraft

Adapting Program Synthesizers for Reward Evaluation and Leveraging Discovered Programs

Program synthesis is the task of constructing a program that provably satisfies a given high-level specification. There are various ways in which a specification can be described. This research focuses on adapting the Probe synthesizer, traditionally reliant on input-output examples ...
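The input-output-example setting that Probe traditionally targets can be illustrated with a minimal enumerative search over a tiny arithmetic grammar; the spec, grammar, and constant ranges below are hypothetical, and this is not Probe's actual algorithm:

```python
from itertools import product

# hypothetical specification by input-output examples: f(x) = 2*x + 1
examples = [(1, 3), (2, 5), (5, 11)]

def satisfies(f):
    """A candidate program is accepted only if it matches every example."""
    return all(f(x) == y for x, y in examples)

# candidate programs: a*x + b for small integer constants a, b
for a, b in product(range(-3, 4), repeat=2):
    if satisfies(lambda x, a=a, b=b: a * x + b):
        print(f"{a}*x + {b}")   # prints "2*x + 1"
        break
```

A reward-based adaptation, as in this research, would replace the all-or-nothing `satisfies` check with a score that can rank partially correct programs.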

DMQL: Deep Maximum Q-Learning

Combatting Relative Overgeneralisation in Deep Independent Learners using Optimism and Similarity

Various pathologies can occur when independent learners are used in cooperative Multi-Agent Reinforcement Learning. One such pathology is Relative Overgeneralisation, which manifests when a suboptimal Nash Equilibrium in the joint action space of a problem is preferred over an op ...
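Relative overgeneralisation can be made concrete with the classic climbing game: each independent learner averages its returns over the partner's exploratory actions, which makes a safe but suboptimal action look best. The payoff matrix below is the standard climbing-game example, not taken from the thesis:

```python
import numpy as np

# Climbing game: the optimal joint action is (0, 0) with payoff 11,
# but miscoordination around it is heavily punished (-30).
payoff = np.array([
    [ 11, -30,   0],
    [-30,   7,   6],
    [  0,   0,   5],
])

# An independent learner facing a uniformly exploring partner sees
# these expected returns for its own three actions:
expected = payoff.mean(axis=1)
print(expected.argmax())   # prints 2: the suboptimal action wins on average
```

This is exactly the pathology: the Nash equilibrium around action 2 is preferred because averaging drowns out the high-payoff but risky optimum, which optimistic value estimates aim to counteract.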
Deep model-based reinforcement learning has shown state-of-the-art, human-exceeding performance in many challenging domains.
However, low sample efficiency and limited exploration remain leading obstacles in the field.
In this work, we incorporate epistemic uncertain ...
This paper compares the generalization capability of multi-head attention (MHA) models with that of convolutional neural networks (CNNs). This is done by comparing their performance on out-of-distribution data. The dataset that is used to train both models is created by coupling dig ...
Most deep learning models fail to generalize in production, because the data used during training does not always completely reflect the deployment environment. The test data is then considered out-of-distribution compared to the training data. In this paper, we focus on out-of-dist ...
The use of Transformers outside the realm of natural language processing is becoming more and more prevalent. On datasets such as CIFAR-100, the Transformer has already been shown to perform just as well as the much more established convolutional neural network (CNN). Th ...