F.A. Oliehoek
In this volume, we are happy to present the post-proceedings of BNAIC/BeNeLearn 2023, the joint conference on Artificial Intelligence and Machine Learning in the BeNeLux, which took place at TU Delft. It is the main regional conference on these topics and has a long tradition: in 20
...
High sample complexity hampers the successful application of reinforcement learning methods, especially in real-world problems where simulating complex dynamics is computationally demanding. Influence-based abstraction (IBA) was proposed to mitigate this issue by breaking down th
...
Policy Space Response Oracles
A Survey
Game theory provides a mathematical way to study the interaction between multiple decision makers. However, classical game-theoretic analysis is limited in scalability due to the large number of strategies, precluding direct application to more complex scenarios. This survey prov
...
Model-based reinforcement learning (MBRL) has drawn considerable interest in recent years, given its promise to improve sample efficiency. Moreover, when using deep-learned models, it is possible to learn compact and generalizable models from data. In this work, we study MuZero,
...
This work investigates formal generalization error bounds that apply to support vector machines (SVMs) in realizable and agnostic learning problems. We focus on recently observed parallels between probably approximately correct (PAC)-learning bounds, such as compression and compl
...
We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms. As MOO is applied to solve diverse problems, approaches for analyzing the trade-offs offered by MOO algorithms are scattered across fie
...
Teacher-apprentices RL (TARL)
Leveraging complex policy distribution through generative adversarial hypernetwork in reinforcement learning
Typically, a Reinforcement Learning (RL) algorithm focuses on learning a single deployable policy as the end product. Depending on the initialization methods and seed randomization, learning a single policy could possibly lead to convergence to different local optima across diff
...
One of the main challenges of multi-agent learning lies in establishing convergence of the algorithms, as, in general, a collection of individual, self-serving agents is not guaranteed to converge with their joint policy, when learning concurrently. This is in stark contrast to m
...
Many methods for model-based reinforcement learning (MBRL) in Markov decision processes (MDPs) provide guarantees for both the accuracy of the model they can deliver and the learning efficiency. At the same time, state abstraction techniques allow for a reduction of the size of a
...
Reinforcement learning agents may sometimes develop habits that are effective only when specific policies are followed. After an initial exploration phase in which agents try out different actions, they eventually converge toward a particular policy. When this occurs, the distrib
...
Model-based reinforcement learning methods are promising since they can increase sample efficiency while simultaneously improving generalizability. Learning can also be made more efficient through state abstraction, which delivers more compact models. Model-based reinforcement le
...
BADDr
Bayes-Adaptive Deep Dropout RL for POMDPs
While reinforcement learning (RL) has made great advances in scalability, exploration and partial observability are still active research topics. In contrast, Bayesian RL (BRL) provides a principled answer to both state estimation and the exploration-exploitation trade-off, but s
...
Constant growth of cities and their rapid urbanization contribute significantly to an increase in traffic congestion, leading to high costs both in terms of time and fuel consumption. Intelligent Transportation Systems (ITSs) play an important role in managing traffic in urban ar
...
Complex real-world systems pose a significant challenge to decision making: an agent needs to explore a large environment, deal with incomplete or noisy information, generalize the experience and learn from feedback to act optimally. These processes demand vast representation cap
...
Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning. A key challenge, however, that is not addressed by many of these methods is multi-agent credit assignment: assessing an agent’s contribution to the overall pe
...
How can we plan efficiently in a large and complex environment when the time budget is limited? Given the original simulator of the environment, which may be computationally very demanding, we propose to learn online an approximate but much faster simulator that improves over tim
...
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information, yet is able to share experience between global symmetries in the joint state-action space of cooperative multi-agent systems. In coopera
...
Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, thus rendering it difficul
...
Centaurs are half-human, half-AI decision-makers where the AI's goal is to complement the human. To do so, the AI must be able to recognize the goals and constraints of the human and have the means to help them. We present a novel formulation of the interaction between the human
...