Search results | TU Delft Repositories

Searched for: subject:"reinforcement\ learning"

(1 - 6 of 6)

document: CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration
Yang, Q. (author), Spaan, M.T.J. (author)
Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a...
conference paper 2023

document: Back to the Future: Solving Hidden Parameter MDPs with Hindsight
Ponnambalam, C.T. (author), Kamran, Danial (author), Simão, T. D. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author)
conference paper 2022

document: Influence-Augmented Local Simulators: a Scalable Solution for Fast Deep RL in Large Networked Systems
Suau, M. (author), He, J. (author), Spaan, M.T.J. (author), Oliehoek, F.A. (author)
Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation being the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simulators of complicated systems that can run sufficiently fast for...
conference paper 2022

document: Speeding up Deep Reinforcement Learning through Influence-Augmented Local Simulators
Suau, M. (author), He, J. (author), Spaan, M.T.J. (author), Oliehoek, F.A. (author)
Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation being the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simulators of complicated systems that can run sufficiently fast for...
conference paper 2022

document: WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning
Yang, Q. (author), Simão, T. D. (author), Tindemans, Simon H. (author), Spaan, M.T.J. (author)
Safe exploration is regarded as a key priority area for reinforcement learning research. With separate reward and safety signals, it is natural to cast it as constrained reinforcement learning, where expected long-term costs of policies are constrained. However, it can be hazardous to set constraints on the expected safety signal without...
conference paper 2021

document: The MADP Toolbox: An Open Source Library for Planning and Learning in (Multi-)Agent Systems
Oliehoek, F.A. (author), Spaan, M.T.J. (author), Terwijn, Bas (author), Robbel, Philipp (author), Messias, João V. (author)
This article describes the MultiAgent Decision Process (MADP) toolbox, a software library to support planning and learning for intelligent agents and multiagent systems in uncertain environments. Key features are that it supports partially observable environments and stochastic transition models; has unified support for single- and multiagent...
journal article 2017
