Nils Jansen | TU Delft Repository

Reinforcement Learning by Guided Safe Exploration

Conference paper (2023) - Q. Yang (author), Thiago D. Simão (author), Nils Jansen (author), Simon H. Tindemans (author), MTJ Spaan (author)

Safety is critical to broadening the application of reinforcement learning (RL). Often, we train RL agents in a controlled environment, such as a laboratory, before deploying them in the real world. However, the real-world target task might be unknown prior to deployment. Reward- ...

Training and Transferring Safe Policies in Reinforcement Learning

Conference paper (2022) - Q. Yang (author), Thiago D. Simão (author), Nils Jansen (author), Simon H. Tindemans (author), MTJ Spaan (author)

Safety is critical to broadening the application of reinforcement learning (RL). Often, RL agents are trained in a controlled environment, such as a laboratory, before being deployed in the real world. However, the target reward might be unknown prior to deployment. Reward-free R ...

AlwaysSafe: Reinforcement Learning without Safety Constraint Violations during Training

Conference paper (2021) - Thiago D. Simão (author), Nils Jansen (author), M.T.J. Spaan (author)

Deploying reinforcement learning (RL) involves major concerns around safety. Engineering a reward signal that allows the agent to maximize its performance while remaining safe is not trivial. Safe RL studies how to mitigate such problems. For instance, we can decouple safety fro ...

Safe Policies for Factored Partially Observable Stochastic Games

Conference paper (2021) - Steven Carr (author), Nils Jansen (author), Suda Bharadwaj (author), Matthijs T. J. Spaan (author), Ufuk Topcu (author)

We study planning problems where a controllable agent operates under partial observability and interacts with an uncontrollable opponent, also referred to as the adversary. The agent has two distinct objectives: to maximize an expected value and to adhere to a safety specific ...