Searched for: collection:ir
(1 - 17 of 17)
document
Yang, Q. (author), Spaan, M.T.J. (author)
Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a...
conference paper 2023
document
Albers, N. (author), Neerincx, M.A. (author), Brinkman, W.P. (author)
Despite their prevalence in eHealth applications for behavior change, persuasive messages tend to have small effects on behavior. Conditions or states (e.g., confidence, knowledge, motivation) and characteristics (e.g., gender, age, personality) of persuadees are two promising components for more effective algorithms for choosing persuasive...
conference paper 2023
document
Groot, D.J. (author), Ribeiro, M.J. (author), Ellerbroek, Joost (author), Hoekstra, J.M. (author)
The number of unmanned aircraft operating in the airspace is expected to grow exponentially during the next decades. This will likely lead to traffic densities that are higher than those currently observed in civil and general aviation, and might require both a different airspace structure compared to conventional aviation, as well as different...
conference paper 2023
document
Van Der Linde, Stan (author), De Kok, Willem (author), Bontekoe, Tariq (author), Feld, S. (author)
Compiling a quantum circuit for specific quantum hardware is a challenging task. Moreover, current quantum computers have severe hardware limitations. To make the most of the limited resources, the compilation process should be optimized. To improve current methods, Reinforcement Learning (RL), a technique in which an agent interacts...
conference paper 2023
document
van Tilburg, Jasper (author), Cavalcante Siebert, L. (author), Cremer, Jochen (author)
This paper presents a decentralized Multi-Agent Reinforcement Learning (MARL) approach to an incentive-based Demand Response (DR) program, which aims to maintain the capacity limits of the electricity grid and prevent grid congestion by financially incentivizing residential consumers to reduce their energy consumption. The proposed approach...
conference paper 2023
document
Zhao, Zheyu (author), Cheng, H. (author), Xu, Xiaohua (author)
Massive numbers of terminal users have created an explosive need for data residing at the edge of the network. Multiple Mobile Edge Computing (MEC) servers are built in or near base stations to meet this need. However, the optimal distribution of these servers to multiple users in real time is still a problem. Reinforcement Learning (RL) as a framework to solve...
conference paper 2023
document
Ponnambalam, C.T. (author), Kamran, Danial (author), Simão, T. D. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author)
conference paper 2022
document
Badea, C. (author), Groot, D.J. (author), Morfin Veytia, A. (author), Ribeiro, M.J. (author), Dalmau, Ramon (author), Ellerbroek, Joost (author), Hoekstra, J.M. (author)
Air traffic demand has increased at an unprecedented rate in the last decade (albeit interrupted by the COVID pandemic), but capacity has not increased at the same rate. Higher levels of automation and the implementation of decision-support tools for air traffic controllers could help increase capacity and catch up with demand. The air traffic...
conference paper 2022
document
Jarne Ornia, D. (author), Mazo, M. (author)
We present an approach to safely reduce the communication required between agents in a Multi-Agent Reinforcement Learning system by exploiting the inherent robustness of the underlying Markov Decision Process. We compute robustness certificate functions (off-line), that give agents a conservative indication of how far their state measurements...
conference paper 2022
document
Jarne Ornia, D. (author), Mazo, M. (author)
We present an approach to reduce the communication of information needed on a Distributed Q-Learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a Distributed Q-Learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents sharing a value function explore the...
conference paper 2022
document
Tang, Shi Yuan (author), Oliehoek, F.A. (author), Irissappane, Athirai A. (author), Zhang, Jie (author)
Cross-Entropy Method (CEM) is a gradient-free direct policy search method, which has greater stability and is insensitive to hyperparameter tuning. CEM bears similarity to population-based evolutionary methods but, rather than using a population, it uses a distribution over candidate solutions (policies in our case). Usually, a natural...
conference paper 2021
document
Li, Guangliang (author), Whiteson, Shimon (author), Dibeklioğlu, Hamdi (author), Hung, H.S. (author)
Interactive reinforcement learning provides a way for agents to learn to solve tasks from evaluative feedback provided by a human user. Previous research showed that humans give copious feedback early in training but very sparsely thereafter. In this paper, we investigate the potential of agent learning from trainers’ facial expressions via...
conference paper 2021
document
Igl, Maximilian (author), Farquhar, Gregory (author), Luketina, Jelena (author), Böhmer, J.W. (author), Whiteson, Shimon (author)
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments. For example, most RL algorithms collect new data throughout training, using a non-stationary behaviour policy. Due to the transience of this non-stationarity, it is often not explicitly addressed in deep RL and a single neural network is continually...
conference paper 2021
document
Yang, Q. (author), Simão, T. D. (author), Tindemans, Simon H. (author), Spaan, M.T.J. (author)
Safe exploration is regarded as a key priority area for reinforcement learning research. With separate reward and safety signals, it is natural to cast it as constrained reinforcement learning, where expected long-term costs of policies are constrained. However, it can be hazardous to set constraints on the expected safety signal without...
conference paper 2021
document
Coppens, Youri (author), Steckelmacher, Denis (author), Jonker, C.M. (author), Nowe, A.S.P. (author)
Today’s advanced Reinforcement Learning algorithms produce black-box policies that are often difficult for a person to interpret and trust. We introduce a policy distilling algorithm, building on the CN2 rule mining algorithm, that distills the policy into a rule-based decision system. At the core of our approach is the fact that an RL...
conference paper 2021
document
Pérez-Dattari, Rodrigo (author), Celemin, Carlos (author), Ruiz-del-Solar, Javier (author), Kober, J. (author)
Deep Reinforcement Learning (DRL) has become a powerful strategy for solving complex decision-making problems based on Deep Neural Networks (DNNs). However, it is highly data demanding, making it unfeasible in physical systems for most applications. In this work, we pursue an alternative Interactive Machine Learning (IML) strategy for training DNN...
conference paper 2020
document
Suau, M. (author), Congeduti, E. (author), Starre, R.A.N. (author), Czechowski, A.T. (author), Oliehoek, F.A. (author)
...thousands, or even millions of state variables. Unfortunately, applying reinforcement learning algorithms to handle complex tasks becomes more and more challenging as the number of state variables increases. In this paper, we build on the concept of influence-based abstraction which tries to tackle such scalability issues by decomposing large...
conference paper 2019