Searched for: +
(1 - 20 of 32)

document
Cheng, Ji (author), Xue, Bo (author), Jiaxiang, Y. (author), Zhang, Qingfu (author)
Multi-objective Stochastic Linear bandit (MOSLB) plays a critical role in the sequential decision-making paradigm; however, most existing methods focus on the Pareto dominance among different objectives without considering any priority. In this paper, we study bandit algorithms under mixed Pareto-lexicographic orders, which can reflect...
journal article 2024
document
Ribeiro, M.J. (author)
Increasing delays and congestion reported in many aviation sectors indicate that the current centralised operational model is rapidly approaching saturation levels. Air Traffic Control (ATC) system is not expected to keep pace with the ever-increasing demand for air transportation. Its capacity is still limited by the available controllers, and...
doctoral thesis 2023
document
Yang, Q. (author), Spaan, M.T.J. (author)
Without an assigned task, a suitable intrinsic objective for an agent is to explore the environment efficiently. However, the pursuit of exploration will inevitably bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a...
conference paper 2023
document
Albers, N. (author), Neerincx, M.A. (author), Brinkman, W.P. (author)
Despite their prevalence in eHealth applications for behavior change, persuasive messages tend to have small effects on behavior. Conditions or states (e.g., confidence, knowledge, motivation) and characteristics (e.g., gender, age, personality) of persuadees are two promising components for more effective algorithms for choosing persuasive...
conference paper 2023
document
Groot, D.J. (author), Ribeiro, M.J. (author), Ellerbroek, Joost (author), Hoekstra, J.M. (author)
The number of unmanned aircraft operating in the airspace is expected to grow exponentially during the next decades. This will likely lead to traffic densities that are higher than those currently observed in civil and general aviation, and might require both a different airspace structure compared to conventional aviation, as well as different...
conference paper 2023
document
Van Der Linde, Stan (author), De Kok, Willem (author), Bontekoe, Tariq (author), Feld, S. (author)
Compiling a quantum circuit for specific quantum hardware is a challenging task. Moreover, current quantum computers have severe hardware limitations. To make the most use of the limited resources, the compilation process should be optimized. To improve current methods, Reinforcement Learning (RL), a technique in which an agent interacts...
conference paper 2023
document
van Tilburg, Jasper (author), Cavalcante Siebert, L. (author), Cremer, Jochen (author)
This paper presents a decentralized Multi-Agent Reinforcement Learning (MARL) approach to an incentive-based Demand Response (DR) program, which aims to maintain the capacity limits of the electricity grid and prevent grid congestion by financially incentivizing residential consumers to reduce their energy consumption. The proposed approach...
conference paper 2023
document
Zhao, Zheyu (author), Cheng, H. (author), Xu, Xiaohua (author)
Massive numbers of terminal users have created an explosive need for data residing at the edge of the overall network. Multiple Mobile Edge Computing (MEC) servers are built in or near base stations to meet this need. However, optimally distributing these servers among multiple users in real time is still a problem. Reinforcement Learning (RL) as a framework to solve...
conference paper 2023
document
Ferreira de Brito, B.F. (author)
Autonomous robots will profoundly impact our society, making our roads safer, reducing labor costs and carbon dioxide (CO2) emissions, and improving our life quality. However, to make that happen, robots need to navigate among humans, which is extremely difficult. Firstly, humans do not explicitly communicate their intentions and use intuition...
doctoral thesis 2022
document
Pierotti, J. (author)
One of the world’s biggest challenges is that living beings have to share a limited amount of resources. As people of science, we strive to find innovative ways to better use these resources, to reach and positively affect more and more people. In the field of optimization, we aim at finding an optimal allocation of limited sets of resources to...
doctoral thesis 2022
document
Vergara Barrios, P.P. (author), Salazar, Mauricio (author), Giraldo, Juan S. (author), Palensky, P. (author)
In this paper, a Reinforcement Learning (RL)-based approach to optimally dispatch PV inverters in unbalanced distribution systems is presented. The proposed approach exploits a decentralized architecture in which PV inverters are operated by agents that perform all computational processes locally; while communicating with a central agent to...
journal article 2022
document
Ponnambalam, C.T. (author), Kamran, Danial (author), Simão, T. D. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author)
conference paper 2022
document
Badea, C. (author), Groot, D.J. (author), Morfin Veytia, A. (author), Ribeiro, M.J. (author), Dalmau, Ramon (author), Ellerbroek, Joost (author), Hoekstra, J.M. (author)
Air traffic demand has increased at an unprecedented rate in the last decade (albeit interrupted by the COVID pandemic), but capacity has not increased at the same rate. Higher levels of automation and the implementation of decision-support tools for air traffic controllers could help increase capacity and catch up with demand. The air traffic...
conference paper 2022
document
Jarne Ornia, D. (author), Mazo, M. (author)
We present an approach to safely reduce the communication required between agents in a Multi-Agent Reinforcement Learning system by exploiting the inherent robustness of the underlying Markov Decision Process. We compute robustness certificate functions (off-line), that give agents a conservative indication of how far their state measurements...
conference paper 2022
document
Jarne Ornia, D. (author), Mazo, M. (author)
We present an approach to reduce the communication of information needed on a Distributed Q-Learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a Distributed Q-Learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents sharing a value function explore the...
conference paper 2022
document
Tang, Shi Yuan (author), Oliehoek, F.A. (author), Irissappane, Athirai A. (author), Zhang, Jie (author)
Cross-Entropy Method (CEM) is a gradient-free direct policy search method, which has greater stability and is insensitive to hyperparameter tuning. CEM bears similarity to population-based evolutionary methods, but, rather than using a population it uses a distribution over candidate solutions (policies in our case). Usually, a natural...
conference paper 2021
document
Li, Guangliang (author), Whiteson, Shimon (author), Dibeklioğlu, Hamdi (author), Hung, H.S. (author)
Interactive reinforcement learning provides a way for agents to learn to solve tasks from evaluative feedback provided by a human user. Previous research showed that humans give copious feedback early in training but very sparsely thereafter. In this paper, we investigate the potential of agent learning from trainers’ facial expressions via...
conference paper 2021
document
Rijsdijk, J. (author), Wu, L. (author), Perin, G. (author), Picek, S. (author)
Deep learning represents a powerful set of techniques for profiling side-channel analysis. The results in the last few years show that neural network architectures like multilayer perceptron and convolutional neural networks give strong attack performance where it is possible to break targets protected with various countermeasures....
journal article 2021
document
Muench, C. (author), Oliehoek, F.A. (author), Gavrila, D. (author)
Modeling possible future outcomes of robot-human interactions is of importance in the intelligent vehicle and mobile robotics domains. Knowing the reward function that explains the observed behavior of a human agent is advantageous for modeling the behavior with Markov Decision Processes (MDPs). However, learning the rewards that determine...
journal article 2021
document
Igl, Maximilian (author), Farquhar, Gregory (author), Luketina, Jelena (author), Böhmer, J.W. (author), Whiteson, Shimon (author)
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments. For example, most RL algorithms collect new data throughout training, using a non-stationary behaviour policy. Due to the transience of this non-stationarity, it is often not explicitly addressed in deep RL and a single neural network is continually...
conference paper 2021