Miguel Suau de Castro

14 records found

Authored

Reinforcement learning techniques have demonstrated great promise in tackling sequential decision-making problems. However, the inherent complexity of real-world scenarios presents significant challenges for their application. This thesis takes a fresh approach that explores the un ...
Reinforcement learning agents may sometimes develop habits that are effective only when specific policies are followed. After an initial exploration phase in which agents try out different actions, they eventually converge toward a particular policy. When this occurs, the distrib ...

Influence-Augmented Local Simulators

A Scalable Solution for Fast Deep RL in Large Networked Systems

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitations are the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Due to its perceptual limitations, an agent may have too little information about the environment to act optimally. In such cases, it is important to keep track of the action-observation history to uncover hidden state information. Recent deep reinforcement learning methods us ...

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitations are the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight si ...

Due to its high sample complexity, simulation is, as of today, critical for the successful application of reinforcement learning. Many real-world problems, however, exhibit overly complex dynamics, which makes their full-scale simulation computationally slow. In this paper, we sh ...
How can we plan efficiently in a large and complex environment when the time budget is limited? Given the original simulator of the environment, which may be computationally very demanding, we propose to learn online an approximate but much faster simulator that improves over tim ...
Deep Reinforcement Learning (RL) is a promising technique for constructing intelligent agents, but it is not always easy to understand the learning process and the factors that impact it. To shed some light on this, we analyze the Latent State Representations (LSRs) that deep ...
Recent years have seen a surge of algorithms and architectures for deep Reinforcement Learning (RL), many of which have shown remarkable success for various problems. Yet, little work has attempted to relate the performance of these algorithms and architectures to what the resu ...
How can we plan efficiently in real time to control an agent in a complex environment that may involve many other agents? While existing sample-based planners have enjoyed empirical success in large POMDPs, their performance heavily relies on a fast simulator. However, real-world ...
thousands, or even millions of state variables. Unfortunately, applying reinforcement learning algorithms to handle complex tasks becomes more and more challenging as the number of state variables increases. In this paper, we build on the concept of influence-based abstraction wh ...

Contributed

Learning What to Attend to

Using bisimulation metrics to explore and improve upon what a deep reinforcement learning agent learns

We analyze the internal representations that deep Reinforcement Learning (RL) agents form of their environments and whether these representations correspond to what such agents should ideally learn. The purpose of this comparison is both a better understanding of why certain algo ...
Traffic congestion is a problem of tremendous size that affects many people. Using Reinforcement Learning to find a light control policy can ease traffic congestion and decrease travel time for vehicles. This paper specifically looks at the effect of using different reward functi ...
Time series forecasting has proven to be easier for stationary time series than for non-stationary ones. This research proposes a method to partially omit the non-stationarity of the data using prioritized sampling. Using multiple feature selection meth ...