Search results | TU Delft Repositories

Searched for: subject:"Deep reinforcement learning"

(1 - 2 of 2)

document: Exploring the Effects of Conditioning Independent Q-Learners on the Sufficient Statistic for Dec-POMDPs
Mandersloot, A.V. (author), Oliehoek, F.A. (author), Czechowski, A.T. (author)
In this study, we investigate the effects of conditioning Independent Q-Learners (IQL) not solely on the individual action-observation history, but additionally on the sufficient plan-time statistic for Decentralized Partially Observable Markov Decision Processes. In doing so, we attempt to address a key shortcoming of IQL, namely that it is...
conference paper 2020

document: Exploring the effects of conditioning Independent Q-Learners on the sufficient plan-time statistic for Dec-POMDPs
Mandersloot, A.V. (author)
The Decentralized Partially Observable Markov Decision Process is a commonly used framework to formally model scenarios in which multiple agents must collaborate using local information. A key difficulty in a Dec-POMDP is that in order to coordinate successfully, an agent must decide on actions not only using its own information, but also by...
master thesis 2020