Mandersloot, A.V. (author), Oliehoek, F.A. (author), Czechowski, A.T. (author)
In this study, we investigate the effects of conditioning Independent Q-Learners (IQL) not solely on the individual action-observation history, but additionally on the sufficient plan-time statistic for Decentralized Partially Observable Markov Decision Processes. In doing so, we attempt to address a key shortcoming of IQL, namely that it is...
conference paper 2020
Mandersloot, A.V. (author)
The Decentralized Partially Observable Markov Decision Process (Dec-POMDP) is a commonly used framework to formally model scenarios in which multiple agents must collaborate using local information. A key difficulty in a Dec-POMDP is that in order to coordinate successfully, an agent must decide on actions not only using its own information, but also by...
master thesis 2020