Searched for: author:"Spaan, M.T.J."
(1 - 6 of 6)
document
Neustroev, G. (author), Ponnambalam, C.T. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Reinforcement learning requires exploration, leading to repeated execution of sub-optimal actions. Naive exploration techniques address this problem by changing gradually from exploration to exploitation. This approach employs a wide search resulting in exhaustive exploration and low sample-efficiency. More advanced search methods explore...
conference paper 2020
document
de Nijs, F. (author), Theocharous, Georgios (author), Vlassis, Nikos (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Personalized recommendations are increasingly important to engage users and guide them through large systems, for example when recommending points of interest to tourists visiting a popular city. To maximize long-term user experience, the system should consider issuing recommendations sequentially, since by observing the user's response to a...
conference paper 2018
document
de Nijs, F. (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
Resource constraints frequently complicate multi-agent planning problems. Existing algorithms for resource-constrained, multi-agent planning problems rely on the assumption that the constraints are deterministic. However, frequently resource constraints are themselves subject to uncertainty from external influences. Uncertainty about constraints...
conference paper 2018
document
de Nijs, F. (author), Walraven, E.M.P. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will...
conference paper 2017
document
Scharpff, J.C.D. (author), Roijers, Diederik M. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value. Typical algorithms exploit additive structure in the value function, but in the fully-observable multi-agent MDP (MMDP) setting such structure is not present. We propose a new optimal solver...
conference paper 2016
document
de Nijs, F. (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
When multiple independent agents use a limited shared resource, they need to coordinate and thereby their planning problems become coupled. We present a resource assignment strategy that decouples agents using marginal utility cost, allowing them to plan individually. We show that agents converge to an expected cost curve by keeping a history of...
conference paper 2016