Searched for: +
(1 - 10 of 10)
document
de Nijs, F. (author), Walraven, E.M.P. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
In domains such as electric vehicle charging, smart distribution grids and autonomous warehouses, multiple agents share the same resources. When planning the use of these resources, agents need to deal with the uncertainty in these domains. Although several models and algorithms for such constrained multiagent planning problems under...
journal article 2021
document
Neustroev, G. (author), Ponnambalam, C.T. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Reinforcement learning requires exploration, leading to repeated execution of sub-optimal actions. Naive exploration techniques address this problem by changing gradually from exploration to exploitation. This approach employs a wide search resulting in exhaustive exploration and low sample-efficiency. More advanced search methods explore...
conference paper 2020
document
Scharpff, J.C.D. (author), Schraven, D.F.J. (author), Volker, Leentje (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
The next step in the use of innovative, dynamic and performance-based contracts for service delivery by contractors could be the use of monetary incentives to stimulate self-regulation of the network. Because it is currently unclear how performance-based payments in network tenders can effectively encourage network members to coordinate their own...
journal article 2020
document
Scharpff, J.C.D. (author), Schraven, D.F.J. (author), Volker, Leentje (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
This white paper describes the Road Maintenance Planning game, a game that simulates planning, coordination and execution of maintenance projects in the domain of infrastructural maintenance. In particular, the game models the dynamic contracting procedure of Volker et al. (2014), an innovative way of contracting public works to a team group of...
other 2019
document
de Nijs, F. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Demand response refers to the concept that power consumption should aim to match supply, instead of supply following demand. It is a key technology to enable the successful transition to an electricity system that incorporates more and more intermittent and uncontrollable renewable energy sources. For instance, loads such as heat pumps or...
book chapter 2019
document
de Nijs, F. (author), Theocharous, Georgios (author), Vlassis, Nikos (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Personalized recommendations are increasingly important to engage users and guide them through large systems, for example when recommending points of interest to tourists visiting a popular city. To maximize long-term user experience, the system should consider issuing recommendations sequentially, since by observing the user's response to a...
conference paper 2018
document
de Nijs, F. (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
Resource constraints frequently complicate multi-agent planning problems. Existing algorithms for resource-constrained, multi-agent planning problems rely on the assumption that the constraints are deterministic. However, frequently resource constraints are themselves subject to uncertainty from external influences. Uncertainty about constraints...
conference paper 2018
document
de Nijs, F. (author), Walraven, E.M.P. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author)
Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will...
conference paper 2017
document
Scharpff, J.C.D. (author), Roijers, Diederik M. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value. Typical algorithms exploit additive structure in the value function, but in the fully-observable multi-agent MDP (MMDP) setting such structure is not present. We propose a new optimal solver...
conference paper 2016
document
de Nijs, F. (author), Spaan, M.T.J. (author), de Weerdt, M.M. (author)
When multiple independent agents use a limited shared resource, they need to coordinate and thereby their planning problems become coupled. We present a resource assignment strategy that decouples agents using marginal utility cost, allowing them to plan individually. We show that agents converge to an expected cost curve by keeping a history of...
conference paper 2016