- document
- Ponnambalam, C.T. (author) Reinforcement learning (RL) models the learning process of humans, but as exciting advances are made that use increasingly deep neural networks, some of the fundamental strengths of human learning are still underutilized by RL agents. One of the most exciting properties of RL is that it appears to be incredibly flexible, requiring no model or... doctoral thesis 2023
- document
- Ponnambalam, C.T. (author), Kamran, Danial (author), Simão, T. D. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author) conference paper 2022
- document
- Kamran, Danial (author), Simão, T. D. (author), Yang, Q. (author), Ponnambalam, C.T. (author), Fischer, Johannes (author), Spaan, M.T.J. (author), Lauer, Martin (author) The use of reinforcement learning (RL) in real-world domains often requires extensive effort to ensure safe behavior. While this compromises the autonomy of the system, it might still be too risky to allow a learning agent to freely explore its environment. These strict impositions come at the cost of flexibility and applying them often relies... conference paper 2022
- document
- Ponnambalam, C.T. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author) Behavior cloning is a method of automated decision-making that aims to extract meaningful information from expert demonstrations and reproduce the same behavior autonomously. It is unlikely that demonstrations will exhaustively cover the potential problem space, compromising the quality of automation when out-of-distribution states are... conference paper 2021
- document
- Smit, Jordi (author), Ponnambalam, C.T. (author), Spaan, M.T.J. (author), Oliehoek, F.A. (author) Offline reinforcement learning (RL), or learning from a fixed data set, is an attractive alternative to online RL. Offline RL promises to address the cost and safety implications of taking numerous random or bad actions online, a crucial aspect of traditional RL that makes it difficult to apply in real-world problems. However, when RL is na... conference paper 2021
- document
- Neustroev, G. (author), Ponnambalam, C.T. (author), de Weerdt, M.M. (author), Spaan, M.T.J. (author) Reinforcement learning requires exploration, leading to repeated execution of sub-optimal actions. Naive exploration techniques address this problem by changing gradually from exploration to exploitation. This approach employs a wide search resulting in exhaustive exploration and low sample-efficiency. More advanced search methods explore... conference paper 2020
- document
- Ponnambalam, C.T. (author), Oliehoek, F.A. (author), Spaan, M.T.J. (author) The goal in behavior cloning is to extract meaningful information from expert demonstrations and reproduce the same behavior autonomously. However, the available data is unlikely to exhaustively cover the potential problem space. As a result, the quality of automated decision making is compromised without elegant ways to handle the encountering of... conference paper 2020