Title: Policy derivation methods for critic-only reinforcement learning in continuous spaces
Author: Alibekov, Eduard (Czech Technical University); Kubalik, Jiri (Czech Technical University); Babuska, R. (TU Delft Learning & Autonomous Control; Czech Technical University)
Date: 2018
Abstract: This paper addresses the problem of deriving a policy from the value function in the context of critic-only reinforcement learning (RL) in continuous state and action spaces. With continuous-valued states, RL algorithms have to rely on a numerical approximator to represent the value function. Numerical approximation due to its nature virtually always exhibits artifacts which damage the overall performance of the controlled system. In addition, when continuous-valued action is used, the most common approach is to discretize the action space and exhaustively search for the action that maximizes the right-hand side of the Bellman equation. Such a policy derivation procedure is computationally involved and results in steady-state error due to the lack of continuity. In this work, we propose policy derivation methods which alleviate the above problems by means of action space refinement, continuous approximation, and post-processing of the V-function by using symbolic regression. The proposed methods are tested on nonlinear control problems: 1-DOF and 2-DOF pendulum swing-up problems, and on magnetic manipulation. The results show significantly improved performance in terms of cumulative return and computational complexity.
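The baseline procedure the abstract describes (discretize the action space, then exhaustively search for the action that maximizes the right-hand side of the Bellman equation) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a deterministic model and the standard form rho(x, u) + gamma * V(f(x, u)) for the right-hand side. The names greedy_action, f, rho, V and gamma, and the toy 1-D system, are hypothetical and chosen only for illustration.

```python
def greedy_action(x, actions, f, rho, V, gamma=0.95):
    """Exhaustive search over a discrete action set (assumed baseline).

    Returns the action u in `actions` that maximizes
    rho(x, u) + gamma * V(f(x, u)).
    """
    return max(actions, key=lambda u: rho(x, u) + gamma * V(f(x, u)))


# Hypothetical toy problem: drive a scalar state x towards zero.
def f(x, u):
    return x + 0.1 * u                # assumed state-transition model


def rho(x, u):
    return -(x ** 2) - 0.01 * u ** 2  # assumed reward (quadratic penalty)


def V(x):
    return -10.0 * x ** 2             # stand-in for an approximated V-function


coarse = [-1.0, -0.5, 0.0, 0.5, 1.0]         # 5 discrete actions
fine = [i / 100 for i in range(-100, 101)]   # 201 discrete actions

u_coarse = greedy_action(0.03, coarse, f, rho, V)
u_fine = greedy_action(0.03, fine, f, rho, V)
print(u_coarse, u_fine)
```

In this toy setting the coarse grid can only return one of five actions, so the selected action generally differs from what a finer grid would choose. This is one way to picture the steady-state error the abstract attributes to the lack of continuity, while the finer grid costs proportionally more evaluations per step.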
Subject: Reinforcement learning; Continuous actions; Multi-variable systems; Optimal control; Policy derivation; Optimization
To reference this document use: http://resolver.tudelft.nl/uuid:b79399a4-b131-4a74-bac0-7e96930c9b1b
DOI: https://doi.org/10.1016/j.engappai.2017.12.004
Embargo date: 2020-02-05
ISSN: 0952-1976
Source: Engineering Applications of Artificial Intelligence, 69, 178-187
Bibliographical note: Accepted Author Manuscript
Part of collection: Institutional Repository
Document type: journal article
Rights: © 2018 Eduard Alibekov, Jiri Kubalik, R. Babuska
Files: root_R2.pdf (PDF, 925.29 KB)