Title: Symbolic method for deriving policy in reinforcement learning
Author: Alibekov, Eduard (Czech Technical University); Kubalík, Jiří (Czech Technical University); Babuska, R. (TU Delft OLD Intelligent Control & Robotics; Czech Technical University)
Contributor: Bullo, Francesco (editor); Prieur, Christophe (editor); Giua, Alessandro (editor)
Date: 2016
Abstract: This paper addresses the problem of deriving a policy from the value function in the context of reinforcement learning in continuous state and input spaces. We propose a novel method based on genetic programming to construct a symbolic function, which serves as a proxy to the value function and from which a continuous policy is derived. The symbolic proxy function is constructed such that it maximizes the number of correct choices of the control input for a set of selected states. Maximization methods can then be used to derive a control policy that performs better than the policy derived from the original approximate value function. The method was experimentally evaluated on two control problems with continuous spaces, pendulum swing-up and magnetic manipulation, and compared to a standard policy derivation method using the value function approximation. The results show that the proposed method and its variants outperform the standard method.
Subject: Genetic programming; Sociology; Statistics; Learning (artificial intelligence); Standards; Cybernetics; Trajectory
To reference this document use: http://resolver.tudelft.nl/uuid:086f4ffd-09e9-4033-a6f3-5c7358705052
DOI: https://doi.org/10.1109/CDC.2016.7798684
Publisher: IEEE, Piscataway, NJ, USA
ISBN: 978-1-5090-1837-6
Source: Proceedings of the 2016 IEEE 55th Conference on Decision and Control (CDC)
Event: 55th IEEE Conference on Decision and Control, CDC 2016, 2016-12-12 → 2016-12-14, Las Vegas, United States
Bibliographical note: Accepted Author Manuscript
Part of collection: Institutional Repository
Document type: conference paper
Rights: © 2016 Eduard Alibekov, Jiří Kubalík, R. Babuska
Files: PDF Symbolic_Method_for_Deriv ... ersion.pdf 870.27 KB
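
Illustrative note: the abstract refers to deriving a policy from a value function by maximization. A minimal sketch of one common way to do this (a one-step lookahead over a finite set of candidate inputs) is given below. It is not taken from the paper: the function names, the toy integrator model, the quadratic reward, the stand-in value function and the discount factor are all hypothetical choices made for illustration. In the setting the abstract describes, a symbolic proxy function would presumably take the place of `value_fn`.

```python
def greedy_policy(x, value_fn, model, reward, inputs, gamma=0.95):
    """Pick the candidate input u that maximizes
    reward(x, u) + gamma * value_fn(model(x, u))  (one-step lookahead)."""
    return max(inputs, key=lambda u: reward(x, u) + gamma * value_fn(model(x, u)))


# --- Hypothetical toy problem (assumption, not from the paper) ---
def model(x, u):
    # 1-D integrator: the state moves by 0.1 * u per step
    return x + 0.1 * u


def reward(x, u):
    # Penalize distance from the origin and control effort
    return -x ** 2 - 0.01 * u ** 2


def value_fn(x):
    # Stand-in for an approximate value function (or a symbolic proxy)
    return -10.0 * x ** 2


# Finite set of candidate control inputs to maximize over
inputs = [-1.0, -0.5, 0.0, 0.5, 1.0]

u_right = greedy_policy(2.0, value_fn, model, reward, inputs)   # state right of the goal
u_left = greedy_policy(-2.0, value_fn, model, reward, inputs)   # state left of the goal
u_goal = greedy_policy(0.0, value_fn, model, reward, inputs)    # state at the goal
```

In this toy setup the selected input pushes the state toward the origin and is zero at the origin. The quality of such a policy depends on how well the function being maximized ranks the candidate inputs, which is the property the abstract says the symbolic proxy is constructed to improve.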