Knowledge gradient exploration in online kernel-based LSPI

Part of: BNAIC 2013: Proceedings of the 25th Benelux Conference on Artificial Intelligence
Author: Yahyaa, S.; Manderick, B.
Date: 2013-11-08

Abstract: We introduce online kernel-based LSPI (least squares policy iteration), which combines features of online LSPI and offline kernel-based LSPI. The knowledge gradient is used as the exploration policy in both online LSPI and online kernel-based LSPI in order to compare their performance on two discrete Markov decision problems. Automatic feature selection in online kernel-based LSPI, which results from kernel sparsification based on approximate linear dependency, improves performance compared to online LSPI.

To reference this document use: http://resolver.tudelft.nl/uuid:02f49672-936c-430f-a30e-243388aeabe4
Part of collection: Conference proceedings
Document type: conference paper
Rights: (c) 2013 Yahyaa, S.; Manderick, B.
Files: paper_18.pdf (PDF, 273.37 KB)