Spiking Neural Networks Based Data Driven Control

An Illustration Using Cart-Pole Balancing Example

Abstract

Machine learning can be applied effectively in control loops to make robust, optimal control decisions. There is growing interest in using spiking neural networks (SNNs) as the machine learning apparatus in control engineering, because SNNs can potentially offer high energy efficiency, and new SNN-enabling neuromorphic hardware is being developed rapidly. A defining characteristic of control problems is that environmental reactions and delayed rewards must be taken into account. While reinforcement learning (RL) provides the fundamental mechanisms for addressing such problems, realizing these mechanisms in SNN learning has been underexplored. Previously, schemes of spike-timing-dependent plasticity (STDP) learning modulated by a temporal-difference factor (TD-STDP) or by reward (R-STDP) have been proposed for RL with SNNs. Here we designed and implemented an SNN controller to explore and compare these two schemes, using Cart-Pole balancing as a representative example. Although the TD-based learning rules are very general, the resulting model exhibits rather slow convergence, producing noisy and imperfect results even after prolonged training. We show that by integrating an understanding of the dynamics of the environment into the reward function of R-STDP, a robust SNN-based controller can be learned much more efficiently than with TD-STDP. The work of this master thesis project has also been published as a paper in Electronics, Vol. 12.
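To make the modulated-STDP idea concrete, the sketch below shows one common form of an R-STDP weight update: STDP spike pairings accumulate in a slowly decaying eligibility trace, and a scalar reward signal gates when that trace is converted into an actual weight change. This is a minimal illustrative sketch, not the thesis implementation; all parameter names, time constants, and values here are assumptions chosen for readability.

```python
import numpy as np

# Hypothetical minimal R-STDP sketch: exponential pre/post spike traces,
# an eligibility trace per synapse, and reward-gated weight updates.
# All constants below are illustrative, not taken from the thesis.

rng = np.random.default_rng(0)
n_pre, n_post = 4, 2
w = rng.uniform(0.0, 0.5, size=(n_pre, n_post))  # synaptic weights
w_init = w.copy()

tau_trace = 20.0   # ms, decay constant of pre/post spike traces
tau_elig = 200.0   # ms, decay constant of the eligibility trace
eta = 0.01         # learning rate
dt = 1.0           # ms, simulation time step

x_pre = np.zeros(n_pre)           # presynaptic spike traces
x_post = np.zeros(n_post)         # postsynaptic spike traces
elig = np.zeros((n_pre, n_post))  # eligibility traces

def step(pre_spikes, post_spikes, reward):
    """One R-STDP time step: STDP pairings accumulate in the eligibility
    trace; the scalar reward gates the actual weight change."""
    global w
    # Decay the spike traces and add this step's spikes (0/1 vectors).
    x_pre[:] = x_pre * np.exp(-dt / tau_trace) + pre_spikes
    x_post[:] = x_post * np.exp(-dt / tau_trace) + post_spikes
    # STDP: pre-before-post potentiates (LTP), post-before-pre depresses (LTD).
    stdp = (np.outer(x_pre, post_spikes)      # LTP term
            - np.outer(pre_spikes, x_post))   # LTD term
    elig[:] = elig * np.exp(-dt / tau_elig) + stdp
    # Reward modulation: weights move only when the reward is non-zero.
    w = np.clip(w + eta * reward * elig, 0.0, 1.0)

# Usage: a pre spike followed by a post spike (a causal pairing), then a
# positive reward one step later strengthens only the paired synapse.
step(np.array([1.0, 0, 0, 0]), np.zeros(n_post), reward=0.0)
step(np.zeros(n_pre), np.array([1.0, 0]), reward=1.0)
```

In a Cart-Pole setting the reward would be derived from the pole state; the thesis's point is that shaping this reward with knowledge of the environment's dynamics speeds up learning considerably compared with a generic TD-modulated rule.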