Radar-based gesture recognition with spiking neural networks

Abstract

Radar sensors perceive their environment and objects of interest without contact, and perform robustly in all weather and light conditions. One of their main drawbacks is the energy needed to process radar data and extract useful information from it. Spiking neural networks are an emerging type of neural network that aims to reduce the energy footprint of computation while maintaining acceptable performance. To do so, the data is encoded through time as binary spikes, so that computations rely on low-cost additions rather than on the much costlier multiplications that dominate conventional artificial neural networks. The price of this energy gain is that rate encoding adds an extra time dimension, which increases the latency between the acquisition of the radar data and the recognition of the corresponding gesture class.
More specifically, this work uses an air-marshalling dataset from the literature as an example gesture recognition problem. The first step is to replicate the well-known radar processing pipeline and the classification approach based on conventional neural networks in order to reach high classification accuracy. On the full dataset (11 classes), a validation accuracy of 98.5% and a test accuracy of 59.8% are reached; on the 5 best classes, the test accuracy is 86.7%. This is about the same performance as reported in the original dataset baseline.
The following steps adapt this non-spiking pipeline to its spiking equivalent by optimising the trade-off between the model’s latency, memory requirements and accuracy. This work also develops a strategy for tuning the thresholds of spiking networks, which makes developing a spiking equivalent more efficient. For example, the spiking network reaches 94.1% validation accuracy and 46.8% test accuracy using 100 encoding steps and only 4.7% of the initial memory requirements. This trade-off can be shifted towards lower latency, lower memory or higher accuracy according to the desired requirements.
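
To illustrate the rate encoding and the addition-only computation mentioned in the abstract, here is a minimal NumPy sketch. It is not the thesis implementation: the function names `rate_encode` and `if_layer`, the Bernoulli sampling scheme and the reset-by-subtraction rule are all assumptions chosen for illustration, and the thesis may use a different encoding or neuron model. The `num_steps` argument plays the role of the number of encoding steps, which is where the extra latency comes from.

```python
import numpy as np


def rate_encode(x, num_steps, rng):
    """Bernoulli rate coding (assumed scheme): each value in [0, 1] becomes a
    binary spike train whose firing probability per step equals the value."""
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
    # Output shape: (num_steps, *x.shape), entries are 0 or 1.
    return (rng.random((num_steps,) + x.shape) < x).astype(np.uint8)


def if_layer(spikes, weights, threshold):
    """Integrate-and-fire layer (hypothetical helper) driven by binary spikes.

    spikes:  (num_steps, n_in) array of 0/1
    weights: (n_in, n_out) array
    """
    num_steps = spikes.shape[0]
    n_out = weights.shape[1]
    membrane = np.zeros(n_out)
    out = np.zeros((num_steps, n_out), dtype=np.uint8)
    for t in range(num_steps):
        # Binary inputs only select which weight rows to accumulate,
        # so the update needs additions but no multiplications.
        membrane += weights[spikes[t].astype(bool)].sum(axis=0)
        fired = membrane >= threshold
        out[t] = fired
        # Reset by subtraction (assumed): keep the residual potential.
        membrane[fired] -= threshold
    return out


# Example: encode three intensities over 100 steps, as in the abstract's setting.
rng = np.random.default_rng(0)
spikes = rate_encode([0.0, 0.5, 1.0], num_steps=100, rng=rng)
firing_rates = spikes.mean(axis=0)  # approximates the encoded values
```

The average firing rate over the encoding window approximates the original value, so using fewer steps lowers latency at the cost of a coarser approximation, which is one way to read the latency and accuracy trade-off described above.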

Files

Thesis_Lucie_de_Ghellinck.pdf
Unknown license

File under embargo until 26-08-2025