Temporal Dynamics Modelling for People Counting in Point Clouds

An Extension on PointNet and MARS through LSTM Integration

Abstract

The people counting problem requires determining the number of people in a region of interest. This is needed in crowd-monitoring scenarios, but relying on video cameras has become increasingly problematic because they raise privacy concerns. Instead, we propose using a mmWave radar, which detects people by building point clouds from their radar signal reflections. This approach is challenging when people walk closely together, because their individual point clouds overlap and appear as a single, larger cloud. It is difficult to determine how many individuals such a merged cloud contains, which can lead to miscounting the people in the scene. One way to address this issue is to exploit the time dimension in sequences of people walking, which can be done with Long Short-Term Memory (LSTM) models. We therefore investigate how two state-of-the-art models, PointNet and MARS, perform for people counting from point clouds when extended with LSTMs. The results show that both PointNet and MARS perform better when extended with LSTMs. In particular, despite having over double the parameters, MARS+LSTM outperforms PointNet+LSTM in both accuracy and computational efficiency. MARS+LSTM can effectively capture small changes in the local structure of point clouds between frames, which PointNet loses due to max pooling. This highlights the importance of selecting a model architecture, like the CNN in MARS, that aligns with the data characteristics to maximise performance.
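
The remark that max pooling discards small local changes can be illustrated with a toy sketch. This is not the implementation used in this work: the point coordinates are invented, the helper `global_max_pool` is a hypothetical name, and the raw (x, y, z) values stand in for the learned per-point features that a real PointNet would compute before pooling.

```python
# Toy illustration only. A real PointNet applies a learned shared MLP to each
# point before pooling; here the "features" are simply the raw coordinates.

def global_max_pool(points):
    """Collapse a point cloud into one vector by taking the per-axis maximum."""
    return tuple(max(axis) for axis in zip(*points))

# Hypothetical frame: two people walking close together, merged into one cloud.
frame_t = [
    (0.0, 0.0, 1.7),
    (0.2, 0.1, 1.0),
    (0.6, 0.1, 1.1),
    (0.9, 0.3, 1.6),
]

# Hypothetical next frame: only the two interior points shift slightly.
frame_t1 = [
    (0.0, 0.0, 1.7),
    (0.25, 0.15, 1.05),
    (0.55, 0.05, 1.15),
    (0.9, 0.3, 1.6),
]

pooled_t = global_max_pool(frame_t)
pooled_t1 = global_max_pool(frame_t1)

# The clouds differ, yet their pooled summaries are identical, so a temporal
# model fed only these summaries cannot see the interior movement.
clouds_differ = frame_t != frame_t1
summaries_match = pooled_t == pooled_t1

# Max pooling is also invariant to the order of the points.
order_invariant = global_max_pool(list(reversed(frame_t))) == pooled_t
```

In this toy case the interior points move between frames while the pooled vector stays the same, which is the kind of between-frame local change the abstract refers to. With learned features the effect is less absolute, since a moved point can become the maximum in some feature dimension.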