Temporal Dynamics Modelling for People Counting in Point Clouds

An Extension on PointNet and MARS through LSTM Integration

Bachelor thesis (2024)

Authors

M. Escribano Esteban (Electrical Engineering, Mathematics and Computer Science)

Contributors

Marco Zuniga, Networked Systems (mentor)

G. Vaidya, Networked Systems (mentor)

M. Weinmann, Computer Graphics and Visualisation (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:089b5996-b1f6-4b9a-badf-f0d4154fa5c0

Published Date

26-06-2024

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The People Counting Problem requires calculating the number of people in a region of interest. This is needed in crowd-monitoring scenarios, but relying on video cameras has become increasingly problematic because they raise privacy concerns. Instead, we propose using a mmWave radar to detect people by creating point clouds from their radar signal reflections. This approach poses a challenge when people walk closely together: their individual point clouds overlap and are seen as a single, larger cloud. It is difficult to tell how many individuals this larger point cloud holds, which can lead to miscounting the people in the scene. One way to address this issue is to leverage the time dimension in sequences of people walking, which can be done with Long Short-Term Memory (LSTM) models. We therefore investigate how two state-of-the-art models, PointNet and MARS, perform for people counting from point clouds when extended with LSTMs. The results show that both PointNet and MARS perform better when extended with LSTMs. In particular, despite having over double the parameters, MARS+LSTM outperforms PointNet+LSTM in both accuracy and computational efficiency. MARS+LSTM can effectively capture small changes in the local structure of point clouds between frames, which PointNet loses due to max pooling. This highlights the importance of selecting a model architecture, like the CNN in MARS, that aligns with the data characteristics to maximise performance.
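The remark about max pooling can be made concrete with a toy sketch. It is not the thesis's code: the function name `max_pool` and the two frames are hypothetical, and raw coordinates stand in for the learned per-point features that PointNet would compute before pooling. Under those assumptions, it shows that a channel-wise maximum over all points returns the same global feature for two frames that differ only in a non-extreme point, and that it does not depend on point order.

```python
from typing import List, Sequence

Point = Sequence[float]


def max_pool(points: List[Point]) -> List[float]:
    """Channel-wise maximum over all points (a symmetric function)."""
    # zip(*points) groups the values of each channel across all points.
    return [max(channel) for channel in zip(*points)]


# Two hypothetical consecutive frames: the extreme points are identical,
# only the third (interior) point has moved between frames.
frame_a = [(0.0, 0.0, 0.0), (1.0, 2.0, 1.5), (0.4, 0.5, 0.3)]
frame_b = [(0.0, 0.0, 0.0), (1.0, 2.0, 1.5), (0.9, 1.8, 1.2)]

pooled_a = max_pool(frame_a)
pooled_b = max_pool(frame_b)

# The pooled features coincide, so the movement of the interior point
# is invisible to anything that only sees the pooled vector.
print(pooled_a == pooled_b)

# Reordering the points does not change the result either.
print(max_pool(list(reversed(frame_a))) == pooled_a)
```

In the real network the pooled features are learned, so the loss of information is less extreme than in this toy case, but the same mechanism applies: only the per-channel maxima survive pooling.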

Files

Final_Paper_Marina_Escribano_E... (.pdf)

(.pdf | 1.23 Mb)