The effect of recentness of consumer-grade wearable training data on the ability of a DNN to identify users

Bachelor Thesis (2023)

Author(s)

N.A. van der Voort (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Naseri Jahfari – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

R. Ghorbani – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

D.M.J. Tax – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

G. Lan – Graduation committee member (TU Delft - Embedded Systems)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Pattern Recognition Bioinformatics Wearables Person Identification

To reference this document use:

https://resolver.tudelft.nl/uuid:72ec051e-afe1-413d-b37f-9e961c342e70

Publication Year

2023

Language

English

Graduation Date

30-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Heart rate and other data collected by consumer-grade wearable devices can reveal useful information about the user. Machine learning models such as Deep Neural Networks (DNNs) can use such data, for example, to learn patterns related to cardiovascular disease and fitness, or to identify the wearer. Heart rate patterns can change within the span of several months, which could make older heart rate data less useful for training a DNN. This paper shows that a DNN does indeed identify people significantly worse when trained on older data than when trained on recent data. Test-set accuracy was 63.64% when the model was trained on the most recent available training data, compared with 33.88% when it was trained on the least recent data, which was more than 200 days older. When only a single user's training data was made more recent, the model's accuracy in identifying that particular user always improved. The accuracy over all users, however, did not necessarily increase, and sometimes even decreased. Training on more data still slightly outperforms training on a smaller number of more recent samples, which shows the trade-off between the recentness of the training data and its amount. However, if fast training times are required, training on only the most recent data windows can still achieve performance similar to training on all available data.
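As a rough illustration of the last point, selecting only the most recent data windows per user before training could look like the sketch below. This is not the thesis's code: the function name `most_recent_windows`, the `(user_id, day, features)` tuple layout, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def most_recent_windows(windows, per_user):
    """Keep the `per_user` most recent windows for each user.

    `windows` is an iterable of (user_id, day, features) tuples.
    This layout is a hypothetical one chosen for the sketch.
    """
    by_user = defaultdict(list)
    for user_id, day, features in windows:
        by_user[user_id].append((day, features))

    selected = []
    for user_id, items in by_user.items():
        # Sort newest first, then keep only the first `per_user` windows.
        items.sort(key=lambda item: item[0], reverse=True)
        for day, features in items[:per_user]:
            selected.append((user_id, day, features))
    return selected


# Made-up heart rate windows for two users (illustrative values only).
windows = [
    ("u1", date(2022, 1, 10), [61, 64, 70]),
    ("u1", date(2022, 6, 1), [58, 63, 69]),
    ("u1", date(2022, 9, 15), [57, 60, 66]),
    ("u2", date(2022, 3, 5), [72, 75, 80]),
    ("u2", date(2022, 8, 20), [70, 74, 79]),
]

recent = most_recent_windows(windows, per_user=2)
```

Here `recent` drops the oldest window of `u1` and keeps both windows of `u2`. Lowering `per_user` shortens training at the cost of discarding older samples, which is the trade-off the abstract describes.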

Files

RP_final_paper.pdf

(pdf | 0.555 Mb)

License info not available