Pallas: Novel Sound Classification at the Edge

Master Thesis (2024)

Author(s)

M. Groenenboom (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Marco A. Zuñiga Zamalloa – Mentor (TU Delft - Networked Systems)

Kaitai Liang – Graduation committee member (TU Delft - Cyber Security)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Data augmentation, Noise pollution, Audio classification, Machine listening, Environmental sound classification

To reference this document use:

https://resolver.tudelft.nl/uuid:ae0b272f-455d-468a-8965-fc263e14a3dc

More Info

Publication Year

2024

Language

English

Graduation Date

27-02-2024

Awarding Institution

Delft University of Technology

Programme

Electrical Engineering | Embedded Systems

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Sound pollution is becoming an increasingly pressing issue in today’s world. To address it effectively, it must first be measured. To this end, Serval, an edge-AI-powered sound recognition solution, was developed. Its lack of accuracy, however, makes it difficult to deploy. This thesis examines how far the solution can be improved, within its technical limitations, to raise its accuracy to a satisfactory level. Three aspects of Serval were evaluated and compared to the current state of the art: its data augmentation, the embedding it uses, and the hardware it runs on. Alternatives for each of these components were evaluated, and each aspect was optimized.
The results show that after these improvements, the single-label F1-score increased from 0.60 to 0.76, and the combined single- and multi-label F1-score increased from 0.64 to 0.67. In addition, power consumption was reduced by 14%, partly thanks to the use of specialized hardware. One issue that has yet to be adequately addressed is the size of the dataset: increasing the number of samples could further improve the accuracy.

Files

Msc_Thesis_Public.pdf

(pdf | 3.55 MB)

License info not available