A Data Perspective on Ethical Challenges in Voice Biometrics Research

Journal Article (2025)
Author(s)

Anna Leschanowsky (Fraunhofer Institute for Integrated Systems and Devices Technology IISB)

Casandra Rusti (University of Southern California)

Carolyn Quinlan (University of Toronto)

Michaela Pnacek (York University)

Lauriane Gorce (Mines Paris – PSL)

Wiebke (Toussaint) Toussaint (TU Delft - Information and Communication Technology)

Research Group
Information and Communication Technology
DOI related publication
https://doi.org/10.1109/TBIOM.2024.3446846
Publication Year
2025
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project (https://www.openaccess.nl/en/you-share-we-take-care). Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Issue number
1
Volume number
7
Pages (from-to)
118-131
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Speaker recognition technology, deployed in sectors like banking, education, recruitment, immigration, law enforcement, and healthcare, relies heavily on biometric data. However, the ethical implications and biases inherent in the datasets driving this technology have not been fully explored. Through a longitudinal study of close to 700 papers published at the ISCA Interspeech Conference in the years 2012 to 2021, we investigate how dataset use has evolved alongside the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field and examines their usage patterns. The analysis reveals significant shifts in data practices since the advent of deep learning: a small number of datasets dominate speaker recognition training and evaluation, and the majority of studies evaluate their systems on a single dataset. For four key datasets (Switchboard, Mixer, VoxCeleb, and ASVspoof), we conduct a detailed analysis of metadata and collection methods to assess ethical concerns and privacy risks. Our study highlights numerous challenges related to sampling bias, re-identification, consent, disclosure of sensitive information and security risks in speaker recognition datasets, and emphasizes the need for more representative, fair, and privacy-aware data collection in this domain.

Files

A_Data_Perspective_on_Ethical_... (pdf)
(pdf | 1.66 Mb)
- Embargo expired on 21-02-2025
License info not available