Search results | TU Delft Repositories

Comparing and Analyzing Different Speech Conversion Techniques for Transforming Dysarthric to Normal Speech

Liu, Jingxian (author)

Dysarthric speech, characterized by articulation problems and a slower speech rate, shows lower automatic speech recognition (ASR) performance compared to normal speech. To improve performance, researchers often try to enhance dysarthric speech to be more like normal speech before passing it through an ASR trained on normal speech. In this...

master thesis 2024

Voice Based Interfaces for Supermarket robots using Large Language Models

Nandkumar, Chandran (author)

This thesis presents the design and evaluation of a comprehensive system for developing voice-based interfaces to support users in supermarkets. These interfaces enable customers to convey their needs across both generic and specific queries. While current state-of-the-art systems like GPTs by OpenAI are easily accessible and adaptable,...

master thesis 2024

Effects on voice hearing distress and social functioning of unguided application of a smartphone app — A randomized controlled trial

Jongeneel, Alyssa (author), Delespaul, Philippe (author), Tromp, N. (author), Scheffers, Dorien (author), van der Vleugel, Berber (author), de Bont, Paul (author), Kikkert, Martijn (author), Croes, Carlos F. (author), van den Berg, David (author)

Background: Temstem is a smartphone app developed with and for clinical voice hearing individuals with the aim to reduce their voice hearing distress and improve social functioning. Methods: A randomized controlled trial with adult outpatients suffering from distressing and frequent auditory verbal hallucinations (AVH) was conducted....

journal article 2024

Designing the visual aspect of Lynk&Co's future in-car voice assistant: Designing Lynk&Co's in-car voice assistant for the European Market in 2025-2028

Leclaire, Xavier (author)

Voice Assistants (VAs) have gained traction in cars, promising safer, more convenient driving experiences (Braun et al., 2021). These Intelligent Voice Assistants (IVAs) offer hands-free control over navigation, entertainment, and climate, reducing distractions and enhancing safety. IVAs also provide context-aware interactions, improving...

master thesis 2023

The Impact of Vocal Communication and its Personalization on Intention to Use of Chatbots Using Behavioral Activation to Support Patients Experiencing Depression

Doan, Kevin (author)

The 21st century has seen a significant increase in the global prevalence of mental health problems, affecting almost a billion people. These conditions not only reduce the quality of life for individuals but also lead to stigmatization, discrimination, and social isolation. The COVID-19 pandemic has further exacerbated mental health issues,...

master thesis 2023

Exploring Data Augmentation in Bias Mitigation Against Non-Native-Accented Speech

Zhang, Y. (author), Herygers, Aaricia (author), Patel, T.B. (author), Yue, Z. (author), Scharenborg, O.E. (author)

Automatic speech recognition (ASR) should serve every speaker, not only the majority “standard” speakers of a language. In order to build inclusive ASR, mitigating the bias against speaker groups who speak in a “non-standard” or “diverse” way is crucial. We aim to mitigate the bias against non-native-accented Flemish in a Flemish ASR system....

conference paper 2023

Beyond data transactions: a framework for meaningfully informed data donation

Gomez Ortega, A. (author), Bourgeois, Jacky (author), Hutiri, Wiebke (author), Kortuem, G.W. (author)

As we navigate physical (e.g., supermarket) and digital (e.g., social media) systems, we generate personal data about our behavior. Researchers and designers increasingly rely on this data and appeal to several approaches to collect it. One of these is data donation, which encourages people to voluntarily transfer their (personal) data...

journal article 2023

What is Sensitive About (Sensitive) Data? Characterizing Sensitivity and Intimacy with Google Assistant Users

Gomez Ortega, A. (author), Bourgeois, Jacky (author), Kortuem, G.W. (author)

Digital technologies have increasingly integrated into people's lives, continuously capturing their behavior through potentially sensitive data. In the context of voice assistants, there is a misalignment between experts, regulators, and users on whether and what data is 'sensitive', partly due to how data is presented to users; as single...

conference paper 2023

Designing trustworthy Voice Assistants for healthcare: Theory and practice of Voice Assistants for the Outpatient Clinic Healthy Pregnancy

Hagens, Emma (author)

In the Netherlands, the healthcare sector is facing increasing staff shortages and the demand for adequately trained healthcare personnel is expected to increase in the coming years. Shortages in obstetric care mean that not all women and their partners receive the care they need before, during and after pregnancy. To counter these shortages,...

master thesis 2022

The effects on speech detection of low sample frequency audio data

Uno, Taichi (author)

The interactions between human and machines are now common in our daily life. The audio data of human communication is a rich source of information, but it is considered privacy-invasive for machines to listen to it. By reducing sampling frequency, it is possible to preserve privacy by making conversation unclear while still being possible to...

bachelor thesis 2022

Mitigating bias against non-native accents

Zhang, Yuanyuan (author)

Automatic Speech Recognition (ASR) systems have seen substantial improvements in the past decade; however, not for all speaker groups. Recent research shows that bias exists against different types of speech, including non-native accents, in state-of-the-art (SOTA) ASR systems. To attain inclusive speech recognition, i.e., ASR for everyone...

master thesis 2022

Few shot emotion recognition using intelligent voice assistants and wearables: Learning from few samples of speech and physiological signals

Kapadia, Mihir (author)

Emotion recognition is one of the most widely studied areas of affective computing. Attempts have been made to design emotion recognition systems for everyday settings. The ubiquitous nature of intelligent voice assistants (IVAs) in households makes them a great anchor for the introduction of emotion recognition technology to consumers. The existing...

master thesis 2022

Mitigating bias against non-native accents

Zhang, Y. (author), Zhang, Yixuan (author), Halpern, B.M. (author), Patel, T.B. (author), Scharenborg, O.E. (author)

Automatic speech recognition (ASR) systems have seen substantial improvements in the past decade; however, not for all speaker groups. Recent research shows that bias exists against different types of speech, including non-native accents, in state-of-the-art (SOTA) ASR systems. To attain inclusive speech recognition, i.e., ASR for everyone...

journal article 2022

Towards Identity Preserving Normal to Dysarthric Voice Conversion

Huang, Wen-Chin (author), Halpern, B.M. (author), Violeta, Lester Phillip (author), Scharenborg, O.E. (author), Toda, Tomoki (author)

We present a voice conversion framework that converts normal speech into dysarthric speech while preserving the speaker identity. Such a framework is essential for (1) clinical decision making processes and alleviation of patient stress, (2) data augmentation for dysarthric speech recognition. This is an especially challenging task since the...

conference paper 2022

The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition

Prananta, Luke (author), Halpern, B.M. (author), Feng, S. (author), Scharenborg, O.E. (author)

In this paper, we investigate several existing and a new state-of-the-art generative adversarial network-based (GAN) voice conversion method for enhancing dysarthric speech for improved dysarthric speech recognition. We compare key components of existing methods as part of a rigorous ablation study to find the most effective solution to...

journal article 2022

Design Guidelines for Inclusive Speaker Verification Evaluation Datasets

Hutiri, Wiebke (author), Gorce, Lauriane (author), Ding, Aaron Yi (author)

Speaker verification (SV) provides billions of voice-enabled devices with access control, and ensures the security of voice-driven technologies. As a type of biometrics, it is necessary that SV is unbiased, with consistent and reliable performance across speakers irrespective of their demographic, social and economic attributes. Current SV...

journal article 2022

Ethical Self-Disclosing Voice User Interfaces for Delivery of News

Rao, Shruti (author), Resendez, Valeria (author), El Ali, Abdallah (author), Cesar, Pablo (author)

Voice User Interfaces (VUIs) such as Alexa and Google Home that use human-like design cues are an increasingly popular means for accessing news. Self-disclosure in particular may be used to build relationships of trust with users who may reveal intimate details about themselves. This information can be (mis)used by algorithms to tailor and...

conference paper 2022

A Conversational User Interface for Instructional Maintenance Reports

Kernan Freire, S. (author), Niforatos, E. (author), Rusak, Z. (author), Aschenbrenner, D. (author), Bozzon, A. (author)

Maintaining a complex system, such as a modern production line, is a knowledge-intensive task. Many firms use maintenance reports as a decision support tool. However, reports are often poor quality and tedious to compile. A Conversational User Interface (CUI) could streamline the reporting process by validating the user's input, eliciting...

conference paper 2022

Towards Trustworthy Edge Intelligence: Insights from Voice-Activated Services

Hutiri, Wiebke (author), Ding, Aaron Yi (author)

In an age of surveillance capitalism, anchoring the design of emerging smart services in trustworthiness is urgent and important. Edge Intelligence, which brings together the fields of AI and Edge computing, is a key enabling technology for smart services. Trustworthy Edge Intelligence should thus be a priority research concern. However,...

conference paper 2022

Momentary effects of Temstem, an app for voice-hearing individuals: Results from naturalistic data from 1048 users

Jongeneel, Alyssa (author), Libedinsky, Ilan (author), Reinbergen, Anouk (author), Tromp, N. (author), Delespaul, Philippe (author), Riper, Heleen (author), van der Gaag, Mark (author), van den Berg, D.A. (author)

Background: Temstem is a mobile application developed in cooperation with voice-hearing persons to help them cope with distressing voices. After psychoeducation about voice hearing, Temstem offers two functions: Silencing is a mode designed to inhibit voice activity through the processing of incompatible language; the Challenging mode...

journal article 2022
