Estimating Intention To Speak Using Non-Verbal Vocal Behavior

Bachelor Thesis (2023)

Author(s)

J.A. van Marken (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Hayley Hung – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

A.W.F.A.M. Elnouty – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Machine Learning, Non-Verbal Communication, Intention Estimation

To reference this document use:

https://resolver.tudelft.nl/uuid:7580ec2f-165f-4901-92a6-b1a5e61f4b0e

Publication Year

2023

Language

English

Graduation Date

28-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This research investigates whether non-verbal vocal behavior can be used to estimate a person's intention to speak. To answer this question, intentions to speak are gathered from data recorded at a Dutch social networking event. They are divided into two categories: successful and unsuccessful intentions. Unsuccessful intentions are further divided into unsuccessful intentions to start speaking and unsuccessful intentions to continue speaking. Perceived unsuccessful intentions to speak are gathered by manually annotating a 10-minute segment of the networking event, while successful intentions to speak are extracted automatically using Voice Activity Detection. Non-verbal vocal features are extracted from the audio and used to train a machine learning model that predicts whether there is an intention to speak. The model is trained on successful intentions to speak and evaluated on both successful and unsuccessful intentions. The experimental results show that the model predicts intention to speak better than random guessing.
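The abstract does not spell out how successful intentions are derived from Voice Activity Detection. As a rough illustration only, the sketch below assumes a per-frame binary VAD signal, treats each speech onset that follows a short silence as a successful intention, and takes a fixed window just before the onset as the span from which features could be extracted. The function names, the `min_silence` and `window` parameters, and the toy signal are all hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def speech_onsets(vad: List[int], min_silence: int = 2) -> List[int]:
    """Return frame indices where speech starts after at least
    `min_silence` consecutive silent frames (hypothetical criterion)."""
    onsets = []
    # Assumption: the start of the recording counts as preceded by silence.
    silent_run = min_silence
    for i, active in enumerate(vad):
        if active:
            if silent_run >= min_silence:
                onsets.append(i)
            silent_run = 0
        else:
            silent_run += 1
    return onsets


def windows_before_onsets(onsets: List[int], window: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges of length `window` ending at each onset.
    Onsets too close to the start of the recording are skipped."""
    return [(o - window, o) for o in onsets if o - window >= 0]


# Toy VAD output: 1 = speech detected in the frame, 0 = silence.
vad = [0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1]
onsets = speech_onsets(vad, min_silence=2)
positive_windows = windows_before_onsets(onsets, window=3)
print(onsets)
print(positive_windows)
```

In this toy signal the one-frame pause at index 5 is shorter than `min_silence`, so the speech resuming at index 6 is not counted as a new onset. The actual segmentation rules and window lengths used in the thesis may differ.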

Files

Final_Paper_2.pdf

(pdf | 0.631 Mb)

License info not available