Meeting audio data summarization and visualization using ASR and NLP tools within the context of captured meeting data of the Shape Language

Abstract

Meetings are a vital part of discussions and negotiations. Unfortunately, participants often leave with only a vague understanding of the topics covered and tend to forget even more of what transpired as time goes on. Motivated by previous research that addresses this issue by using architectural shapes to remove ambiguity, and by recent advances in Automatic Speech Recognition (ASR) and Natural Language Processing (NLP), this research aims to improve user understanding of the key topics discussed in meetings by combining ASR models with NLP tools to produce a visual summary. To achieve this, the research uses the speech-to-text transcription and speaker identification capabilities of the WhisperX model, the noun phrase extraction features provided by spaCy, and the key topic recognition functionality of Microsoft's DeBERTa model. Finally, the data is presented as a node-based graph built with the D3.js library. The results show that the system identifies between 33% and 58% of meeting key topics. This demonstrates the potential of combining ASR models with NLP tools to create concise meeting summaries, but it also raises new questions, such as why some topics were missed, how system performance can be improved, and how to design an optimal user interface for such a task.
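
To make the described pipeline concrete, the minimal sketch below chains the same kinds of tools: WhisperX for transcription, spaCy for noun phrase candidates, a DeBERTa MNLI checkpoint for zero-shot topic scoring, and a JSON export that a D3.js node-based graph could consume. The model names, threshold, and file names are illustrative assumptions rather than the exact configuration used in the paper, and speaker diarization is omitted for brevity.

```python
# Hedged sketch of the described pipeline: WhisperX transcription, spaCy noun-phrase
# extraction, and zero-shot topic scoring with a DeBERTa MNLI checkpoint.
import json

import spacy
import whisperx
from transformers import pipeline

AUDIO_FILE = "meeting.wav"   # hypothetical input recording
DEVICE = "cpu"               # or "cuda" if a GPU is available

# 1. Speech-to-text with WhisperX (speaker diarization omitted here).
asr_model = whisperx.load_model("large-v2", DEVICE, compute_type="int8")
audio = whisperx.load_audio(AUDIO_FILE)
transcript = " ".join(seg["text"] for seg in asr_model.transcribe(audio)["segments"])

# 2. Candidate key-topic phrases via spaCy noun chunks.
nlp = spacy.load("en_core_web_sm")
candidates = list({chunk.text.lower().strip() for chunk in nlp(transcript).noun_chunks})

# 3. Score candidates against the transcript with a zero-shot DeBERTa classifier.
#    "microsoft/deberta-large-mnli" is one publicly available checkpoint; the paper
#    may use a different DeBERTa variant.
classifier = pipeline("zero-shot-classification", model="microsoft/deberta-large-mnli")
scores = classifier(transcript, candidate_labels=candidates[:50], multi_label=True)

# 4. Emit a node/link structure that a D3.js force-directed graph can render.
key_topics = [l for l, s in zip(scores["labels"], scores["scores"]) if s > 0.5]
graph = {
    "nodes": [{"id": "meeting"}] + [{"id": t} for t in key_topics],
    "links": [{"source": "meeting", "target": t} for t in key_topics],
}
with open("meeting_graph.json", "w") as f:
    json.dump(graph, f, indent=2)
```

The resulting meeting_graph.json is a generic nodes-and-links document, so any D3.js force-directed layout can load it directly to draw the visual summary.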

Files

E_Milinovic_Final_Paper.pdf
(PDF | 8.94 MB)
License info not available