Meeting audio data summarization and visualization using ASR and NLP tools within the context of captured meeting data of the Shape Language

Abstract

Meetings are a vital part of discussions and negotiations. Unfortunately, participants often leave with only a vague understanding of the topics covered and tend to forget even more of what transpired as time goes on. Motivated by previous research that addresses this issue by using architectural shapes to remove ambiguity, and by recent advances in Automatic Speech Recognition (ASR) and Natural Language Processing (NLP), this research aims to improve user understanding of the key topics discussed in meetings by combining ASR models with NLP tools to produce a visual summary. To achieve this, the research uses the speech-to-text transcription and speaker identification capabilities of the WhisperX model, the noun phrase extraction features provided by spaCy, and the key topic recognition functionality of Microsoft's DeBERTa model. Finally, the data is presented as a node-based graph built with the D3.js library. The results show that the system identifies between 33% and 58% of meeting key topics. This demonstrates the potential of combining ASR models with NLP tools to create concise meeting summaries, but it also raises new questions, such as why some topics were missed, how system performance can be improved, and how to design an optimal user interface for such a task.
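
To make the described pipeline concrete, the minimal sketch below chains the same kinds of tools: WhisperX for transcription, spaCy for noun phrase candidates, a DeBERTa MNLI checkpoint for zero-shot topic scoring, and a JSON export that a D3.js node-based graph could consume. The model names, threshold, and file names are illustrative assumptions rather than the exact configuration used in the paper, and speaker diarization is omitted for brevity.

```python
# Hedged sketch of the described pipeline: WhisperX transcription, spaCy noun-phrase
# extraction, and zero-shot topic scoring with a DeBERTa MNLI checkpoint.
import json

import spacy
import whisperx
from transformers import pipeline

AUDIO_FILE = "meeting.wav"   # hypothetical input recording
DEVICE = "cpu"               # or "cuda" if a GPU is available

# 1. Speech-to-text with WhisperX (speaker diarization omitted here).
asr_model = whisperx.load_model("large-v2", DEVICE, compute_type="int8")
audio = whisperx.load_audio(AUDIO_FILE)
transcript = " ".join(seg["text"] for seg in asr_model.transcribe(audio)["segments"])

# 2. Candidate key-topic phrases via spaCy noun chunks.
nlp = spacy.load("en_core_web_sm")
candidates = list({chunk.text.lower().strip() for chunk in nlp(transcript).noun_chunks})

# 3. Score candidates against the transcript with a zero-shot DeBERTa classifier.
#    "microsoft/deberta-large-mnli" is one publicly available checkpoint; the paper
#    may use a different DeBERTa variant.
classifier = pipeline("zero-shot-classification", model="microsoft/deberta-large-mnli")
scores = classifier(transcript, candidate_labels=candidates[:50], multi_label=True)

# 4. Emit a node/link structure that a D3.js force-directed graph can render.
key_topics = [l for l, s in zip(scores["labels"], scores["scores"]) if s > 0.5]
graph = {
    "nodes": [{"id": "meeting"}] + [{"id": t} for t in key_topics],
    "links": [{"source": "meeting", "target": t} for t in key_topics],
}
with open("meeting_graph.json", "w") as f:
    json.dump(graph, f, indent=2)
```

The resulting meeting_graph.json is a generic nodes-and-links document, so any D3.js force-directed layout can load it directly to draw the visual summary.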

Files

E_Milinovic_Final_Paper.pdf
(PDF | 8.94 MB)
License info not available