Applying Large-Scale Weakly Supervised Automatic Speech Recognition to Air Traffic Control

Master Thesis (2023)
Author(s)

J.L.P.M. van Doorn (TU Delft - Aerospace Engineering)

Contributor(s)

Junzi Sun – Mentor (TU Delft - Control & Simulation)

J.M. Hoekstra – Graduation committee member (TU Delft - Control & Simulation)

Patrick Jonk – Mentor (Royal Netherlands Aerospace Centre NLR)

Vincent de Vries – Graduation committee member (Royal Netherlands Aerospace Centre NLR)

Faculty
Aerospace Engineering
Copyright
© 2023 Jan Laurenszoon van Doorn
Publication Year
2023
Language
English
Graduation Date
11-12-2023
Awarding Institution
Delft University of Technology
Programme
Aerospace Engineering
Related content

Whisper Large V2 - ATCO2

https://www.doi.org/10.57967/hf/1376

Whisper Large V2 - ATCOSIM

https://www.doi.org/10.57967/hf/1374

Whisper Large V2 - ATCO2 - ATCOSIM

https://www.doi.org/10.57967/hf/1375
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The application of automatic speech recognition in the air traffic control domain has been researched extensively. However, its primary application remains the training and simulation of air traffic controllers. This is because automatic speech recognition still performs insufficiently in specialized environments such as air traffic control, where performance and safety requirements are stringent. This study demonstrates how a large-scale, weakly supervised automatic speech recognition model, Whisper, could meet these performance requirements and establish a new approach to air traffic control communication. Fine-tuning Whisper on air traffic control data resulted in a word error rate of 13.5% on the ATCO2 dataset and 1.17% on the ATCOSIM dataset. Furthermore, the study reveals that fine-tuning with region-specific data can improve performance by up to 60% in real-world scenarios.
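For readers unfamiliar with the metric, the word error rates quoted in the abstract are conventionally computed as the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the thesis's evaluation code: the function name `word_error_rate`, the lowercase-and-split normalization, and the example utterance are assumptions made for illustration, and the text normalization actually used in the thesis is not stated on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the first i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up ATC-style utterance (not taken from ATCO2 or ATCOSIM).
reference = "klm one two three descend flight level eight zero"
hypothesis = "klm one two tree descend flight level eight zero"
wer = word_error_rate(reference, hypothesis)
```

In this made-up example one of nine reference words is substituted, so `wer` is 1/9, or about 11%. A corpus-level figure such as those in the abstract is normally obtained by summing errors and reference words over all utterances rather than averaging per-utterance rates.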

Files

License info not available