Repository hosted by TU Delft Library


The impact of tone language and non-native language listening on measuring speech quality

Author: Ebem, D.U. · Beerends, J.G. · Vugt, J. van · Schmidmer, C. · Kooij, R.E. · Uguru, J.O.
Type: article
Date: 2011
Source: AES: Journal of the Audio Engineering Society, 9, 59, 647-655
Identifier: 443021
Keywords: Acoustics and Audiology · American English · Background noise · Cultural backgrounds · Cultural context · Cultural environment · High quality · Impact of noise · Non-native language · Objective speech quality · Signal characteristic · Signal level · Signal to noise · Significant impacts · Speech quality · Speech signals · Tone languages · Voice quality · Behavioral research · Communication systems · Signal to noise ratio · Speech communication · Communication & Information · PNS - Performance of Networks & Services · TS - Technical Sciences

Abstract

This study investigates the extent to which the modeling used in objective speech quality algorithms depends on the cultural background of listeners and on language characteristics, using American English and Igbo, an African tone language. Two different approaches were used in order to separate behavioral aspects from speech signal aspects. In the first approach, degraded American English sentences were presented to Igbo listeners and American listeners, showing that Igbo subjects are more disturbed by additive noise, relative to other degradations, than American subjects. In the second approach, objective modeling using ITU-T P.863 (POLQA) showed that Igbo subjects listening to degraded Igbo speech are more disturbed by background noise and low-level listening than predicted by the P.863 standard, which was trained on Western languages using native listeners. The most likely conclusion is that low-level signal parts of the Igbo tone language are relatively more important than low-level signal parts of American English. In judging the quality of their own language, Igbo listeners thus need a higher signal level and a higher signal-to-noise ratio to perceive high quality than American subjects require in judging their own language. When Igbo subjects judge the quality of American speech samples, the impact of noise is overestimated, but low-level listening does not have a significant impact on the perceived speech quality. The results show that one cannot build a universal objective speech quality measurement system; adaptation toward the behavior of a set of subjects is necessary. Further investigation into the impact of tone language signal characteristics and the behavior of subjects who are raised in a specific cultural environment is necessary before a new speech quality measure for assessing voice quality in that environment can be developed. The results also suggest that speech communication systems have to be optimized depending on the cultural context in which the system is used and/or the languages for which it is intended.