Evaluation of phoneme recognition through TDNN-OPGRU on Mandarin speech

Abstract

This research extends earlier work that applied the TDNN-OPGRU network to Automatic Phoneme Recognition on Dutch speech by implementing and testing the network on Mandarin speech. The goal is to investigate the performance of the TDNN-OPGRU architecture when decoding phonemes in prepared and spontaneous Mandarin speech. The difference in Phoneme Error Rate (PER) between prepared and spontaneous speech is determined, and, because Mandarin is a tonal language, the effect of tones on the PER is investigated. The results show that a substantial share of the PER comes from substitutions in which only the tone is determined incorrectly. However, tone does not appear to affect the difference in PER between spontaneous and prepared speech, since it is responsible for a similar share of the substitutions in both types of speech. Including tone also increases the error rate of the TDNN-OPGRU architecture on base phonemes.
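
To make the two quantities in the abstract concrete, the following is a minimal sketch of how a PER and a count of tone-only substitutions could be computed from a reference and a hypothesis phoneme sequence. It is not the scoring pipeline used in this research. The label format (a base phoneme followed by a tone digit, such as `i3`) and the function names `base`, `align`, and `score` are assumptions made for illustration only.

```python
import re

def base(phone):
    """Strip a trailing tone digit, e.g. 'i3' -> 'i' (hypothetical label format)."""
    return re.sub(r"\d+$", "", phone)

def align(ref, hyp):
    """Levenshtein alignment; returns a list of (op, ref_phone, hyp_phone)."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Walk back from the bottom-right corner to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops

def score(ref, hyp):
    """PER = (substitutions + deletions + insertions) / len(ref).

    Assumes a non-empty reference sequence.
    """
    ops = align(ref, hyp)
    errors = [o for o in ops if o[0] != "ok"]
    subs = [o for o in ops if o[0] == "sub"]
    # A tone-only substitution keeps the base phoneme and changes only the tone.
    tone_only = [o for o in subs if base(o[1]) == base(o[2])]
    return {
        "per": len(errors) / len(ref),
        "substitutions": len(subs),
        "tone_only_substitutions": len(tone_only),
    }
```

Under these assumptions, `score(["n", "i3", "h", "ao3"], ["n", "i2", "h", "ao3"])` reports one substitution, which is tone-only, and a PER of 0.25. Dividing the tone-only count by the total number of substitutions, separately for prepared and for spontaneous speech, gives the kind of share that the abstract compares between the two speech types.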