Distance measures for speech recognition



This report is concerned with the application of aspects of statistical pattern classification to speech recognition. It presents an extension of linear discriminant analysis to the case where the classes are unknown. This extension provides solutions to the interrelated problems of the design of acoustic representations and spectral distance measures, and allows the efficient combination of heterogeneous sets of parameters. In particular, a representation called IMELDA based on the output of a filter-bank and its changes in time is introduced. Other approaches to distance measures are discussed. It is noted that these other methods lack the ability to make efficient combinations of heterogeneous parameters, and that they require empirical adjustments in order to give good results. Tests indicate that IMELDA provides markedly superior recognition performance compared to the alternatives.
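As a rough illustration of the kind of transform the abstract refers to, the sketch below applies classical linear discriminant analysis to filter-bank frames with appended time differences, then measures distance as a plain Euclidean norm in the transformed space. It is not the report's method: it assumes class labels are known, whereas the report extends the analysis to unknown classes, and it is not the IMELDA representation itself. The function names, the delta definition (`np.gradient` along time), and the random stand-in data are all assumptions made for the example.

```python
import numpy as np

def add_deltas(frames):
    """Append time differences to each frame (assumed delta definition)."""
    deltas = np.gradient(frames, axis=0)  # central differences along time
    return np.hstack([frames, deltas])

def lda_transform(X, labels, n_components):
    """Classical LDA with known labels: whiten the within-class scatter,
    then keep the directions of largest between-class scatter."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    evals_w, evecs_w = np.linalg.eigh(Sw)
    # Scale each eigenvector so that whiten.T @ Sw @ whiten is the identity.
    whiten = evecs_w / np.sqrt(np.maximum(evals_w, 1e-12))
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ Sb @ whiten)
    order = np.argsort(evals_b)[::-1][:n_components]
    return whiten @ evecs_b[:, order]

def distance(a, b, W):
    """Euclidean distance between two feature vectors after projection by W."""
    return float(np.linalg.norm((a - b) @ W))

# Toy data standing in for 8-channel filter-bank frames from 3 classes.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 100)
frames = rng.normal(size=(300, 8)) + 0.5 * labels[:, None]

features = add_deltas(frames)          # static + delta parameters, 16 per frame
W = lda_transform(features, labels, 2)
projected = features @ W
```

Because the transform whitens the within-class scatter, static and delta parameters with different scales end up on a common footing before distances are taken, which is one way to read the abstract's point about combining heterogeneous parameters.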


