Investigations into Informed Phase Restoration of Amplitude Spectra for Audio Signals

Abstract

Transform coding has been used extensively for audio and speech compression over the past decades. A widely used transform is the real-valued Modified Discrete Cosine Transform (MDCT), which is employed in several state-of-the-art codecs such as MPEG-1 Layer III (MP3) and AAC. An extension of this transform is the Modulated Complex Lapped Transform (MCLT). One advantage of the MCLT over the MDCT is that phase information is easily extracted from its coefficients. However, this comes at a cost: since the coefficients are complex, the critical sampling property of the MDCT no longer holds. To overcome this shortcoming, it is plausible to send only the magnitude of the complex MCLT spectrum plus certain side information to the decoder, and to estimate the phase spectrum there in order to reconstruct the time-domain signal. In this work, the properties of the MCLT phase spectrum are investigated for simple signal classes, namely sinusoids, chirps, and their harmonic extensions as a model for voiced speech. Then, algorithms for reconstructing the phase spectrum from the magnitude spectrum and side information are proposed. Furthermore, heuristics for decreasing the computational complexity of the derived algorithms are given. For the objective evaluation, the MCLT phase of artificial harmonic chirp signals is reconstructed. The measurements show high reconstruction accuracy. Additionally, a subjective listening test on synthetic chirp vowel signals with reconstructed phase indicates high perceptual quality.
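To make the trade-off between phase access and critical sampling concrete, the following minimal sketch computes the MCLT of a single frame. It assumes one common formulation of the transform (a sine window and a modulation phase offset of (M + 1)/2, so that the real part is an MDCT-type cosine sum and the imaginary part a matching sine sum); the conventions used in this work may differ. The function name `mclt_frame`, the frame size, and the test tone are illustrative choices, not taken from the work.

```python
import cmath
import math


def mclt_frame(frame):
    """Return M complex MCLT coefficients for one frame of 2M real samples.

    Assumed convention (not necessarily the one used in this work):
    sine analysis window and modulation phase offset (M + 1) / 2.
    """
    two_m = len(frame)
    if two_m == 0 or two_m % 2:
        raise ValueError("frame length must be a positive even number (2M)")
    m = two_m // 2
    scale = math.sqrt(2.0 / m)
    coeffs = []
    for k in range(m):
        acc = 0j
        for n in range(two_m):
            window = math.sin((n + 0.5) * math.pi / two_m)
            angle = (n + (m + 1) / 2.0) * (k + 0.5) * math.pi / m
            # exp(-j*angle) = cos(angle) - j*sin(angle):
            # real part -> MDCT-type term, imaginary part -> MDST-type term.
            acc += frame[n] * window * cmath.exp(-1j * angle)
        coeffs.append(scale * acc)
    return coeffs


# Illustrative input: a stationary sinusoid centred on bin 5 of an M = 16 transform.
M = 16
target_bin = 5
omega = (target_bin + 0.5) * math.pi / M
frame = [math.cos(omega * n + 0.3) for n in range(2 * M)]

coeffs = mclt_frame(frame)
magnitudes = [abs(c) for c in coeffs]      # what would be transmitted
phases = [cmath.phase(c) for c in coeffs]  # what the decoder would have to estimate
peak_bin = max(range(M), key=lambda k: magnitudes[k])

# 2M real input samples yield M complex values, i.e. 2M real numbers per frame.
# Since consecutive frames overlap by M samples, this is twice the data rate
# of the MDCT, which keeps only the M real parts.
real_values_per_frame = 2 * len(coeffs)
```

In this sketch, discarding `phases` and keeping only `magnitudes` brings the number of real values per frame back to M, which is the motivation for estimating the phase at the decoder.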