A New Vector Quantization Front-End Process for Discrete HMM Speech Recognition System

M. Debyeche; J.P Haton; A. Houacine

Abstract:

The paper presents a complete discrete statistical framework based on a novel vector quantization (VQ) front-end process. This new VQ approach distributes the VQ codebook components optimally over the HMM states. This technique, which we call distributed vector quantization (DVQ) of hidden Markov models, unifies acoustic micro-structure and phonetic macro-structure during the estimation of the HMM parameters. The DVQ technique is implemented in two variants. The first uses the K-means algorithm (K-means-DVQ) to optimize the VQ, while the second exploits the classification behavior of neural networks (NN-DVQ) for the same purpose. The proposed variants are compared with an HMM-based baseline system in recognition experiments on specific Arabic consonants. The results show that the distributed vector quantization technique increases the performance of the discrete HMM system.
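
For orientation, the sketch below shows a conventional K-means VQ front end of the kind a discrete HMM consumes: a codebook is trained on feature vectors, and each vector is then replaced by the index of its nearest codeword. It is not the DVQ procedure itself, whose details the abstract does not give. The function names `train_codebook` and `quantize`, the synthetic two-cluster data, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    # Squared Euclidean distance between every vector and every codeword.
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def train_codebook(features, k, iters=20, seed=0):
    """Plain K-means (Lloyd iterations); an illustrative baseline, not DVQ."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from k distinct training vectors.
    start = rng.choice(len(features), size=k, replace=False)
    codebook = features[start].astype(float)
    for _ in range(iters):
        labels = quantize(features, codebook)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old codeword if its cell is empty
                codebook[j] = members.mean(axis=0)
    return codebook

# Hypothetical data: two well-separated clusters standing in for acoustic frames.
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                   rng.normal(5.0, 0.1, (50, 2))])
codebook = train_codebook(feats, k=2)
symbols = quantize(feats, codebook)  # discrete observation sequence for an HMM
```

In a discrete HMM system, the `symbols` array would serve as the observation sequence, with each state holding a probability table over codeword indices.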

Keywords: Hidden Markov Model, Vector Quantization, Neural Network, Speech Recognition, Arabic Language

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1057809
