Efficient System for Speech Recognition using General Regression Neural Network
Commenced in January 2007
Frequency: Monthly
Edition: International

Authors: Abderrahmane Amrouche, Jean Michel Rouvaen

Abstract:

In this paper we present an efficient system for speaker-independent speech recognition based on a neural network approach. The proposed architecture comprises two phases: a preprocessing phase, which consists of segmental normalization and feature extraction, and a classification phase, which uses a neural network based on nonparametric density estimation, namely the general regression neural network (GRNN). The performance of the proposed model is compared with that of similar recognition systems based on the Multilayer Perceptron (MLP), the Recurrent Neural Network (RNN), and the well-known discrete Hidden Markov Model (HMM-VQ), all of which we also implemented. Experimental results obtained with Arabic digits show that nonparametric density estimation with an appropriate smoothing factor (spread) improves the generalization power of the neural network. The word error rate (WER) is reduced significantly relative to the baseline HMM method. The GRNN is thus a successful alternative to the other neural networks and to the discrete HMM (DHMM).
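
To make the role of the smoothing factor concrete, the following is a minimal sketch of GRNN classification in the standard kernel-regression form, not the authors' implementation. The function names (`grnn_predict`, `grnn_classify`), the two-dimensional toy patterns, and the spread value are assumptions chosen for illustration; a real system would use the speech features produced by the preprocessing phase.

```python
import math

def grnn_predict(x, train_x, train_y, spread):
    """GRNN output for input x: a Gaussian-weighted average of stored targets."""
    # Pattern layer: squared Euclidean distance to every stored training pattern.
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
    # Subtracting the smallest distance leaves the final ratio unchanged
    # but keeps the largest weight at 1.0, which avoids a zero denominator.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * spread ** 2)) for d in sq_dists]
    total = sum(weights)
    # Summation and output layers: normalized weighted sum of the targets.
    n_out = len(train_y[0])
    return [sum(w * y[k] for w, y in zip(weights, train_y)) / total
            for k in range(n_out)]

def grnn_classify(x, train_x, train_labels, n_classes, spread):
    """Classify x by feeding one-hot targets through the GRNN and taking the argmax."""
    one_hot = [[1.0 if label == c else 0.0 for c in range(n_classes)]
               for label in train_labels]
    scores = grnn_predict(x, train_x, one_hot, spread)
    return max(range(n_classes), key=lambda c: scores[c])

# Hypothetical toy patterns standing in for speech feature vectors.
toy_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
toy_labels = [0, 0, 1, 1]
print(grnn_classify([0.2, 0.3], toy_x, toy_labels, 2, spread=0.5))
print(grnn_classify([4.8, 5.5], toy_x, toy_labels, 2, spread=0.5))
```

In this formulation a small spread makes the network behave like a nearest-neighbor classifier, while a large spread averages over many training patterns, so the spread is the parameter that trades fitting against generalization.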

Keywords: Speech Recognition, General Regression Neural Network, Hidden Markov Model, Recurrent Neural Network, Arabic Digits.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1327514
