Analysis of Combined Use of NN and MFCC for Speech Recognition
Authors: Safdar Tanweer, Abdul Mobin, Afshar Alam
Abstract:
This paper presents the performance and analysis of a speech recognition system. The system recognizes the English words for the digits 0-9, spoken by 2 different speakers and recorded in a noise-free environment. For feature extraction, Mel frequency cepstral coefficients (MFCC) are used, which yield a set of feature vectors from the recorded speech samples. A feed-forward neural network trained with the back-propagation algorithm is used to enhance recognition performance. Other speech recognition techniques, such as hidden Markov models (HMM) and dynamic time warping (DTW), also exist. All experiments are carried out in Matlab.
Keywords: Speech Recognition, MFCC, Neural Network, classifier.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1099828
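Illustrative sketch: the abstract gives no implementation details, and the authors worked in Matlab. The Python/NumPy code below is only a rough sketch of the kind of pipeline the abstract describes: MFCC feature extraction followed by a feed-forward network trained with back-propagation. Every function name and parameter value here (8 kHz sampling rate, 256-sample frames, 20 mel filters, 13 coefficients, 16 tanh hidden units, softmax output, learning rate) is an assumption made for illustration and is not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=8000, frame_len=256, hop=128, n_filters=20, n_ceps=13):
    # Framing -> Hamming window -> power spectrum -> log mel energies -> DCT-II.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2 / frame_len
    log_mel = np.log(power @ mel_filterbank(n_filters, frame_len, sr).T + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_mel @ dct.T  # shape: (n_frames, n_ceps)

def train_mlp(X, y, n_hidden=16, n_classes=10, lr=0.5, epochs=500, seed=0):
    # One hidden layer (tanh), softmax output, full-batch gradient descent.
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        # Forward pass.
        H = np.tanh(X @ W1 + b1)
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        # Backward pass: gradient of mean cross-entropy, propagated layer by layer.
        dZ = (P - Y) / len(X)
        dH = (dZ @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ dZ)
        b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)

if __name__ == "__main__":
    # Synthetic tones stand in for recorded digits (hypothetical data only).
    sr = 8000
    t = np.arange(sr // 2) / sr
    X = np.array([mfcc(np.sin(2 * np.pi * (300 + 150 * d) * t), sr).mean(axis=0)
                  for d in range(10)])
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)  # normalize each coefficient
    y = np.arange(10)
    print(predict(train_mlp(X, y), X))
```

Averaging the MFCC frames into a single vector per utterance is one simple way to obtain a fixed-length network input; the paper may well use a different arrangement.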