Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits
Authors: Ali Ganoun
Abstract:
This paper evaluates sound parameterization methods for recognizing spoken Arabic words, namely the digits zero to nine. Each isolated spoken word is represented by a single template built from a specific recognition feature, and recognition is based on the Euclidean distance to those templates. The performance analysis covers four parameterization features: Burg spectrum analysis, Walsh spectrum analysis, Thomson multitaper spectrum analysis, and Mel Frequency Cepstral Coefficients (MFCC). The main aim of the paper is to compare, analyze, and discuss the outcomes of spoken Arabic digit recognition systems based on the selected features. The results obtained confirm that MFCC features are a very promising method for recognizing spoken Arabic digits.
Keywords: Speech Recognition, Spectrum Analysis, Burg Spectrum, Walsh Spectrum Analysis, Thomson Multitaper Spectrum, MFCC.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1061940
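The template-matching scheme described in the abstract (one template per digit, recognition by Euclidean distance) could be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function names build_templates and classify are hypothetical, the toy three-element vectors stand in for fixed-length feature vectors such as MFCC or spectrum estimates, and forming each template by averaging training vectors is an assumption, since the abstract does not say how templates are built.

```python
import numpy as np


def build_templates(features_by_digit):
    """Average each digit's training vectors into one template (assumed scheme)."""
    return {
        digit: np.mean(np.asarray(vectors, dtype=float), axis=0)
        for digit, vectors in features_by_digit.items()
    }


def classify(feature, templates):
    """Return the digit whose template is nearest in Euclidean distance."""
    feature = np.asarray(feature, dtype=float)
    return min(templates, key=lambda d: np.linalg.norm(feature - templates[d]))


# Toy feature vectors (hypothetical); real ones would come from a feature
# extractor such as MFCC or one of the spectrum estimators.
training = {
    0: [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    1: [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
    2: [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]],
}

templates = build_templates(training)
predicted = classify([0.1, 0.8, 0.1], templates)
print(predicted)  # 1
```

Under this sketch, comparing parameterization methods amounts to swapping the feature extractor while keeping the distance rule fixed, so that differences in accuracy can be attributed to the features alone.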