Echo State Networks for Arabic Phoneme Recognition

Authors: Nadia Hmad, Tony Allen

Abstract:

This paper presents an echo state network (ESN) based Arabic phoneme recognition system trained with supervised, forced, and combined supervised/forced supervised learning algorithms. Mel-frequency cepstral coefficients (MFCCs) and linear predictive coding (LPC) are compared as input feature extraction techniques. The system is evaluated on 6 speakers from the King Abdulaziz Arabic Phonetics Database (KAPD), which covers the Saudi Arabian dialect, and on 34 speakers from the Center for Spoken Language Understanding (CSLU2002) database, whose speakers represent dialects from 12 Arabic countries. The phoneme recognition performance is 72.31% on KAPD and 38.20% on CSLU2002.
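
To illustrate the general idea of an ESN frame classifier, the following is a minimal sketch and not the authors' system. It assumes a fixed random reservoir of tanh units and a linear readout fitted by ridge regression, which is a common ESN training choice but is not necessarily the supervised or forced learning algorithm used in the paper. The input frames are synthetic 13-dimensional vectors standing in for MFCC or LPC features. All function names and parameter values (`make_reservoir`, `run_reservoir`, `train_readout`, `classify`, the reservoir size, the scaling factor) are hypothetical.

```python
import math
import random


def make_reservoir(n_in, n_res, scale=0.9, density=0.2, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_res)]
    w = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
          for _ in range(n_res)] for _ in range(n_res)]
    # Rescale so the largest absolute row sum equals `scale` (< 1). With tanh
    # units this makes the state update a contraction, a conservative
    # sufficient condition for the echo state property.
    norm = max(sum(abs(v) for v in row) for row in w) or 1.0
    return w_in, [[v * scale / norm for v in row] for row in w]


def run_reservoir(w_in, w, frames):
    """Return one state vector (plus a constant bias term) per input frame."""
    x = [0.0] * len(w)
    states = []
    for u in frames:
        x = [math.tanh(sum(a * b for a, b in zip(wi, u)) +
                       sum(a * b for a, b in zip(wr, x)))
             for wi, wr in zip(w_in, w)]
        states.append(x + [1.0])
    return states


def solve(a, b):
    """Solve a @ x = b (b is a matrix) by Gauss-Jordan with partial pivoting."""
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c and m[r][c] != 0.0:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


def train_readout(states, targets, ridge=1e-3):
    """Ridge regression: W_out = (S^T S + ridge * I)^-1 S^T Y."""
    d, k = len(states[0]), len(targets[0])
    sts = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    sty = [[sum(s[i] * t[c] for s, t in zip(states, targets))
            for c in range(k)] for i in range(d)]
    return solve(sts, sty)


def classify(states, w_out):
    """Pick, for every frame, the class with the largest readout score."""
    k = len(w_out[0])
    result = []
    for s in states:
        scores = [sum(si * row[c] for si, row in zip(s, w_out)) for c in range(k)]
        result.append(max(range(k), key=scores.__getitem__))
    return result


# Synthetic two-class data (assumption): the class changes every 10 frames.
rng = random.Random(1)
n_features = 13  # assumed feature vector size, not taken from the paper
labels = [(t // 10) % 2 for t in range(200)]
frames = [[(1.0 if y == 0 else -1.0) + rng.gauss(0.0, 0.3)
           for _ in range(n_features)] for y in labels]
targets = [[1.0, 0.0] if y == 0 else [0.0, 1.0] for y in labels]

w_in, w = make_reservoir(n_features, 20)
states = run_reservoir(w_in, w, frames)
w_out = train_readout(states, targets)
predicted = classify(states, w_out)
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print(f"Frame-level accuracy on the toy training data: {accuracy:.2%}")
```

Only the readout weights are fitted here; the input and reservoir weights stay at their random initial values, which is what distinguishes an ESN from a fully trained recurrent network. The accuracy printed by this sketch describes the toy data only and says nothing about the KAPD or CSLU2002 results reported above.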

Keywords: Arabic phoneme recognition, echo state networks (ESNs), neural networks (NNs), supervised learning.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1086995

