Assamese Numeral Corpus for Speech Recognition using Cooperative ANN Architecture

Authors: Mousmita Sarma, Krishna Dutta, Kandarpa Kumar Sarma

Abstract:

A speech corpus is one of the major components of a speech processing system, where one of the primary requirements is to recognize an input sample. The quality and detail captured in the speech corpus directly affect the precision of recognition. The current work proposes a platform for speech corpus generation using an adaptive LMS filter and LPC cepstrum, as part of an ANN-based speech recognition system designed exclusively to recognize isolated numerals of the Assamese language, a major language of the northeastern part of India. The work focuses on designing an optimal feature extraction block and a few ANN-based cooperative architectures so that the performance of the speech recognition system can be improved.
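The abstract names an adaptive LMS filter and LPC cepstrum but gives no implementation details. The following is a minimal sketch of those two building blocks, assuming the textbook LMS weight update and the standard autocorrelation (Levinson-Durbin) LPC analysis followed by the LPC-to-cepstrum recursion. All function names, the filter length, the step size, and the LPC order are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def lms_filter(reference, desired, num_taps=4, step=0.05):
    """Textbook LMS: adapt FIR weights so that the filtered reference tracks desired."""
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(desired)):
        # Most recent num_taps reference samples, newest first, zero-padded at the start.
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        estimate = sum(w * x for w, x in zip(weights, window))
        error = desired[n] - estimate
        # Gradient-descent step on the instantaneous squared error.
        weights = [w + 2.0 * step * error * x for w, x in zip(weights, window)]
        errors.append(error)
    return weights, errors


def lpc_coefficients(frame, order):
    """Predictor coefficients a_1..a_p with s[n] ~ sum_k a_k * s[n-k] (Levinson-Durbin)."""
    r = [sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
         for lag in range(order + 1)]
    if r[0] == 0.0:  # silent frame: nothing to predict
        return [0.0] * order, 0.0
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


def lpc_to_cepstrum(a, num_ceps):
    """Cepstral coefficients c_1..c_num_ceps of the all-pole model 1 / (1 - sum_k a_k z^-k)."""
    p = len(a)
    c = [0.0] * (num_ceps + 1)  # c[0] (gain term) is not computed here
    for m in range(1, num_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]


if __name__ == "__main__":
    # Assumed toy setup: recover a known 2-tap system from noise-free data.
    random.seed(0)
    x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
    d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
    w, _ = lms_filter(x, d)
    print("LMS weights:", w)

    # Assumed toy frame: a decaying exponential, analysed with a first-order model.
    coeffs, _ = lpc_coefficients([0.5 ** n for n in range(50)], order=1)
    print("LPC cepstrum:", lpc_to_cepstrum(coeffs, num_ceps=3))
```

In a recognizer of the kind the abstract describes, the cepstral vector of each frame would typically serve as the input feature for the ANN. How the paper applies the LMS stage is not stated in the abstract, so the system-identification setup above is only one plausible use.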

Keywords: Filter, Feature, LMS, LPC, Cepstrum, ANN.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1328142

