Comparison among Various Question Generations for Decision Tree Based State Tying in Persian Language

Authors: Nasibeh Nasiri, Dawood Talebi Khanmiri

Abstract:

The performance of any continuous speech recognition system depends heavily on the performance of its acoustic models. In general, developing robust spoken language technology relies on the availability of large amounts of data. A common way to cope with having too little data to train each state of the hidden Markov models (HMMs) is tree-based state tying, which applies contextual questions to decide which states to tie. Generating these questions manually is time-consuming and prone to human error, so various automatically generated questions have been used to construct the decision tree. There are three approaches to generating questions for decision-tree-based HMM construction: one is based on misrecognized phonemes, another uses a feature table, and the third is based on the state distributions of context-independent subword units. In this paper, all three methods of automatic question generation are applied to decision tree construction on the FARSDAT corpus of Persian, and their results are compared with those of manually generated questions. The results show that automatically generated questions yield much better results and can replace manually generated questions for Persian.
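
To make the role of contextual questions concrete, the following is a minimal sketch of how a single decision-tree node might choose among candidate questions during state tying. It is not taken from the paper: it assumes each triphone state is summarized by a one-dimensional Gaussian (mean, variance, occupancy count), and the phone labels, question names, and statistics are hypothetical toy values. The question whose yes/no split gives the largest gain in approximate log likelihood is selected.

```python
import math

def pooled_loglik(states):
    """Approximate log likelihood of tying all given states to one shared 1-D Gaussian."""
    n = sum(s["occ"] for s in states)
    mean = sum(s["occ"] * s["mean"] for s in states) / n
    # Pooled variance from occupancy-weighted second moments: E[x^2] - mean^2.
    second = sum(s["occ"] * (s["var"] + s["mean"] ** 2) for s in states) / n
    var = second - mean ** 2
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def best_question(states, questions):
    """Return (question name, gain) for the split with the largest likelihood gain."""
    parent = pooled_loglik(states)
    best_name, best_gain = None, 0.0
    for name, phones in questions.items():
        yes = [s for s in states if s["left"] in phones]
        no = [s for s in states if s["left"] not in phones]
        if not yes or not no:
            continue  # the question does not actually split this node
        gain = pooled_loglik(yes) + pooled_loglik(no) - parent
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain

# Hypothetical toy statistics: one state per left-context phone.
states = [
    {"left": "m", "mean": 0.0, "var": 1.0, "occ": 100},
    {"left": "n", "mean": 0.2, "var": 1.0, "occ": 100},
    {"left": "s", "mean": 3.0, "var": 1.0, "occ": 100},
    {"left": "z", "mean": 3.1, "var": 1.0, "occ": 100},
]

# Hypothetical candidate questions about the left context.
questions = {
    "L_Nasal": {"m", "n"},
    "L_Voiced": {"m", "n", "z"},
}

name, gain = best_question(states, questions)
print(name, round(gain, 1))
```

In this sketch the set of candidate questions is simply an input, so manually written and automatically generated question sets can be swapped in without changing the splitting procedure.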

Keywords: Decision Tree, Markov Models, Speech Recognition, State Tying.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1078295

