Inferring Hierarchical Pronunciation Rules from a Phonetic Dictionary

Erika Pigliapoco; Valerio Freschi; Alessandro Bogliolo

Abstract:

This work presents a new phonetic transcription system based on a tree of hierarchical pronunciation rules expressed as context-specific grapheme-phoneme correspondences. The tree is automatically inferred from a phonetic dictionary by incrementally analyzing deeper context levels. The result is a minimum set of exhaustive rules that pronounce all the words in the training dictionary without errors and that can also be applied to out-of-vocabulary words. The proposed approach improves upon existing rule-tree-based techniques in that it uses graphemes, rather than letters, as elementary orthographic units. A new linear algorithm for segmenting a word into graphemes is introduced to enable grapheme-based phonetic transcription of out-of-vocabulary words. Exhaustive rule trees provide a canonical representation of the pronunciation rules of a language that can be used not only to pronounce out-of-vocabulary words, but also to analyze and compare the pronunciation rules inferred from different dictionaries. The proposed approach has been implemented in C and tested on Oxford British English and Basic English. Experimental results show that grapheme-based rule trees represent phonetically sound rules and provide better performance than letter-based rule trees.
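
As a rough illustration of how a grapheme-based rule tree might be applied at transcription time, the Python sketch below segments a word into graphemes and then walks a per-grapheme tree, moving to a more specific rule for as long as the surrounding context matches. It is not the authors' C implementation or their segmentation algorithm: the greedy longest-match segmenter, the context order (nearest right neighbour, then nearest left, then the next pair), the `#` word-boundary marker, and the names `GRAPHEMES`, `RuleNode`, `segment`, `lookup`, `transcribe`, and `RULES` are all assumptions made for this example, and the toy rules are written by hand rather than inferred from a dictionary.

```python
from dataclasses import dataclass, field

# Hypothetical multi-letter grapheme inventory (assumption for this sketch).
GRAPHEMES = {"sh", "ch", "th", "ph"}


def segment(word, graphemes=GRAPHEMES):
    """Greedy longest-match split into graphemes; single letters are the fallback."""
    max_len = max((len(g) for g in graphemes), default=1)
    out, i = [], 0
    while i < len(word):
        # Try the longest candidate first; a single letter always succeeds.
        for n in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + n]
            if n == 1 or chunk in graphemes:
                out.append(chunk)
                i += n
                break
    return out


@dataclass
class RuleNode:
    phoneme: str                                   # default at this context depth
    children: dict = field(default_factory=dict)   # context grapheme -> refined rule


def context(graphemes, i, depth):
    """Yield neighbours of position i, nearest first, right before left."""
    for d in range(1, depth + 1):
        for j in (i + d, i - d):
            yield graphemes[j] if 0 <= j < len(graphemes) else "#"


def lookup(root, graphemes, i, depth=2):
    """Descend while the context matches; return the deepest matching rule."""
    node = root
    for c in context(graphemes, i, depth):
        child = node.children.get(c)
        if child is None:
            break
        node = child
    return node.phoneme


def transcribe(word, rules):
    """Segment the word, then pronounce each grapheme with its own rule tree."""
    gs = segment(word)
    return [lookup(rules[g], gs, i) for i, g in enumerate(gs)]


# Hand-written toy rules with ASCII phoneme labels (not inferred from data):
# "c" defaults to "k" but is refined to "s" when the next grapheme is "e" or "i".
RULES = {
    "c": RuleNode("k", {"e": RuleNode("s"), "i": RuleNode("s")}),
    "a": RuleNode("ae"), "e": RuleNode("e"), "i": RuleNode("I"),
    "n": RuleNode("n"), "p": RuleNode("p"), "t": RuleNode("t"),
    "sh": RuleNode("S"),
}

if __name__ == "__main__":
    for w in ("cat", "cent", "ship"):
        print(w, segment(w), transcribe(w, RULES))
```

In this sketch the default phoneme stored at each node acts as the fallback for contexts that the tree does not cover, which is how a word outside the toy rule set's examples still receives a pronunciation. Treating "sh" as a single unit means the tree for "s" never has to encode the following "h" as context, which loosely mirrors the motivation the abstract gives for using graphemes rather than letters.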

Keywords: Automatic phonetic transcription, pronunciation rules, hierarchical tree inference.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1059789

