Sounds Alike Name Matching for Myanmar Language

Yuzana; Khin Marlar Tun

Abstract:

Personal name matching is a core task in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce, and record linkage systems, and this has motivated extensive research on name matching. Traditional name matching methods are suited to English and other Latin-based languages. Asian languages that have no word boundaries, such as the Myanmar language, still need a sounds-alike matching system for Unicode-based applications. We therefore propose a matching algorithm that derives a sounds-alike (phonetic) pattern suited to Myanmar character spelling. Given the nature of Myanmar characters, we address word boundary fragmentation and character collation. We use a pattern conversion algorithm that builds a pattern for each word from its fragmented and collated form. We also create Myanmar sounds-alike phonetic groups to support the phonetic matching. The experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.
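
To make the fragment-then-convert idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a simplified syllable-break heuristic (break before a consonant unless it is stacked or acts as a final) and a small, hypothetical table of sounds-alike groups; the names `PHONETIC_GROUPS`, `fragment`, `to_pattern`, and `sounds_alike` are illustrative, and the collation (character reordering) step mentioned in the abstract is not modelled.

```python
import re

# Hypothetical sounds-alike groups (illustrative assumption, NOT the paper's
# table): each key is replaced by one representative character of its group.
PHONETIC_GROUPS = {
    "\u103C": "\u103B",  # medial ra    -> medial ya
    "\u101B": "\u101A",  # consonant ra -> consonant ya
    "\u1003": "\u1002",  # gha          -> ga
    "\u1013": "\u1012",  # dha          -> da
    "\u100F": "\u1014",  # nna          -> na
    "\u1020": "\u101C",  # lla          -> la
}

# Simplified heuristic: a consonant (U+1000..U+1021) starts a new syllable
# unless it is stacked (preceded by virama U+1039) or is a syllable final
# (followed by asat U+103A or virama U+1039).
SYLLABLE_START = re.compile(r"(?<!\u1039)([\u1000-\u1021])(?![\u103A\u1039])")


def fragment(name):
    """Split a Myanmar string into syllable-like fragments."""
    marked = SYLLABLE_START.sub(r"|\1", name)
    return [part for part in marked.split("|") if part]


def to_pattern(name):
    """Fragment a name, then map every character to its group representative."""
    fragments = fragment(name.replace(" ", ""))
    return ["".join(PHONETIC_GROUPS.get(ch, ch) for ch in frag) for frag in fragments]


def sounds_alike(a, b):
    """Two names match when their phonetic patterns are identical."""
    return to_pattern(a) == to_pattern(b)


if __name__ == "__main__":
    kyaw_with_medial_ya = "\u1000\u103B\u1031\u102C\u103A"
    kyaw_with_medial_ra = "\u1000\u103C\u1031\u102C\u103A"
    print(sounds_alike(kyaw_with_medial_ya, kyaw_with_medial_ra))  # True
```

Comparing whole pattern lists keeps the sketch strict: names match only when every fragment falls into the same groups. A production system would likely need a fuller group table and a tolerance for partial matches, neither of which is specified in the abstract.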

Keywords: natural language processing, name matching, phonetic matching

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1057345
