Effective Features for Disambiguation of Turkish Verbs
Authors: Zeynep Orhan, Zeynep Altan
Abstract:
This paper summarizes the results of experiments to find effective features for the disambiguation of Turkish verbs. Word sense disambiguation is an active area of research in which verbs play a dominant role. On average, verbs have more senses than other word types, and identifying effective features for verbs may lead to improvements for other word types. In this paper we consider only the syntactic features that can be obtained from the corpus, and we test them using several well-known machine learning algorithms.
Keywords: Word sense disambiguation, feature selection.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1080310