Resolving Dependency Ambiguity of Subordinate Clauses using Support Vector Machines

Sang-Soo Kim; Seong-Bae Park; Sang-Jo Lee

Abstract:

This paper proposes a method for resolving dependency ambiguities of Korean subordinate clauses using Support Vector Machines (SVMs). Dependency analysis of clauses is widely regarded as one of the most difficult tasks in sentence parsing, especially for Korean. To address it, we assume that the dependency relation between Korean subordinate clauses can be treated as the dependency relation among the verb phrases, verbs, and endings in those clauses. Under this assumption, the problem becomes a binary classification task. To apply SVMs to it, we use two kinds of features: static and dynamic. Experimental results on the STEP2000 corpus show that our system achieves an accuracy of 73.5%.
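The binary-classification framing described in the abstract, where each candidate dependency between a subordinate clause and a later clause is either accepted or rejected, might look roughly like the sketch below. The abstract does not list the actual features, the SVM package, or the kernel, so everything here is an assumption made for illustration: the feature names (`ending`, `distance`, `cand_has_dependent`), the reading of static features as fixed properties of the clause pair and of dynamic features as the state of the partial parse, the four toy training pairs, and the small subgradient-trained linear SVM that stands in for a real SVM implementation.

```python
from collections import defaultdict


def featurize(static, dynamic):
    """Turn two dicts of categorical features into a list of indicator names."""
    feats = [f"static:{k}={v}" for k, v in sorted(static.items())]
    feats += [f"dynamic:{k}={v}" for k, v in sorted(dynamic.items())]
    return feats


def train_linear_svm(examples, epochs=20, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss."""
    w = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:  # label: +1 (depends on) or -1 (does not)
            margin = label * sum(w[f] for f in feats)
            for f in w:  # regularisation step: shrink every weight slightly
                w[f] *= 1.0 - lr * lam
            if margin < 1.0:  # hinge loss is active: move toward the label
                for f in feats:
                    w[f] += lr * label
    return dict(w)


def score(w, feats):
    """Signed decision value; a positive value means 'dependency holds'."""
    return sum(w.get(f, 0.0) for f in feats)


def pick_head(w, candidates):
    """Given (clause_id, feats) pairs, return the id with the highest score."""
    return max(candidates, key=lambda c: score(w, c[1]))[0]


# Hypothetical toy data: each item is one (subordinate clause, candidate head) pair.
TRAIN = [
    (featurize({"ending": "A", "distance": "near"}, {"cand_has_dependent": "no"}), +1),
    (featurize({"ending": "A", "distance": "far"}, {"cand_has_dependent": "yes"}), -1),
    (featurize({"ending": "B", "distance": "near"}, {"cand_has_dependent": "yes"}), +1),
    (featurize({"ending": "B", "distance": "far"}, {"cand_has_dependent": "no"}), -1),
]

model = train_linear_svm(TRAIN)

# Resolve one ambiguous clause by scoring its two candidate heads.
candidates = [
    ("clause_2", featurize({"ending": "A", "distance": "near"}, {"cand_has_dependent": "no"})),
    ("clause_3", featurize({"ending": "A", "distance": "far"}, {"cand_has_dependent": "yes"})),
]
print(pick_head(model, candidates))
```

In this toy setup the ambiguity is resolved by scoring every candidate pair and keeping the highest-scoring one; whether the paper selects heads this way or by another decoding scheme is not stated in the abstract.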

Keywords: Dependency analysis, subordinate clauses, binary classification, support vector machines.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1084642
