Addressing Scalability Issues of Named Entity Recognition Using Multi-Class Support Vector Machines

Authors: Mona Soliman Habib

Abstract:

This paper explores the scalability issues associated with solving the Named Entity Recognition (NER) problem using Support Vector Machines (SVM) and high-dimensional features. It examines the performance results of a set of experiments conducted using binary and multi-class SVM with increasing training data sizes. The NER domain chosen for these experiments is biomedical publications, selected for its importance and inherent challenges. A simple machine learning approach is used that eliminates prior language knowledge such as part-of-speech or noun phrase tagging, thereby allowing it to be applied across languages. No domain-specific knowledge is included. The accuracy measures achieved are comparable to those obtained using more complex approaches, which motivates investigating ways to improve the scalability of multi-class SVM in order to make the solution more practical and usable. Improving the training time of multi-class SVM would make support vector machines a more viable machine learning solution for real-world problems with large datasets. An initial prototype greatly improves training time at the expense of memory requirements.
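
As a rough illustration of what a language-independent, high-dimensional token representation might look like, the sketch below builds sparse binary features from surface forms only, without part-of-speech or noun phrase tags. The feature set, the function names `token_features` and `one_vs_rest_labels`, and the one-vs-rest relabelling are assumptions made for illustration; the abstract does not specify the features or the multi-class decomposition actually used.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, float]:
    """Sparse binary features for tokens[i] built from surface forms only.

    Hypothetical feature set: no part-of-speech tags, no noun phrase chunks,
    no dictionaries or other domain-specific resources.
    """
    feats: Dict[str, float] = {}

    # Lower-cased words in a symmetric context window around the token.
    for offset in range(-window, window + 1):
        j = i + offset
        word = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
        feats[f"w[{offset}]={word.lower()}"] = 1.0

    # Simple orthographic cues that need no language knowledge.
    tok = tokens[i]
    feats[f"prefix3={tok[:3].lower()}"] = 1.0
    feats[f"suffix3={tok[-3:].lower()}"] = 1.0
    if tok[:1].isupper():
        feats["init_cap"] = 1.0
    if any(c.isdigit() for c in tok):
        feats["has_digit"] = 1.0
    if "-" in tok:
        feats["has_hyphen"] = 1.0
    return feats


def one_vs_rest_labels(labels: List[str], positive: str) -> List[int]:
    """Relabel a multi-class tag sequence as +1/-1 for one entity class.

    One common way to reduce a multi-class problem to binary SVMs;
    assumed here, not confirmed by the abstract.
    """
    return [1 if y == positive else -1 for y in labels]
```

Because every distinct word, prefix, and suffix becomes its own dimension, the feature space grows with the vocabulary of the training data, which is one reason training cost becomes a concern as the dataset grows.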

Keywords: Named entity recognition, support vector machines, language independence, bioinformatics.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1078583

