TY - JFULL
AU - Mona Soliman Habib
PY - 2008/2/
TI - Addressing Scalability Issues of Named Entity Recognition Using Multi-Class Support Vector Machines
T2 - International Journal of Computer and Information Engineering
SP - 16
EP - 26
VL - 2
SN - 1307-6892
UR - https://publications.waset.org/pdf/12114
PU - World Academy of Science, Engineering and Technology
NX - Open Science Index 13, 2008
N2 - This paper explores the scalability issues associated with solving the Named Entity Recognition (NER) problem using Support Vector Machines (SVM) and high-dimensional features. The performance results of a set of experiments conducted using binary and multi-class SVM with increasing training data sizes are examined. The NER domain chosen for these experiments is the biomedical publications domain, especially selected due to its importance and inherent challenges. A simple machine learning approach is used that eliminates prior language knowledge such as part-of-speech or noun phrase tagging thereby allowing for its applicability across languages. No domain-specific knowledge is included. The accuracy measures achieved are comparable to those obtained using more complex approaches, which constitutes a motivation to investigate ways to improve the scalability of multiclass SVM in order to make the solution more practical and useable. Improving training time of multi-class SVM would make support vector machines a more viable and practical machine learning solution for real-world problems with large datasets. An initial prototype results in great improvement of the training time at the expense of memory requirements.
ER -