Indonesian News Classification using Support Vector Machine

Authors: Dewi Y. Liliana, Agung Hardianto, M. Ridok

Abstract:

Digital news covering a wide variety of topics is abundant on the internet. The challenge is to assign each news article to its appropriate category so that users can find relevant news quickly. A classifier engine sorts each article automatically into its category. This research employs a Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method for binary classification; its core step is constructing an optimal separating hyperplane between the two classes. For the multiclass problem, a one-against-one scheme combines the results of the binary classifiers. Documents were taken from the Indonesian digital news site www.kompas.com. The experiment showed a promising result, with an accuracy rate of 85%. The system is therefore feasible for Indonesian news classification.
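
The one-against-one scheme mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the training algorithm, feature weighting, or preprocessing, so the sketch assumes raw bag-of-words counts, a linear SVM without a bias term, and plain subgradient descent on the regularized hinge loss as a stand-in trainer. The toy documents, the category names, and all function names are invented for illustration and are not taken from the kompas.com data.

```python
from collections import Counter
from itertools import combinations


def featurize(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def score(w, x):
    """Linear decision value w . x for a sparse document vector x."""
    return sum(w.get(term, 0.0) * count for term, count in x.items())


def train_binary_svm(samples, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM without bias, trained by subgradient descent on the
    L2-regularized hinge loss (an assumed stand-in trainer).
    samples: list of (Counter, label) with label in {+1, -1}."""
    w = {}
    for _ in range(epochs):
        for x, y in samples:
            violated = y * score(w, x) < 1   # inside the margin or misclassified
            for term in w:                   # regularization step: shrink weights
                w[term] *= 1.0 - lr * lam
            if violated:                     # hinge-loss subgradient step
                for term, count in x.items():
                    w[term] = w.get(term, 0.0) + lr * y * count
    return w


def train_one_vs_one(docs):
    """Train one binary SVM for every unordered pair of classes."""
    data = [(featurize(text), label) for text, label in docs]
    classes = sorted({label for _, label in data})
    models = {}
    for a, b in combinations(classes, 2):
        # Class a is the positive side, class b the negative side.
        pair = [(x, 1 if label == a else -1)
                for x, label in data if label in (a, b)]
        models[(a, b)] = train_binary_svm(pair)
    return models


def predict(models, text):
    """Each pairwise model casts one vote; the most-voted class wins."""
    x = featurize(text)
    votes = Counter()
    for (a, b), w in models.items():
        votes[a if score(w, x) > 0 else b] += 1
    return votes.most_common(1)[0][0]


# Invented toy corpus (hypothetical, not from the paper's dataset).
TOY_DOCS = [
    ("tim sepak bola menang pertandingan", "sports"),
    ("pemain bola cetak gol", "sports"),
    ("harga saham naik bursa", "economy"),
    ("bank turunkan bunga kredit", "economy"),
    ("ponsel baru rilis aplikasi", "technology"),
    ("perusahaan luncurkan aplikasi internet", "technology"),
]

models = train_one_vs_one(TOY_DOCS)
print(predict(models, "pemain bola menang"))  # -> sports
```

With k categories the scheme trains k(k-1)/2 binary classifiers (three for the three toy categories here), and each one sees only the documents of its two classes. A real system would add the text-processing steps the keywords allude to, such as stop-word removal and stemming, before building the feature vectors.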

Keywords: classification, Indonesian news, text processing, support vector machine

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1074439

