Daniel Morariu and Lucian N. Vintan and Volker Tresp
Feature Selection Methods for an Improved SVM Classifier
470 - 476
2008
2
2
International Journal of Computer and Information Engineering
https://publications.waset.org/pdf/5384
https://publications.waset.org/vol/14
World Academy of Science, Engineering and Technology
Text categorization is the problem of classifying text
documents into a set of predefined classes. After a preprocessing
step, the documents are typically represented as large sparse vectors.
When training classifiers on large collections of documents, both the
time and memory requirements can be prohibitive. This justifies
the application of feature selection methods to reduce the
dimensionality of the document representation vector. In this paper,
three feature selection methods are evaluated: Random Selection,
Information Gain (IG) and Support Vector Machine feature selection
(called SVM_FS). We show that the best results were obtained with
the SVM_FS method for a relatively small dimension of the feature
vector. We also present a novel method to better correlate the SVM
kernel parameters (Polynomial or Gaussian kernel).
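
As an illustration of one of the three methods named in the abstract, the following is a minimal sketch of Information Gain feature selection. It is not the authors' implementation: it assumes each document is reduced to a set of terms (binary presence or absence), and the function names `entropy`, `information_gain` and `select_top_k`, as well as the toy documents and labels, are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term is present in the document'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Class entropy remaining after splitting the documents on the term.
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_top_k(docs, labels, k):
    """Keep the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy corpus: each document is a set of terms (assumption).
docs = [
    {"goal", "match", "team"},
    {"goal", "league", "team"},
    {"stock", "market", "team"},
    {"stock", "bank", "market"},
]
labels = ["sport", "sport", "finance", "finance"]

selected = select_top_k(docs, labels, 3)
print(selected)
```

In this toy corpus, terms that occur in only one class receive the maximum gain, while a term such as "team", which appears in both classes, ranks lower. The reduced vocabulary would then define the smaller feature vector passed to the classifier.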
Open Science Index 14, 2008