WASET
	%0 Journal Article
	%A Jan Pomikalek
	%A Radim Rehurek
	%D 2007
	%J International Journal of Industrial and Manufacturing Engineering
	%B World Academy of Science, Engineering and Technology
	%I Open Science Index 9, 2007
	%T The Influence of Preprocessing Parameters on Text Categorization
	%U https://publications.waset.org/pdf/6976
	%V 9
	%X Text categorization (the assignment of texts in natural language into predefined categories) is an important and extensively studied problem in Machine Learning. Currently, popular techniques developed to deal with this task include many preprocessing and learning algorithms, many of which in turn require tuning nontrivial internal parameters. Although partial studies are available, many authors fail to report values of the parameters they use in their experiments, or reasons why these values were used instead of others. The goal of this work then is to create a more thorough comparison of preprocessing parameters and their mutual influence, and report interesting observations and results.

	%P 504-507