Performance Optimization of Data Mining Application Using Radial Basis Function Classifier

Authors: M. Govindarajan, R. M. Chandrasekaran

Abstract:

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before the data are examined. This paper describes a proposed radial basis function classifier and compares it with an existing radial basis function classifier by means of comparative cross-validation. The feasibility and benefits of the proposed approach are demonstrated on a data mining problem: direct marketing, which has become an important application field of data mining. Comparative cross-validation estimates accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. Although the proposed method may have high bias, its performance (accuracy estimation in our case) may be poor due to high variance. Thus the accuracy of the proposed radial basis function classifier was lower than that of the existing radial basis function classifier. However, there is a smaller improvement in runtime and a larger improvement in precision and recall. In the proposed method, both classification accuracy and prediction accuracy are determined, and the prediction accuracy is comparatively high.
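The accuracy estimation described in the abstract can be illustrated with a minimal Python sketch of stratified k-fold cross-validation, together with precision and recall. This is not the authors' implementation: the function names (`stratified_k_fold`, `cross_validate`, `fit_rbf`, `predict_rbf`, `precision_recall`), the Gaussian width `sigma`, and the simple kernel classifier that places one radial basis unit on every training point are all assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs, spreading each class evenly over k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[j % k].append(i)  # deal class members round-robin into folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def fit_rbf(X, y):
    """Stand-in 'training': keep every training point as a kernel centre (assumption)."""
    return list(zip(X, y))


def predict_rbf(model, x, sigma=1.0):
    """Return the class whose Gaussian kernel responses sum to the largest value."""
    scores = defaultdict(float)
    for centre, label in model:
        d2 = sum((a - b) ** 2 for a, b in zip(x, centre))
        scores[label] += math.exp(-d2 / (2.0 * sigma ** 2))
    return max(scores, key=scores.get)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cross_validate(X, y, k=5, seed=0):
    """Mean test accuracy over the k stratified folds (folds must be non-empty)."""
    accuracies = []
    for train, test in stratified_k_fold(y, k, seed):
        model = fit_rbf([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict_rbf(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / len(accuracies)
```

Calling `cross_validate(X, y, k=5)` on a list of feature tuples `X` and class labels `y` returns the mean fold accuracy. Repeated random subsampling, the alternative named in the abstract, would instead draw a fresh random train/test split on each repetition rather than partitioning the data once.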

Keywords: Text Data Mining, Comparative Cross-validation, Radial Basis Function, runtime, accuracy.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1072126

