Rank-Based Chain-Mode Ensemble for Binary Classification
Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu
Abstract:
In machine learning, ensembles are commonly used to improve on the performance of multiple base classifiers. However, true predictions are often canceled out by false ones during consensus because of a phenomenon called the “curse of correlation,” in which the predictions produced by the base classifiers interfere strongly with one another. In addition, existing practices still cannot effectively mitigate the problem of imbalanced classification. Based on an analysis of our experimental results, we conclude that both problems are caused by inherent deficiencies in the consensus approach. We therefore propose an enhanced ensemble algorithm that adopts a purpose-designed rank-based chain-mode consensus to overcome them. To evaluate the proposed algorithm, we use the well-known benchmark data set NSL-KDD (the improved version of the KDDCup99 data set produced by the University of New Brunswick) to compare it with 8 common ensemble algorithms. Each compared ensemble classifier uses the same 22 base classifiers, so that differences in the improvement in accuracy and reliability over the base classifiers can be fairly compared. The results show that the proposed rank-based chain-mode consensus is a more effective ensemble solution than the traditional consensus approach: it outperforms the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve.
Keywords: Consensus, curse of correlation, imbalanced classification, rank-based chain-mode ensemble.
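To make the “curse of correlation” concrete, the following minimal sketch shows how a traditional majority-vote consensus can cancel a true prediction when several base classifiers share the same error on an imbalanced data set. It does not implement the rank-based chain-mode consensus, whose details are not given in the abstract. The function names `majority_vote` and `binary_metrics`, the tie-breaking rule, and the toy labels are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Plain majority-vote consensus over binary (0/1) predictions.

    predictions[i][j] is the label base classifier i assigns to sample j.
    Ties go to the negative class (0); this is an arbitrary, assumed rule.
    """
    n_classifiers = len(predictions)
    n_samples = len(predictions[0])
    consensus = []
    for j in range(n_samples):
        positive_votes = sum(clf[j] for clf in predictions)
        consensus.append(1 if 2 * positive_votes > n_classifiers else 0)
    return consensus


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1-score for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy, made-up labels: 2 positives among 8 samples (imbalanced).
y_true = [1, 1, 0, 0, 0, 0, 0, 0]

# Three correlated classifiers make the same mistake on the second positive.
correlated = [1, 0, 0, 0, 0, 0, 0, 0]
# Two classifiers get every sample right.
accurate = [1, 1, 0, 0, 0, 0, 0, 0]

base_predictions = [correlated, correlated, correlated, accurate, accurate]
consensus = majority_vote(base_predictions)
metrics = binary_metrics(y_true, consensus)

print(consensus)
print(metrics)
```

In this toy case the two correct votes on the second positive sample are outvoted by the three correlated errors, so the consensus misses that sample and recall on the minority class drops to 0.5 even though two base classifiers were fully correct.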