Combining Bagging and Boosting
Authors: S. B. Kotsiantis, P. E. Pintelas
Abstract:
Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data; however, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we build an ensemble that combines, by voting, a bagging ensemble and a boosting ensemble of 10 sub-classifiers each. We compared the proposed technique with simple bagging and boosting ensembles of 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
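The sketch below illustrates the idea described in the abstract: a bagging ensemble and a boosting ensemble, each with 10 sub-classifiers, combined by voting and compared against plain bagging and boosting with 25 sub-classifiers. It is a minimal illustration using scikit-learn, assuming decision-tree base learners and a stand-in benchmark dataset; it is not the authors' exact experimental setup.

```python
# Minimal sketch (assumed setup, not the paper's exact experiment):
# vote over a 10-member bagging ensemble and a 10-member boosting ensemble,
# and compare with bagging/boosting ensembles of 25 sub-classifiers.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in benchmark dataset

bagging10 = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10, random_state=1)
boosting10 = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=10, random_state=1)

# Proposed combination: majority vote over the two 10-member sub-ensembles.
combined = VotingClassifier(
    estimators=[("bagging", bagging10), ("boosting", boosting10)],
    voting="hard",
)

models = {
    "bagging-25": BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=1),
    "boosting-25": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=25, random_state=1),
    "voted 10+10": combined,
}

# 10-fold cross-validated accuracy for each method.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```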
Keywords: data mining, machine learning, pattern recognition.
Digital Object Identifier (DOI): https://doi.org/10.5281/zenodo.1059761