\r\ndecision making across numerous different classes. Multi-class

\r\nclassification has several applications, such as face recognition, text

\r\nrecognition and medical diagnostics. The objective of this article is

\r\nto analyze an adapted method of Stacking in multi-class problems,

\r\nwhich combines ensembles within the ensemble itself. For this

\r\npurpose, a training scheme similar to Stacking was used, but with three

\r\nlevels, where the final decision-maker (level 2) is trained on the

\r\ncombined outputs of a pair of meta-classifiers (level 1) from the

\r\ntree-based and Bayesian families. These are in turn trained on the outputs

\r\nof pairs of base classifiers (level 0) of the same family. This strategy

\r\nseeks to promote diversity among the ensembles that feed the level 2

\r\nmeta-classifier. Three performance measures were used: (1) accuracy, (2)

\r\narea under the ROC curve, and (3) time, across three factors: (a)

\r\ndatasets, (b) experiments and (c) levels. To compare the factors, a

\r\nthree-way ANOVA was run for each performance measure,

\r\nconsidering 5 datasets by 25 experiments by 3 levels. A three-way

\r\ninteraction among the factors was observed only for time. The accuracy

\r\nand area under the ROC curve presented similar results, showing

\r\na two-way interaction between level and experiment, as well as an effect of

\r\nthe dataset factor. It was concluded that level 2 had a higher average

\r\nperformance than the other levels and that the proposed method

\r\nis especially efficient for multi-class problems when compared to

\r\nbinary problems.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 135, 2018"}