Review and Comparison of Associative Classification Data Mining Approaches
Authors: Suzan Wedyan
Abstract:
Associative classification (AC) is a data mining approach that combines association rule mining and classification to build classification models (classifiers). AC has attracted significant attention from researchers, mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed, such as Classification Based on Associations (CBA), Classification based on Multiple Association Rules (CMAR), Class based Associative Classification (CACA), and Classification based on Predictive Association Rules (CPAR). This paper surveys the major AC algorithms and compares the steps and methods performed in each, including rule learning, rule sorting, rule pruning, classifier building, and class prediction.
Keywords: Associative Classification, Classification, Data Mining, Learning, Rule Ranking, Rule Pruning, Prediction.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1336440
References:
[1] Witten, I., and Frank, E. (2000) Data mining: practical machine learning tools and techniques with Java implementations. San Francisco: Morgan Kaufmann.
[2] Fayyad, U., and Irani, K. (1993) Multi-interval discretization of continuous-valued attributes for classification learning. Proceedings of IJCAI, pp. 1022-1027.
[3] Agrawal, R., Imielinski, T., and Swami, A. (1993) Mining Association Rules between Sets of Items in Large Databases. Proceedings of the ACM SIGMOD Conference, pp. 207-216.
[4] Liu, B., Hsu, W., and Ma, Y. (1998) Integrating classification and association rule mining. Proceedings of the Knowledge Discovery and Data Mining Conference (KDD), pp. 80-86. New York, NY.
[5] Yin, X., and Han, J. (2003) CPAR: Classification based on predictive association rules. Proceedings of the SIAM International Conference on Data Mining (SDM), pp. 369-376.
[6] Veloso, A., Meira, W., Gonçalves, M., and Zaki, M. (2007) Multi-label Lazy Associative Classification. Proceedings of the Principles of Data Mining and Knowledge Discovery (PKDD), pp. 605-612.
[7] Li, X., Qin, D., and Yu, C. (2008) ACCF: Associative Classification Based on Closed Frequent Itemsets. Proceedings of the Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), pp. 380-384.
[8] Ye, Y., Jiang, Q., and Zhuang, W. (2008) Associative classification and post-processing techniques used for malware detection. Proceedings of the 2nd International Conference on Anti-counterfeiting, Security and Identification (ASID), pp. 276-279.
[9] Niu, Q., Xia, S., and Zhang, L. (2009) Association Classification Based on Compactness of Rules. Proceedings of the Second International Workshop on Knowledge Discovery and Data Mining (WKDD), pp. 245-247.
[10] Quinlan, J. (1993) C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.
[11] Jensen, D., and Cohen, P. (2000) Multiple comparisons in induction algorithms. Machine Learning, 38(3), pp. 309-338.
[12] Li, W., Han, J., and Pei, J. (2001) CMAR: Accurate and efficient classification based on multiple class-association rules. Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 369-376.
[13] Antonie, M., and Zaïane, O. (2002) Text document categorization by term association. Proceedings of the IEEE International Conference on Data Mining, pp. 19-26. Maebashi City, Japan.
[14] Xu, X., Han, G., and Min, H. (2004) A novel algorithm for associative classification of image blocks. Proceedings of the Fourth IEEE International Conference on Computer and Information Technology, pp. 46-51.
[15] Antonie, M., and Zaïane, O. (2004) An associative classifier based on positive and negative rules. Proceedings of the 9th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 64-69. Paris, France.
[16] Baralis, E., and Garza, P. (2002) A lazy approach to pruning classification rules. Proceedings of the 2002 IEEE ICDM'02, p. 35.
[17] Thabtah, F., Cowling, P., and Peng, Y. (2004) MMAC: A new multi-class, multi-label associative classification approach. Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM ’04), pp. 217-224. Brighton, UK.
[18] Thabtah, F., Cowling, P., and Peng, Y. (2005) MCAR: Multi-class classification based on association rule approach. Proceedings of the 3rd IEEE International Conference on Computer Systems and Applications, pp. 1-7. Cairo, Egypt.
[19] Tang, Z., and Liao, Q. (2007) A New Class Based Associative Classification Algorithm. IMECS 2007, pp. 685-689.
[20] Kundu, G., Islam, M., Munir, S., and Bari, M. (2008) ACN: An Associative Classifier with Negative Rules. 11th IEEE International Conference on Computational Science and Engineering, pp. 369-375.
[21] Wang, X., Yue, K., Niu, W., and Shi, Z. (2011) An approach for adaptive associative classification. Expert Systems with Applications: An International Journal, 38(9), pp. 11873-11883.
[22] Thabtah, F. (2006) Rule Preference Effect in Associative Classification Mining. Journal of Information and Knowledge Management, 5(1), pp. 13-20. WorldSciNet.
[23] Zaki, M., Parthasarathy, S., Ogihara, M., and Li, W. (1997) New algorithms for fast discovery of association rules. Proceedings of the 3rd KDD Conference, pp. 283-286.
[24] Zaki, M., and Gouda, K. (2003) Fast vertical mining using diffsets. Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 326-335.
[25] Han, J., Lin, T. Y., Li, J., and Cercone, N. (2007) Constructing Associative Classifiers from Decision Tables. Proceedings of the International Conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing (RSFDGrC), pp. 305-313.
[26] Øhrn, A. (2001) ROSETTA Technical Reference Manual. Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway.
[27] Pawlak, Z. (1991) Rough Sets: Theoretical Aspects of Reasoning about Data. Dordrecht: Kluwer Academic Publishers.
[28] Agrawal, R., and Srikant, R. (1994) Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pp. 487-499.
[29] Michalski, R. (1980) Pattern recognition as rule-guided inductive inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, pp. 349-361.
[30] Liu, B., Ma, Y., and Wong, C-K. (2001) Classification using association rules: weaknesses and enhancements. In Vipin Kumar et al. (eds) Data mining for scientific applications.
[31] Merz, C., and Murphy, P. (1996) UCI repository of machine learning databases. Irvine, CA, University of California, Department of Information and Computer Science.
[32] Kundu, G., Munir, S., Md. Islam, M., and Murase, K. (2007) A Novel Algorithm for Associative Classification. Proceedings of the International Conference on Neural Information Processing (ICONIP), pp. 453-459.
[33] Huang, Z., Zhou, Z., He, T., and Wang, X. (2011) ACAC: Associative Classification based on All-Confidence. IEEE International Conference on Granular Computing (GrC), pp. 289-293.
[34] Zaki, M., and Hsiao, C. J. (2002) CHARM: an efficient algorithm for closed itemset mining. Proceedings of the 2002 SIAM International Conference on Data Mining (SDM’02), pp. 457-473.
[35] Quinlan, J., and Cameron-Jones, R. (1993) FOIL: A midterm report. Proceedings of the European Conference on Machine Learning, pp. 3-20. Vienna, Austria.
[36] Han, J., Pei, J., and Yin, Y. (2000) Mining frequent patterns without candidate generation. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 1-12.
[37] Baralis, E., Chiusano, S., and Garza, P. (2004) On support thresholds in associative classification. Proceedings of the 2004 ACM Symposium on Applied Computing, pp. 553-558. Nicosia, Cyprus.
[38] Sun, K., and Bai, F. (2008) Mining Weighted Association Rules without Pre-assigned Weights. IEEE Transactions on Knowledge and Data Engineering, 20(4), pp. 489-495.
[39] Ibrahim, S., and Chandran, K. R. (2011) Compact Weighted Class Association Rule Mining using Information Gain. International Journal of Data Mining & Knowledge Management Process (IJDKP), 1(6).
[40] Thabtah, F., Mahmood, Q., McCluskey, L., and Abdel-jaber, H. (2010) A new Classification based on Association Algorithm. Journal of Information and Knowledge Management, 9(1), pp. 55-64. World Scientific.
[41] Abumansour, H., Hadi, W., McCluskey, L., and Thabtah, F. (2010) Associative Text Categorisation Rules Pruning Method. Proceedings of the Linguistic and Cognitive Approaches to Dialog Agents Symposium (LaCATODA-10), Rafal Rzepka (Ed.), at the AISB, pp. 39-44. UK.
[42] Quinlan, J. (1998) Data mining tools See5 and C5.0. Technical Report, RuleQuest Research.
[43] Antonie, M., Zaïane, O. R., and Coman, A. (2003) Associative Classifiers for Medical Images. Lecture Notes in Artificial Intelligence 2797, Mining Multimedia and Complex Data, pp. 68-83. Springer-Verlag.
[44] Jiang, Y., Liu, Y., Liu, X., and Yang, S. (2010) Integrating classification capability and reliability in associative classification: A β-stronger model. Expert Systems with Applications, 37(5), pp. 3953-3961.
[45] Thabtah, F., Hadi, W., Abdelhamid, N., and Issa, A. (2011) Prediction Phase in Associative Classification Mining. Journal of Knowledge Engineering and Software Engineering. World Science.
[46] Wedyan, S., and Wedyan, F. (2013) An Associative Classification Data Mining Approach for Detecting Phishing Websites. Journal of Emerging Trends in Computing and Information Sciences, 12(5):xx-xx.