A New Model for Discovering XML Association Rules from XML Documents

Authors: R. AliMohammadzadeh, M. Rahgozar, A. Zarnani

Abstract:

The inherent flexibility of XML in both structure and semantics makes mining XML data a more complex and challenging task than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We use frequent subtree mining techniques directly in the discovery process and preserve the tree structure of the data in the final rules. The frequent subtrees, identified with a user-provided support threshold, are split into complementary subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.
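
The rule-forming step can be illustrated with a minimal sketch. It is not the model from the paper, whose details the abstract does not give. It assumes each XML document is abstracted as a set of root-to-leaf label paths and that a frequent pattern is a set of such paths sharing the document root. The names `docs`, `support_count`, and `split_into_rules`, the sample data, and the confidence threshold are all hypothetical.

```python
from itertools import combinations

# Assumption: each XML document is reduced to its set of root-to-leaf label paths.
docs = [
    {("order", "customer", "gold"), ("order", "item", "laptop"), ("order", "item", "bag")},
    {("order", "customer", "gold"), ("order", "item", "laptop")},
    {("order", "customer", "basic"), ("order", "item", "bag")},
    {("order", "customer", "gold"), ("order", "item", "laptop"), ("order", "item", "mouse")},
    {("order", "customer", "gold"), ("order", "item", "bag")},
]


def support_count(pattern, docs):
    """Number of documents that contain every path of the pattern."""
    return sum(1 for d in docs if pattern <= d)


def split_into_rules(pattern, docs, min_conf):
    """Split one frequent pattern into complementary parts.

    Returns (antecedent, consequent, confidence) triples, where
    confidence = support(pattern) / support(antecedent).
    """
    whole = support_count(pattern, docs)
    if whole == 0:
        return []  # pattern never occurs, so no rule can be formed
    rules = []
    paths = sorted(pattern)
    for size in range(1, len(paths)):
        for chosen in combinations(paths, size):
            antecedent = frozenset(chosen)
            consequent = pattern - antecedent  # the complementary part
            conf = whole / support_count(antecedent, docs)
            if conf >= min_conf:
                rules.append((antecedent, consequent, conf))
    return rules


# A pattern assumed to have been reported as frequent by a subtree miner.
pattern = frozenset({("order", "customer", "gold"), ("order", "item", "laptop")})
rules = split_into_rules(pattern, docs, min_conf=0.8)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), conf)
```

Because both sides of each rule keep their full label paths, the rule retains where in the tree each item occurs, which a flat itemset would lose. A real frequent subtree miner would supply the patterns and would handle embedded or unordered subtrees, which this path-set simplification does not.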

Keywords: XML, Data Mining, Association Rule Mining.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1075344

