Post Mining- Discovering Valid Rules from Different Sized Data Sources

R. Nedunchezhian; K. Anbumani

Abstract:

A large organization may have multiple branches spread across different locations. Processing the data from these branches becomes a huge task when very large numbers of transactions take place. In addition, branches may be reluctant to forward their data for centralized processing yet willing to pass on their association rules. Local mining may also generate a large number of rules. Further, it is not practical to expect all local data sources to be of the same size. A model is proposed for discovering valid rules from different sized data sources, where the valid rules are the high-weighted rules. These rules are obtained from the high-frequency rules generated at each of the data sources. A data source selection procedure is used to synthesize rules efficiently. Support Equalization, a second proposed method, eliminates low-frequency rules at the local sites themselves, significantly reducing the number of rules.
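
The abstract does not give the weighting formulas, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that each source is weighted in proportion to its number of transactions, that a rule's frequency is the number of sources reporting it, and that a rule is valid when both its frequency and its size-weighted support reach a threshold. The function name `synthesize_rules`, the thresholds, and the sample data are all hypothetical.

```python
from collections import defaultdict


def synthesize_rules(sources, min_frequency=2, min_weighted_support=0.1):
    """Combine local rules into globally valid ones (illustrative only).

    sources: list of (num_transactions, {rule: local_support}) pairs.
    Assumption: a source's weight is its share of all transactions.
    """
    total = sum(size for size, _ in sources)
    frequency = defaultdict(int)           # how many sources report each rule
    weighted_support = defaultdict(float)  # size-weighted sum of local supports

    for size, rules in sources:
        weight = size / total  # assumed weighting; larger sources count more
        for rule, support in rules.items():
            frequency[rule] += 1
            weighted_support[rule] += weight * support

    # Keep rules that are both frequent across sources and high weighted.
    return {
        rule: weighted_support[rule]
        for rule in frequency
        if frequency[rule] >= min_frequency
        and weighted_support[rule] >= min_weighted_support
    }


# Hypothetical branches of different sizes, each forwarding only its rules.
sources = [
    (1000, {"A->B": 0.40, "B->C": 0.30}),
    (4000, {"A->B": 0.20, "C->D": 0.50}),
    (5000, {"A->B": 0.30, "B->C": 0.10}),
]
valid = synthesize_rules(sources)
```

With this sample data only "A->B" survives: "C->D" is reported by a single source, and "B->C" falls below the weighted-support threshold once source sizes are taken into account. Filtering on frequency before any rules leave a site would be one way to approximate the local pruning the abstract attributes to Support Equalization, though the exact procedure is not described here.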

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1075651
