A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)

Authors: Rekha Kandwal, Kamal K. Bharadwaj

Abstract:

Knowledge is indispensable, but voluminous knowledge becomes a bottleneck for efficient processing. A major challenge in data mining is the large number of potential rules generated by the mining process; in fact, the result is sometimes comparable in size to the original data. Traditional pruning measures such as support do not sufficiently reduce this huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, so the knowledge base grows with each change. The predominant representation of discovered knowledge is the standard Production Rule (PR) of the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs) as an extension of production rules that exhibits variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form If P Then D Unless C, where C (the censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. With a rule of this type we are free to ignore the exception condition when the resources needed to establish its presence are tight, or when there is simply no information available as to whether it holds. Thus the 'If P Then D' part of the CPR expresses the important information, while the 'Unless C' part acts only as a switch that changes the polarity of D to ~D. In this paper, a scheme based on the Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from previously discovered flat PRs. Discovering CPRs from flat rules results in a considerable reduction in the number of discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of the knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.
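
The rule form If P Then D Unless C described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DST-based scheme: the `CPR` class, the `evaluate` and `fold_exceptions` functions, the "~" prefix for negated decisions, and the bird/penguin rules are all hypothetical names and data chosen for illustration. The folding step assumes a purely syntactic match (an exception rule whose premise adds exactly one condition to a general rule), which is a simplification of how CPRs might be discovered from flat PRs.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class CPR:
    """Censored production rule: If premise Then decision Unless any censor."""
    premise: FrozenSet[str]
    decision: str
    censors: FrozenSet[str] = frozenset()


def evaluate(rule: CPR, facts: Dict[str, bool],
             check_censors: bool = True) -> Optional[str]:
    """Return the decision, its negation ("~D"), or None if the premise fails.

    A proposition missing from `facts` is treated as unknown. With
    check_censors=False the censors are ignored, e.g. when resources are tight.
    """
    if not all(facts.get(p) is True for p in rule.premise):
        return None
    if check_censors and any(facts.get(c) is True for c in rule.censors):
        return "~" + rule.decision  # the censor flips the polarity of D
    return rule.decision


def fold_exceptions(flat_rules: List[Tuple[FrozenSet[str], str]]) -> List[CPR]:
    """Fold flat rules 'If P and C Then ~D' into 'If P Then D Unless C'.

    Assumption: a rule is an exception when its premise extends a general
    rule's premise by exactly one condition and it concludes the negation.
    Negative rules that match no general rule are dropped in this sketch.
    """
    positives = [(p, d) for p, d in flat_rules if not d.startswith("~")]
    negatives = [(p, d[1:]) for p, d in flat_rules if d.startswith("~")]
    cprs = []
    for premise, decision in positives:
        censors = set()
        for neg_premise, neg_decision in negatives:
            extra = neg_premise - premise
            if (neg_decision == decision and premise < neg_premise
                    and len(extra) == 1):
                censors |= extra
        cprs.append(CPR(premise, decision, frozenset(censors)))
    return cprs


# Hypothetical flat rules: one general rule and two exceptions.
flat = [
    (frozenset({"bird"}), "flies"),
    (frozenset({"bird", "penguin"}), "~flies"),
    (frozenset({"bird", "ostrich"}), "~flies"),
]
rules = fold_exceptions(flat)
bird_rule = rules[0]
print(len(flat), "flat rules folded into", len(rules), "CPR")
print(evaluate(bird_rule, {"bird": True}))
print(evaluate(bird_rule, {"bird": True, "penguin": True}))
```

In this sketch an unknown censor leaves the decision D in place, and passing `check_censors=False` skips the censor test entirely, which mirrors the freedom to ignore exception conditions when establishing them is too costly.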

Keywords: Censored production rules, cumulative learning, data mining, machine learning.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1055922


References:


[1] Han, J., Kamber, M.: Data Mining: Concepts and Techniques, Academic Press, 2001.
[2] Bharadwaj, K.K., Jain, N.K.: Hierarchical Censored Production Rules (HCPRs) System, Data and Knowledge Engineering, vol. 8 (North Holland), 1992.
[3] Bharadwaj, K.K., Neerja, Goel, G.C.: Hierarchical Censored Production Rules (HCPRs) Systems Employing the Dempster-Shafer Uncertainty Calculus, Information and Software Technology, Butterworth-Heinemann Ltd. (U.K.), vol. 36, no., 155-164, 1994.
[4] Jain, N.K., Bharadwaj, K.K.: Some Learning Techniques in Hierarchical Censored Production Rules (HCPRs) System, International Journal of Intelligent Systems, John Wiley & Sons, Inc., vol. 13, pp. 319-344, 1997.
[5] Quinlan, J.R.: Induction of Decision Trees, Machine Learning, 1(1), 81-106, 1986.
[6] Adriaans, P., Zantinge, D.: Data Mining, Addison Wesley, 1999.
[7] Michalski, R.S., Winston, P.H.: Variable Precision Logic, Artificial Intelligence, 29, 121-146, 1986.
[8] Jain, N.K., Bharadwaj, K.K., Marranghello, N.: Extended Hierarchical Censored Production Rules System, Journal of Intelligent Systems, UK, vol. 9, no. 3-4, 1999.
[9] Ananthanarayana, V.S., Murty, M.N., Subramanian, D.K.: Dynamic Data Mining, Proceedings of the International Conference, KBCS-2002.
[10] Thrun, S., Faloutsos, C., Mitchell, T., Wasserman, L.: Automated Learning and Discovery: State-of-the-Art and Research Topics in a Rapidly Growing Field, CMU-CALD-98-100, September 1998.
[11] Michalski, R.S., Brazdil, P.: Introduction, Special Issue on Multistrategy Learning, Machine Learning, vol. 50, pp. 219-222, 2003.
[12] Liu, B., Hu, M., Hsu, W.: Intuitive Representation of Decision Trees Using General Rules and Exceptions, American Association for Artificial Intelligence, 2000.
[13] Kasabov, N.K.: Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering, The MIT Press, 2001.
[14] Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and Issues in Data Stream Systems, Proceedings of the 21st ACM Symposium on Principles of Database Systems (PODS 2002).
[15] Dong, G., Han, J., Lakshmanan, L.V.S., Pei, J., Wang, H., Yu, P.S.: Online Mining of Changes from Data Streams: Research Problems and Preliminary Results, in Proceedings of the 2003 ACM SIGMOD Workshop on Management and Processing of Data Streams.