Commenced in January 2007
Paper Count: 30184
Learning Classifier Systems Approach for Automated Discovery of Censored Production Rules
Abstract:In the recent past Learning Classifier Systems have been successfully used for data mining. Learning Classifier System (LCS) is basically a machine learning technique which combines evolutionary computing, reinforcement learning, supervised or unsupervised learning and heuristics to produce adaptive systems. A LCS learns by interacting with an environment from which it receives feedback in the form of numerical reward. Learning is achieved by trying to maximize the amount of reward received. All LCSs models more or less, comprise four main components; a finite population of condition–action rules, called classifiers; the performance component, which governs the interaction with the environment; the credit assignment component, which distributes the reward received from the environment to the classifiers accountable for the rewards obtained; the discovery component, which is responsible for discovering better rules and improving existing ones through a genetic algorithm. The concatenate of the production rules in the LCS form the genotype, and therefore the GA should operate on a population of classifier systems. This approach is known as the 'Pittsburgh' Classifier Systems. Other LCS that perform their GA at the rule level within a population are known as 'Mitchigan' Classifier Systems. The most predominant representation of the discovered knowledge is the standard production rules (PRs) in the form of IF P THEN D. The PRs, however, are unable to handle exceptions and do not exhibit variable precision. The Censored Production Rules (CPRs), an extension of PRs, were proposed by Michalski and Winston that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: IF P THEN D UNLESS C, where Censor C is an exception to the rule. Such rules are employed in situations, in which conditional statement IF P THEN D holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence are tight or there is simply no information available as to whether it holds or not. Thus, the IF P THEN D part of CPR expresses important information, while the UNLESS C part acts only as a switch and changes the polarity of D to ~D. In this paper Pittsburgh style LCSs approach is used for automated discovery of CPRs. An appropriate encoding scheme is suggested to represent a chromosome consisting of fixed size set of CPRs. Suitable genetic operators are designed for the set of CPRs and individual CPRs and also appropriate fitness function is proposed that incorporates basic constraints on CPR. Experimental results are presented to demonstrate the performance of the proposed learning classifier system.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1057555Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1152
 Ian H. Witten and Eibe Frank, ÔÇÿDATA MINING: Practical Machine Learning Tools and Techniques-, Second Edition, Morgan Kaufmann Publishers, An imprint of Elsevier.
 U.M. Fayyad, G. P. Shapiro and P. Smyth, ÔÇÿThe KDD process for extracting useful knowledge from volumes from data-, Communication of ACM, Nov. vol. 39(11), page 27 - 34, 1996.
 Pieter Adriaans and Dolf Zantinge, in the book on DATA MINING by Pearson Ed. Publication.
 Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press.
 Holland, J.H. (1976) Adaptation. In Rosen & Snell (eds) Progress in Theoretical Biology, 4.Plenum.
 Bernado, E., Llora, X., & Garrell, J. M. (2001). XCS and GALE: a comparative study of two learning classifier systems with six other learning algorithms on classification tasks. In Fourth International Workshop on Learning Classifier Systems - IWLCS-2001 pages 337- 341.
 Jaume Bacardit and Martin V. Butz, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, Passeig Bonanova 8, 08022, Barcelona, [email protected], ÔÇÿData Mining in Learning Classifer Systems:Comparing XCS with GAssist-
 Holland, J.H., ÔÇÿ Escaping Brittleness: The possibilities of General Purpose Learning Algorithms applied to Parallel rule - based systems-, ÔÇÿMachine Learning: An Artificial Intelligence approach-, Vol. II, pages 593 - 623, 1978.
 John H. Homes, Pier Luca lanzi, Wolfgang Stolzmann, Stewart W. Wilson, ÔÇÿLearning Classifier Systems: New Models, Successful Applications.
 Kenneth A. De Jong, ÔÇÿLearning with Genetic Algorithms: An Overview-, Machine Learning, 1988.
 Michalski, R.S. and Winston, P.H. 1986, ÔÇÿVariable precision logic-, Artificial Intelligence, Vol 29, pp 121-146.
 Winston, P.H., ÔÇÿ Learning by augmenting rules and accumulating censors-, MIT AI Laboratory, AIM - 678, Cambridge, MA, 1982; also in R.S. Michalski, j.G. Carbonell and T. M Mitchell (Eds.), ÔÇÿMachine learning: An Artificial Intelligence Approach 2 ( Morgan Kaufmann, Los Altos, CA, 1986).
 Alwyn Barry, John Holmes, and Xavier Llora, ÔÇÿData Mining using Learning Classifier Systems- , pages 21 - 23
 Pierre Bonelli, Alexandre Parodi, Sandip Sen, and Stewart W. Wilson. ÔÇÿNEWBOOLE: A Fast GBML System-, In International Conference on Machine Learning, pages 153-159, San Mateo, California, 1990. Morgan Kaufmann.
 John H. Holmes, ÔÇÿEvolution-Assisted Discovery of Sentinel Features in Epidemiologic Surveillance-. PhD thesis, Drexel University, 1996.
 John H. Holmes, ÔÇÿA genetics-based machine learning approach to knowledge discovery in clinical data-. Journal of the American Medical Informatics Association Supplement, 1996.
 Stewart W. Wilson,- Knowledge Growth in an Artificial Animal-. In Grefenstette
, pages 16-23. Also appeared in Proceedings of the 4th Yale.
 Stewart W. Wilson,- Classifier Systems and the Animat Problem-. Machine Learning, 2:199-228, 1987. Also Research Memo RIS-36r, the Rowland Institute for Science, Cambridge, MA, 1986.
 John H. Holmes, ÔÇÿDiscovering Risk of Disease with a Learning Classifier System-, In Thomas Back, editor, Proceedings of the 7th International Conference on Genetic Algorithms (ICGA97). Morgan Kaufmann, 1997.
 Website address of an online data repository; from where Mushroom training dataset and other training data sets are taken: http://www.sgi.com/tech/mlc/db/
 K.K. Bharadwaj and N.K. Jain, ÔÇÿHierarchical Censored production Rules System-, Data and Knowledge Engineering, North Holland, vol. 8, page 19 - 34, 1992.
 E. Suzuki, and J.M. Zytkow, ÔÇÿUnified algorithm for undirected discovery of exception rules-, International Journal of Intelligent Systems, vol. 20, page 673 - 691, 2005.
 N.J. Radcliffe and P.D. Surry, ÔÇÿCo - operation through hierarchical competition in genetic data mining-, EPCC - TR94 - 09, 1994.