DIFFER: A Propositionalization approach for Learning from Structured Data

Thashmee Karunaratne; Henrik Böstrom

Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 33156

DIFFER: A Propositionalization approach for Learning from Structured Data

Authors: Thashmee Karunaratne, Henrik Böstrom

Abstract:

Logic based methods for learning from structured data is limited w.r.t. handling large search spaces, preventing large-sized substructures from being considered by the resulting classifiers. A novel approach to learning from structured data is introduced that employs a structure transformation method, called finger printing, for addressing these limitations. The method, which generates features corresponding to arbitrarily complex substructures, is implemented in a system, called DIFFER. The method is demonstrated to perform comparably to an existing state-of-art method on some benchmark data sets without requiring restrictions on the search space. Furthermore, learning from the union of features generated by finger printing and the previous method outperforms learning from each individual set of features on all benchmark data sets, demonstrating the benefit of developing complementary, rather than competing, methods for structure classification.

Keywords: Machine learning, Structure classification, Propositionalization.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1076058

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1231

References:

[1] Page, D. and Srinivasan, A., (2003), ILP: A Short Look back and a Longer Look Forward, Journal of machine learning research, 4(Aug):415-430.
[2] Quinlan, J. R., Cameron-Jones, R. M., (1993), FOIL, Proceedings of the 6th ECML, Lecture Notes in AI, Vol. 667, 3-20. Springer-Verlag
[3] Muggleton S.H. and Feng C. (1990). Efficient induction of logic programs, Proceedings of the First Conference on Algorithmic Learning Theory, Tokyo.
[4] Srinivasan A,. King, R.D, and Muggleton S, (1999), The role of background knowledge: using a problem from chemistry to examine the performance of an ILP program, Technical Report PRG-TR-08-99, Oxford University.
[5] Krogel, M-A., Rawles, S., Železn├¢, F., Flach, P. A., Lavra─ì, N., and Wrobel, S., (2003), Comparative evaluation of approaches to propositionalization, Proc.of the 13th International Conference on ILP, Lecture Notes in CS, 197-214.
[6] Lavrac, N. and Flach P., (2000), "An extended transformation approach to Inductive Logic Programming", University publication, University of Bristol.
[7] Nattee, C., Sinthupinyo, S., Numao, M., Okada, T., (2005), Inductive Logic Programming for Structure-Activity Relationship Studies on Large Scale Data, SAINT Workshops 2005: 332-335.
[8] Inokuchi, A., Washio, T., and Motoda, H., (2003), Complete mining of frequent patterns from graphs, Mining graph data, Machine Learning, 50:321-354.
[9] Debnath, A.K. Lopez de Compadre, R.L., Debnath, G., Shusterman, A.J., and Hansch, C. (1991), Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds: Correlation with molecular orbital energies and hydrophobicity, Journal Med. Chem. 34:786-797.
[10] Michie,D., Muggleton,S., Page,D., and Srinivasan,A., (1994), "To the international computing community: A new East-West challenge" Oxford University Computing laboratory, Oxford, UK.
[11] US National Toxicology program, http://ntp.niehs.nih.gov/index.cfm?objectid=3 2BA9724-F1F6-975E- 7FCE50709CB4C932
[12] Pearce D.A. (1988). The induction of fault diagnosis systems from qualitative models, Proceedings 7th National Conference on AI, Saint Paul, Minnesota.
[13] Ian H. Witten and Eibe Frank (2005) "Data Mining: Practical machine learning tools and techniques", 2nd Edition, Morgan Kaufmann, San Francisco, 2005.
[14] Gonzalez, J., Holder, L. B. and Cook, D. J. (2001), "Application of Graph-Based Concept Learning to the Predictive Toxicology Domain", Proceedings of the Predictive Toxicology Challenge Workshop.
[15] Bringmann, B., and Zimmermann, A., (2005), "Tree - Decision Trees for Tree Structured Data", Proceedings of PKDD 2005, LNAI 3721, pp. 46- 58, Springer.