Automata Theory Approach for Solving Frequent Pattern Discovery Problems

Renáta Iváncsy; István Vajk

Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 33132

Automata Theory Approach for Solving Frequent Pattern Discovery Problems

Authors: Renáta Iváncsy, István Vajk

Abstract:

The various types of frequent pattern discovery problem, namely, the frequent itemset, sequence and graph mining problems are solved in different ways which are, however, in certain aspects similar. The main approach of discovering such patterns can be classified into two main classes, namely, in the class of the levelwise methods and in that of the database projection-based methods. The level-wise algorithms use in general clever indexing structures for discovering the patterns. In this paper a new approach is proposed for discovering frequent sequences and tree-like patterns efficiently that is based on the level-wise issue. Because the level-wise algorithms spend a lot of time for the subpattern testing problem, the new approach introduces the idea of using automaton theory to solve this problem.

Keywords: Frequent pattern discovery, graph mining, pushdownautomaton, sequence mining, state machine, tree mining.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1084942

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1631

References:

[1] R. Agrawal and R. Srikant, "Mining sequential patterns," in ICDE -95: Proceedings of the Eleventh International Conference on Data Engineering. Washington, DC, USA: IEEE Computer Society, 1995, pp. 3-14.
[2] R. Srikant and R. Agrawal, "Mining sequential patterns: Generalizations and performance improvements," in Proc. of the 5th International Conference on Extending Database Technology, 1996.
[3] M. N. Garofalakis, R. Rastogi, and K. Shim, "Spirit: Sequential pattern mining with regular expression constraints." in VLDB, M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, Eds. Morgan Kaufmann, 1999, pp. 223-234.
[4] M. J. Zaki, "Spade: An efficient algorithm for mining frequent sequences," Machine Learning, vol. 42, no. 1-2, pp. 31-60, 2001.
[5] J. Han, J. Pei, and B. M.-A. et al., "Freespan: frequent pattern-projected sequential pattern mining," in KDD -00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM Press, 2000, pp. 355-359.
[6] J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, "PrefixSpan mining sequential patterns efficiently by prefix projected pattern growth," in In Proc. of Int. Conf. on Data Engineering, 2001, pp. 215-226.
[7] J. Ayres, J. Flannick, J. Gehrke, and T. Yiu, "Sequential pattern mining using a bitmap representation," in In Proc. of the 8th ACM SIGKDD Int.Conf. on Knowledge Discovery and Data Mining, 2002, pp. 429- 435.
[8] M. Zaki, "Efficiently mining frequent trees in a forest," in Proc. of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002, pp. 71-80.
[9] T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Satamoto, and S. Arikawa, "Efficient substructure discovery from large semi-structured data." In SDM, R. L. Grossman, J. Han, V. Kumar, H. Mannila, and R. Motwani, Eds. SIAM, 2002.
[10] U. Rckert and S. Kramer, "Frequent free tree discovery in graph data," in SAC -04: Proceedings of the 2004 ACM symposium on Applied computing. New York, NY, USA: ACM Press, 2004, pp. 564-570.