Mining Sequential Patterns Using I-PrefixSpan
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 33090
Mining Sequential Patterns Using I-PrefixSpan

Authors: Dhany Saputra, Dayang R. A. Rambli, Oi Mean Foong

Abstract:

In this paper, we propose an improvement of pattern growth-based PrefixSpan algorithm, called I-PrefixSpan. The general idea of I-PrefixSpan is to use sufficient data structure for Seq-Tree framework and separator database to reduce the execution time and memory usage. Thus, with I-PrefixSpan there is no in-memory database stored after index set is constructed. The experimental result shows that using Java 2, this method improves the speed of PrefixSpan up to almost two orders of magnitude as well as the memory usage to more than one order of magnitude.

Keywords: ArrayList, ArrayIntList, minimum support, sequence database, sequential patterns.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1083789

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1563

References:


[1] R. Agrawal and R. Srikant, "Mining Sequential Patterns," Journal Intelligent Systems, vol. 9, No.1, 1997, pp. 33 - 56.
[2] R. Agrawal and R. Srikant, "Mining Sequential Patterns: Generalization and Performance Improvements," Research Report RJ 9994, IBM Almaden Research Center, San Jose, California, December 1995.
[3] J. Han, and M. Kamber, Data Mining: Concepts and Techniques. CA: Prentice Hall, 2002, ch. 5.
[4] J. Pei, et al, "Mining Sequential Patterns by Pattern Growth: The PrefixSpan Approach," IEEE Transaction on Knowledge and Data Engineering, vol. 16, no. 11, pp.14240-1440, Nov. 2004.
[5] J. Han and J. Pei, "Mining Frequent Patterns by Pattern-Growth: Methodology and Implications", ACM SIGKDD Explorations (Special Issue on Scalable Data Mining Algorithms), vol. 2, no. 2, pp.14-20, December 2000.
[6] M. Zaki, "SPADE: An Efficient Algorithm for Mining Frequent Sequences," Journal Machine Learning, vol. 42, nos. 1-2, 2001.
[7] L.M. Yen and L. S.Y. Lee, "Fast discovery of sequential patterns through memory indexing and database partitioning," Journal Information Science and Engineering, vol. 21, pp. 109-128, 2005.
[8] J. Shirazi, Java™ Performance Tuning. CA:O-Reilly, 2003.