A Sequential Pattern Mining Method Based On Sequential Interestingness
Authors: Shigeaki Sakurai, Youichi Kitahara, Ryohei Orihara
Abstract:
Sequential mining methods efficiently discover all frequent sequential patterns contained in sequential data. These methods evaluate frequency with the support, the conventional criterion, which satisfies the Apriori property. However, the discovered patterns do not always match the interests of analysts: many of them are commonplace, so analysts gain no new knowledge from them. This paper proposes a new criterion, sequential interestingness, for discovering sequential patterns that are more attractive to analysts. It shows that the criterion satisfies the Apriori property and explains how the criterion relates to the support. It also proposes an efficient sequential mining method based on the criterion. Lastly, it demonstrates the effectiveness of the method by applying it to two kinds of sequential data.
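For readers unfamiliar with the terms, the following is a minimal sketch of the support and of the Apriori property it satisfies, assuming each sequence is a plain list of single items and a pattern matches when its items occur in order with gaps allowed. The function names and the toy database are hypothetical illustrations. The sketch does not implement the proposed sequential interestingness, whose definition is not given in this abstract.

```python
from typing import List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Return True if the items of `pattern` occur in `sequence` in order.

    Gaps between matched items are allowed (an assumption of this sketch).
    """
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> float:
    """Fraction of sequences in `database` that contain `pattern`."""
    if not database:
        return 0.0
    hits = sum(1 for seq in database if is_subsequence(pattern, seq))
    return hits / len(database)


# Hypothetical toy database of four sequences.
database = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["a", "b"],
]

s_a = support(["a"], database)              # 1.0
s_ac = support(["a", "c"], database)        # 0.75
s_abc = support(["a", "b", "c"], database)  # 0.25

# Apriori property: extending a pattern never increases its support,
# so an infrequent pattern cannot have a frequent super-pattern.
assert s_a >= s_ac >= s_abc
```

Because the support never increases when a pattern is extended, a miner can stop growing any pattern that falls below the minimum threshold. The abstract states that the proposed criterion satisfies the same property.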
Keywords: Sequential mining, Support, Confidence, Apriori property
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1074861