Concurrency in Web Access Patterns Mining

Authors: Jing Lu, Malcolm Keech, Weiru Chen

Abstract:

Web usage mining is an application of data mining that provides insight into customer behaviour on the Internet. An important technique for discovering user access and navigation trails is sequential patterns mining. One of the key challenges for web access patterns mining is the mining of richly structured patterns. This paper proposes a novel model, the Web Access Patterns Graph (WAP-Graph), to represent graphically all of the access patterns obtained from web mining. WAP-Graph also motivates the search for new structural relation patterns, namely Concurrent Access Patterns (CAP), to identify and predict more complex web page requests. Corresponding CAP mining and modelling methods are proposed and shown to be effective in finding and representing concurrency between access patterns on the web. Experiments on large-scale synthetic sequence data and on real web access data demonstrate that CAP mining provides a powerful method for structural knowledge discovery, the results of which can be visualised through the CAP-Graph model.
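
As a rough illustration of the sequential-pattern view of web access data, the sketch below counts how often an ordered page pattern occurs across user sessions. It is not the paper's algorithm: the session data, the page names, and the helper functions `is_subsequence`, `support` and `joint_support` are all hypothetical, and `joint_support` is only a simplified stand-in for the idea of access patterns occurring together in the same session, not the paper's definition of CAP.

```python
from typing import Sequence


def is_subsequence(pattern: Sequence[str], session: Sequence[str]) -> bool:
    """Return True if the pages in `pattern` occur in `session` in order (gaps allowed)."""
    remaining = iter(session)
    # `page in remaining` advances the iterator, so order is respected.
    return all(page in remaining for page in pattern)


def support(pattern: Sequence[str], sessions: Sequence[Sequence[str]]) -> float:
    """Fraction of sessions that contain `pattern` as a subsequence."""
    if not sessions:
        return 0.0
    hits = sum(is_subsequence(pattern, s) for s in sessions)
    return hits / len(sessions)


def joint_support(patterns: Sequence[Sequence[str]],
                  sessions: Sequence[Sequence[str]]) -> float:
    """Fraction of sessions that contain every pattern in `patterns`.

    Assumption: a simplified proxy for patterns occurring together,
    not the CAP definition used in the paper.
    """
    if not sessions:
        return 0.0
    hits = sum(all(is_subsequence(p, s) for p in patterns) for s in sessions)
    return hits / len(sessions)


# Hypothetical web sessions: each list is the ordered page requests of one visit.
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "reviews", "cart"],
    ["home", "search", "help"],
    ["search", "product", "reviews"],
]

if __name__ == "__main__":
    print(support(["home", "cart"], sessions))
    print(support(["product", "reviews"], sessions))
    print(joint_support([["home", "cart"], ["product", "reviews"]], sessions))
```

A pattern would be reported as frequent when its support reaches a user-chosen minimum threshold. Comparing the joint support of two patterns against a second threshold is one simple way to ask whether they tend to occur in the same sessions.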

Keywords: concurrent access patterns (CAP), CAP mining and modelling, CAP-Graph, web access patterns (WAP), WAP-Graph, Web usage mining.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1074755

