A Comparison of Fuzzy Clustering Algorithms to Cluster Web Messages

Authors: Sara El Manar El Bouanani, Ismail Kassou

Abstract:

This paper proposes an approach for clustering web messages. Texts written by the same web user are assigned, with a certain probability, to the same cluster on the basis of stylometric features, using fuzzy clustering algorithms. The work focuses on comparing the most popular algorithms in fuzzy clustering theory, namely Fuzzy C-Means, Possibilistic C-Means, and Fuzzy Possibilistic C-Means.
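To make the idea of soft assignment concrete, the following is a minimal sketch of the standard Fuzzy C-Means update rules in NumPy. It is not the authors' implementation: the abstract does not give their features, parameters, or data, so the two synthetic "stylometric" features (average word length and punctuation rate), the fuzzifier m = 2, and the function name `fuzzy_c_means` are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=200, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means (illustrative sketch).

    Returns (centers, U), where U[i, k] is the membership degree of
    sample i in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers: means of the samples weighted by U**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Synthetic data (assumption): two hypothetical authors, 20 messages each,
# described by [average word length, punctuation rate].
data_rng = np.random.default_rng(1)
author_a = data_rng.normal(loc=[4.0, 0.05], scale=[0.2, 0.01], size=(20, 2))
author_b = data_rng.normal(loc=[6.0, 0.20], scale=[0.2, 0.01], size=(20, 2))
X = np.vstack([author_a, author_b])

centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard labels, derived only for inspection
```

Each row of `U` gives the degrees to which one message belongs to each cluster, which is the soft assignment the abstract describes. Possibilistic C-Means and Fuzzy Possibilistic C-Means change the membership update (the former drops the constraint that each row sums to 1), so they are not covered by this sketch.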

Keywords: Authorship detection, fuzzy clustering, profiling, stylometric features.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1087482

