A Comparison of Fuzzy Clustering Algorithms to Cluster Web Messages
Authors: Sara El Manar El Bouanani, Ismail Kassou
Abstract:
This paper proposes an approach for clustering web messages. Texts written by the same web user are assigned to the same cluster, with a certain probability, on the basis of stylometric features and fuzzy clustering algorithms. The present work focuses on comparing the most popular algorithms in fuzzy clustering theory, namely Fuzzy C-Means, Possibilistic C-Means, and Fuzzy Possibilistic C-Means.
Keywords: Authorship detection, fuzzy clustering, profiling, stylometric features.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1087482
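
The abstract names Fuzzy C-Means as one of the compared algorithms. As a rough illustration of how soft cluster assignment works, the following is a minimal sketch of the standard Fuzzy C-Means update rules in plain Python. It is not the authors' implementation: the function name `fuzzy_c_means`, its parameters, and the two toy features (average word length and punctuation rate) are assumptions chosen only for this example.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative Fuzzy C-Means on a list of numeric feature vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers: means of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Memberships: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - c[d]) ** 2 for d in range(dim)) ** 0.5,
                    1e-12)  # guard against division by zero
                for c in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once memberships have stabilised.
        if shift < tol:
            break

    return centers, u


# Hypothetical stylometric vectors: [average word length, punctuation rate].
messages = [
    [4.1, 0.10], [4.3, 0.12], [4.0, 0.11],   # assumed to come from one writer
    [6.2, 0.30], [6.0, 0.28], [6.4, 0.31],   # assumed to come from another
]
centers, memberships = fuzzy_c_means(messages, n_clusters=2)
for vector, row in zip(messages, memberships):
    print(vector, [round(v, 3) for v in row])
```

Each message receives a membership degree for every cluster rather than a single hard label, which is the property that distinguishes fuzzy clustering from hard clustering. The possibilistic variants mentioned in the abstract relax the constraint that each row of memberships sums to 1; they are not shown here.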