Comparative Study of Universities’ Web Structure Mining

Z. Abdullah; A. R. Hamdan

Abstract:

This paper analyzes the ranking of the website of the University of Malaysia Terengganu (UMT) on the World Wide Web. Few studies have compared the rankings of universities’ websites, so this research helps determine whether the existing UMT website serves its purpose, which is to introduce UMT to the world. The ranking is based on hub and authority values, which are derived from the link structure of the website. These values are computed using two web search algorithms, HITS and SALSA. The websites of three other universities, UM, Harvard and Stanford, are used as benchmarks. The results clearly show that more work is needed on the existing UMT website: pages that are important according to the benchmarks do not exist among UMT’s pages. The ranking of UMT’s website will serve as a guideline for web developers to develop a more efficient website.
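To make the hub and authority values concrete, the following is a minimal sketch of the standard HITS and SALSA iterations on a small directed link graph. It is not the authors' implementation. The page names, the edge list, and the function names `hits`, `salsa` and `_normalize` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def _normalize(scores):
    """Scale a score dict to unit Euclidean length (no-op for all-zero scores)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hits(edges, iterations=50):
    """HITS sketch: return (hub, authority) dicts for (source, target) link pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of pages linking to it.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub score of a page: sum of the authority scores of pages it links to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)
    return hub, auth


def salsa(edges, iterations=50):
    """SALSA sketch: like HITS, but each score is split evenly over a page's links."""
    nodes = sorted({n for edge in edges for n in edge})
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    # Start with authority mass spread uniformly over pages that have in-links.
    targets = [n for n in nodes if indeg[n] > 0]
    auth = {n: (1.0 / len(targets) if indeg[n] > 0 else 0.0) for n in nodes}
    hub = {n: 0.0 for n in nodes}
    for _ in range(iterations):
        # Each authority divides its score equally among the pages linking to it.
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst] / indeg[dst]
        # Each hub divides its score equally among the pages it links to.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src] / outdeg[src]
    return hub, auth


# Hypothetical site structure, for illustration only.
EDGES = [
    ("home", "admissions"),
    ("home", "research"),
    ("about", "admissions"),
    ("news", "admissions"),
]

if __name__ == "__main__":
    for name, algorithm in (("HITS", hits), ("SALSA", salsa)):
        hub, auth = algorithm(EDGES)
        print(name, "top hub:", max(hub, key=hub.get),
              "| top authority:", max(auth, key=auth.get))
```

In this toy graph both algorithms rank the same page as the top authority and the same page as the top hub. On a connected graph such as this one, the SALSA authority scores settle in proportion to in-degree, which is one reason the two algorithms can rank real sites differently.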

Keywords: Algorithm, ranking, website, web structure mining.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1110011

