{"title":"Application of a Similarity Measure for Graphs to Web-based Document Structures","authors":"Matthias Dehmer, Frank Emmert-Streib, Alexander Mehler, J\u00fcrgen Kilian, Max M\u00fchlhauser","country":null,"institution":"","volume":8,"journal":"International Journal of Mathematical and Computational Sciences","pagesStart":361,"pagesEnd":366,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/15299","abstract":"Due to the tremendous amount of information provided by the World Wide Web (WWW), developing methods for mining the structure of web-based documents is of considerable interest. In this paper we present a similarity measure for graphs representing web-based hypertext structures. Our similarity measure is mainly based on a novel representation of a graph as linear integer strings whose components represent structural properties of the graph. The similarity of two graphs is then defined in terms of the optimal alignment of the underlying property strings. We thus apply the well-known technique of sequence alignment to a novel and challenging problem: measuring the structural similarity of generalized trees. In other words, we first transform our graphs, considered as high-dimensional objects, into linear structures. Then we derive similarity values from the alignments of the property strings in order to measure the structural similarity of generalized trees. Hence, we transform a graph similarity problem into a string similarity problem in order to develop an efficient graph similarity measure. We demonstrate that our similarity measure captures important structural information by applying it to two different test sets consisting of graphs representing web-based document structures.","references":null,"publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 8, 2007"}