Measuring Text-Based Semantics Relatedness Using WordNet
Authors: Madiha Khan, Sidrah Ramzan, Seemab Khan, Shahzad Hassan, Kamran Saeed
Abstract:
Measuring the semantic similarity of two texts means computing how closely related their meanings are, which can be done with a variety of techniques. Our web application, Measuring Relatedness of Concepts (MRC), lets a user input two text corpora and obtain a semantic similarity percentage between them, computed using WordNet. The application computes semantic relatedness in five stages: Preprocessing (extracting keywords from the content), Feature Extraction (classifying words into parts of speech), Synonyms Extraction (retrieving synonyms for each keyword), Measuring Similarity (computing similarity from the keywords and their synonyms), and Visualization (a graphical representation of the similarity measure). Because words are classified by part of speech, the user can also measure similarity on the basis of these features. The end result is a percentage score together with the word(s) that form the basis of the similarity between the two texts, produced by combining different tools on a single platform. In future work, we plan a Web-as-a-live-corpus application that provides a simpler, more user-friendly tool for comparing documents and extracting useful information.
Keywords: GraphViz representation, semantic relatedness, similarity measurement, WordNet similarity.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.3298904
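
The staged computation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the percentage below (the share of keywords in both texts that have a direct or synonym match in the other text) is an assumption made for illustration. The synonym table `TOY_SYNONYMS`, the stopword list, and all function names are hypothetical stand-ins; a real system would query WordNet instead of the toy table. The Feature Extraction (part-of-speech) and Visualization stages are omitted to keep the example self-contained.

```python
import re

# Hypothetical stand-in for WordNet: a tiny hand-made synonym table.
# A real system would look synonyms up in WordNet instead.
TOY_SYNONYMS = {
    "car": {"automobile", "auto"},
    "automobile": {"car", "auto"},
    "quick": {"fast", "rapid"},
    "fast": {"quick", "rapid"},
}

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "in", "on", "and", "to"}


def preprocess(text):
    """Preprocessing: lowercase, tokenize, and drop stopwords to get keywords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def expand(keywords, synonyms=TOY_SYNONYMS):
    """Synonyms Extraction: map each keyword to itself plus its synonyms."""
    return {w: {w} | synonyms.get(w, set()) for w in keywords}


def similarity(text_a, text_b):
    """Measuring Similarity: return (percentage, matching keywords).

    Assumed formula: the share of keywords across both texts that have
    a direct or synonym match in the other text.
    """
    exp_a = expand(preprocess(text_a))
    exp_b = expand(preprocess(text_b))
    all_a = set().union(*exp_a.values())
    all_b = set().union(*exp_b.values())
    matched_a = {w for w, syns in exp_a.items() if syns & all_b}
    matched_b = {w for w, syns in exp_b.items() if syns & all_a}
    total = len(exp_a) + len(exp_b)
    if total == 0:
        return 0.0, set()
    score = 100.0 * (len(matched_a) + len(matched_b)) / total
    return score, matched_a | matched_b


if __name__ == "__main__":
    score, words = similarity("The car is red", "A quick automobile")
    print(f"{score:.1f}% similar, based on: {sorted(words)}")
```

With the toy table, "car" and "automobile" match through their synonyms while "red" and "quick" do not, so two of the four keywords are matched. Returning the matching words alongside the score mirrors the abstract's description of reporting both a percentage and the word(s) behind it.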