Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation
Authors: S. Logeswari, K. Premalatha
Abstract:
Search is the most obvious application of information retrieval. The variety of widely available biomedical data is enormous and is expanding fast. As a result, existing techniques are no longer sufficient to extract the most interesting patterns from a collection according to user requirements. Recent research concentrates more on semantic search than on traditional term-based search. Algorithms for semantic search are built on the relations that exist between the words of the documents. Ontologies are used as domain knowledge, both to identify semantic relations and to structure the data for effective information retrieval. Annotating data with ontology concepts is a widely used practice for clustering documents. In this paper, concept-based indexing and annotation are proposed for clustering biomedical documents. The fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performance of the proposed methods is compared with traditional term-based clustering on PubMed articles from five different disease communities. The experimental results show that the proposed methods outperform term-based fuzzy clustering.
Keywords: MeSH ontology, concept indexing, annotation, semantic relations, fuzzy c-means.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1094045
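
The abstract names fuzzy c-means (FCM) as the clustering step but does not spell it out. The following is a minimal sketch of the standard FCM update rules, not the authors' implementation. The function name `fuzzy_c_means`, its parameters, and the four toy "documents" (hypothetical concept-weight vectors, not PubMed data) are assumptions made for illustration only.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Cluster equal-length float vectors into c fuzzy clusters.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append([
                sum(weights[i] * points[i][d] for i in range(n)) / wsum
                for d in range(dim)
            ])

        # Membership update from Euclidean distances to the new centers.
        new_u = []
        for i in range(n):
            dists = [math.dist(points[i], centers[j]) for j in range(c)]
            if any(d == 0.0 for d in dists):
                # Point sits exactly on a center: share membership among those.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                new_u.append([v / total for v in inv])

        # Stop once the largest membership change falls below tol.
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < tol:
            break

    return centers, u


# Hypothetical concept-weight vectors: two documents lean on the first two
# concepts, the other two lean on the last two.
docs = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.8],
    [0.1, 0.0, 0.9, 1.0],
]
centers, memberships = fuzzy_c_means(docs, c=2)
hard_labels = [row.index(max(row)) for row in memberships]
```

Unlike hard clustering, each document keeps a graded membership in every cluster; `hard_labels` simply picks the strongest one when a single label is needed. In a concept-indexed setting the vector components would presumably be weights of ontology concepts rather than raw terms, but how those weights are computed is not stated in the abstract.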