Incorporating Semantic Similarity Measure in Genetic Algorithm: An Approach for Searching the Gene Ontology Terms

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Hany T. Alashwal, Rohayanti Hassan, Farhan Mohamed

Abstract:

The terms are the most important component of the Gene Ontology. These controlled vocabularies are defined to provide consistent descriptions of gene products that are shareable and computationally accessible by humans, software agents, or other machine-readable metadata. Each term is associated with information such as a definition, synonyms, database references, amino acid sequences, and relationships to other terms. This information has led to the Gene Ontology being broadly applied in microarray and proteomic analysis. However, searching the terms is still carried out with the traditional approach based on keyword matching. This approach has two weaknesses: it ignores semantic relationships between terms, and it depends heavily on a specialist to find similar terms. Therefore, this study combines a semantic similarity measure with a genetic algorithm to improve the retrieval of semantically similar terms. The semantic similarity measure computes the strength of similarity between two terms. The genetic algorithm is then employed to perform batch retrievals and to cope with the large search space of the Gene Ontology graph. Computational results are presented to show the effectiveness of the proposed algorithm.
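
As a rough illustration of how a semantic similarity measure and a genetic algorithm can be combined for term retrieval, the following minimal sketch may help. It is not the authors' implementation: the abstract does not specify the similarity measure, the chromosome encoding, or the genetic operators, so everything here is an assumption. The toy ontology and its term names are hypothetical rather than real Gene Ontology data, a Lin-style information-content measure stands in for the unspecified similarity measure, information content is derived from descendant counts rather than annotation statistics, and the genetic algorithm uses simple truncation selection, one-point crossover, and random mutation.

```python
import math
import random

# Hypothetical toy ontology (child -> parents); not real Gene Ontology data.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
    "protein_kinase": ["kinase"],
    "lipid_kinase": ["kinase"],
}
TERMS = sorted(PARENTS)

def ancestors(term):
    """Return the term itself plus every ancestor reachable via PARENTS."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen

# Assumption: information content comes from descendant counts, so terms
# with fewer descendants (more specific terms) carry more information.
DESC_COUNT = {t: sum(1 for u in TERMS if t in ancestors(u)) for t in TERMS}
IC = {t: math.log(len(TERMS) / DESC_COUNT[t]) for t in TERMS}

def lin_similarity(a, b):
    """Lin-style similarity: 2 * IC(best shared ancestor) / (IC(a) + IC(b))."""
    if a == b:
        return 1.0
    best = max(IC[t] for t in ancestors(a) & ancestors(b))
    denom = IC[a] + IC[b]
    return 2 * best / denom if denom > 0 else 0.0

def fitness(chromosome, query):
    """Sum of similarities to the query over the distinct terms in the chromosome."""
    return sum(lin_similarity(query, t) for t in set(chromosome))

def genetic_search(query, k=2, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Evolve k-term chromosomes toward terms semantically close to `query`."""
    rng = random.Random(seed)
    candidates = [t for t in TERMS if t != query]
    population = [[rng.choice(candidates) for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, query), reverse=True)
        next_gen = [population[0][:]]  # elitism: keep the best chromosome unchanged
        while len(next_gen) < pop_size:
            # Truncation selection: both parents come from the fitter half.
            mum, dad = rng.sample(population[: pop_size // 2], 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = mum[:cut] + dad[cut:]  # one-point crossover
            for i in range(k):
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(candidates)  # mutation: swap in a random term
            next_gen.append(child)
        population = next_gen
    return max(population, key=lambda c: fitness(c, query))

# Retrieve a batch of two terms for one query term.
print(genetic_search("protein_kinase", k=2))
```

Each chromosome here encodes a whole batch of candidate terms, which is one possible reading of "batch retrievals". On a graph this small an exhaustive scan would be simpler; a genetic search is only worthwhile when the term space is too large to enumerate.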

Keywords: Gene Ontology, Semantic similarity measure, Genetic algorithm, Ontology search

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1085137

