Commenced in January 2007
Frequency: Monthly
Edition: International
Application of Exact String Matching Algorithms towards SMILES Representation of Chemical Structure

Authors: Ahmad Fadel Klaib, Zurinahni Zainol, Nurul Hashimah Ahamed, Rosma Ahmad, Wahidah Hussin


Bioinformatics and cheminformatics are disciplines that use computers to provide tools for the acquisition, storage, processing, analysis, and integration of biological and chemical data, and for the development of potential applications of such data. A chemical database is a database designed exclusively to store chemical information. NMRShiftDB is one of the main databases used to represent chemical structures in 2D or 3D. The SMILES format is one of many ways to write a chemical structure in linear form. In this study, we extracted antimicrobial structures in SMILES format from NMRShiftDB and stored them in our Local Data Warehouse together with their corresponding information. Additionally, we developed a search tool that responds to a user's query through the JME Editor, which allows the user to draw or edit molecules and converts the drawn structure into SMILES format. We applied the Quick Search algorithm to search for antimicrobial structures in our Local Data Warehouse.
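
To illustrate the kind of search described above, the following is a minimal Python sketch of the Quick Search exact string-matching algorithm applied to a SMILES string. It is not the authors' implementation: the function name `quick_search` and the aspirin SMILES used as sample text are assumptions chosen for illustration. Note that a match here is purely textual, so it only finds a query whose SMILES characters appear literally inside the stored string.

```python
def quick_search(pattern: str, text: str) -> list[int]:
    """Return every start index at which `pattern` occurs in `text`.

    Illustrative sketch of Quick Search; names are hypothetical.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Bad-character table: for each pattern character, the distance from
    # its last occurrence to one position past the end of the pattern.
    # Characters absent from the pattern get the maximum shift, m + 1.
    shift = {ch: m - i for i, ch in enumerate(pattern)}

    matches = []
    j = 0
    while j <= n - m:
        # Compare the current window with the pattern.
        if text[j:j + m] == pattern:
            matches.append(j)
        # Stop if there is no character just past the window.
        if j + m >= n:
            break
        # Shift according to the character immediately after the window.
        j += shift.get(text[j + m], m + 1)
    return matches


# Hypothetical example: search for a carboxyl-like fragment in aspirin.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"
print(quick_search("C(=O)O", aspirin))
```

The shift is decided by the single character just past the current window, which lets the search skip up to `m + 1` positions at a time when that character does not occur in the pattern.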

Keywords: Exact String-matching Algorithms, NMRShiftDB, SMILES Format, Antimicrobial Structures.



