Search results for: Tanimoto
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2

Search results for: Tanimoto

2 2D Fingerprint Performance for PubChem Chemical Database

Authors: Fatimah Zawani Abdullah, Shereena Mohd Arif, Nurul Malim

Abstract:

The study of molecular similarity search in chemical database is increasingly widespread, especially in the area of drug discovery. Similarity search is an application in the field of Chemoinformatics to measure the similarity between the molecular structure which is known as the query and the structure of chemical compounds in the database. Similarity search is also one of the approaches in virtual screening which involves computational techniques and scoring the probabilities of activity. The main objective of this work is to determine the best fingerprint when compared to the other five fingerprints selected in this study using PubChem chemical dataset. This paper will discuss the similarity searching process conducted using 6 types of descriptors, which are ECFP4, ECFC4, FCFP4, FCFC4, SRECFC4 and SRFCFC4 on 15 activity classes of PubChem dataset using Tanimoto coefficient to calculate the similarity between the query structures and each of the database structure. The results suggest that ECFP4 performs the best to be used with Tanimoto coefficient in the PubChem dataset.

Keywords: 2D fingerprints, Tanimoto, PubChem, similarity searching, chemoinformatics

Procedia PDF Downloads 260
1 The Intersection/Union Region Computation for Drosophila Brain Images Using Encoding Schemes Based on Multi-Core CPUs

Authors: Ming-Yang Guo, Cheng-Xian Wu, Wei-Xiang Chen, Chun-Yuan Lin, Yen-Jen Lin, Ann-Shyn Chiang

Abstract:

With more and more Drosophila Driver and Neuron images, it is an important work to find the similarity relationships among them as the functional inference. There is a general problem that how to find a Drosophila Driver image, which can cover a set of Drosophila Driver/Neuron images. In order to solve this problem, the intersection/union region for a set of images should be computed at first, then a comparison work is used to calculate the similarities between the region and other images. In this paper, three encoding schemes, namely Integer, Boolean, Decimal, are proposed to encode each image as a one-dimensional structure. Then, the intersection/union region from these images can be computed by using the compare operations, Boolean operators and lookup table method. Finally, the comparison work is done as the union region computation, and the similarity score can be calculated by the definition of Tanimoto coefficient. The above methods for the region computation are also implemented in the multi-core CPUs environment with the OpenMP. From the experimental results, in the encoding phase, the performance by the Boolean scheme is the best than that by others; in the region computation phase, the performance by Decimal is the best when the number of images is large. The speedup ratio can achieve 12 based on 16 CPUs. This work was supported by the Ministry of Science and Technology under the grant MOST 106-2221-E-182-070.

Keywords: Drosophila driver image, Drosophila neuron images, intersection/union computation, parallel processing, OpenMP

Procedia PDF Downloads 200