Introducing Sequence-Order Constraint into Prediction of Protein Binding Sites with Automatically Extracted Templates
Abstract: Searching for a tertiary substructure that geometrically matches the 3D pattern of the binding site of a well-studied protein is one way to predict protein function. In our previous work, we built a web server that predicts protein-ligand binding sites based on automatically extracted templates. A drawback of such templates, however, is that the web server was prone to producing many false positive matches. In this study, we present a sequence-order constraint that reduces the false positive matches produced when automatically extracted templates are used to predict protein-ligand binding sites. The binding site predictor comprises i) an automatically constructed template library and ii) a local structure alignment algorithm for querying the library. The sequence-order constraint is used to identify inconsistencies between the local regions of the query protein and those of the templates. Experimental results show that the sequence-order constraint substantially reduces false positive matches and is effective for template-based binding site prediction.
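The abstract does not spell out how the sequence-order constraint is evaluated, so the following is only a minimal sketch of one plausible reading: a template match is kept when its matched residues appear in the same relative order along the query chain as along the template chain. The function names, the `(template_position, query_position)` pair representation, and the `min_fraction` threshold are all illustrative assumptions, not the authors' method.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical representation: each match pairs a residue position in the
# template chain with a residue position in the query chain.
Match = Tuple[int, int]


def order_preserving_count(matches: List[Match]) -> int:
    """Size of the largest subset of matches whose template and query
    positions increase together (a longest increasing subsequence)."""
    # Sort by template position; ties are broken by descending query position
    # so two matches sharing a template position are never both counted.
    ordered = sorted(matches, key=lambda m: (m[0], -m[1]))
    tails: List[int] = []  # tails[k] = smallest tail of a run of length k + 1
    for _, query_pos in ordered:
        i = bisect_left(tails, query_pos)
        if i == len(tails):
            tails.append(query_pos)
        else:
            tails[i] = query_pos
    return len(tails)


def passes_sequence_order(matches: List[Match], min_fraction: float = 1.0) -> bool:
    """Assumed filter: accept a match set only if at least `min_fraction`
    of its residue pairs can be kept in a consistent sequence order."""
    if not matches:
        return False
    return order_preserving_count(matches) / len(matches) >= min_fraction


if __name__ == "__main__":
    consistent = [(10, 52), (14, 60), (37, 81)]
    reversed_order = [(10, 81), (14, 60), (37, 52)]
    print(passes_sequence_order(consistent))
    print(passes_sequence_order(reversed_order))
```

With the default `min_fraction` of 1.0 the check is strict: any out-of-order residue pair rejects the match. A lower threshold would tolerate a few order violations, at the cost of letting more geometrically plausible but sequence-inconsistent matches through.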
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1056789