Protein-Protein Interaction Detection Based on Substring Sensitivity Measure

Nazar Zaki; Safaai Deris; Hany Alashwal

Abstract:

Detecting protein-protein interactions is a central problem in computational biology, and aberrant interactions of this kind have been implicated in a number of neurological disorders. As a result, the prediction of protein-protein interactions has recently received considerable attention from biologists around the globe, and computational tools capable of identifying such interactions effectively are much needed. In this paper, we propose a method to detect protein-protein interactions based on a substring similarity measure. Two protein sequences may interact by means of the similarities of the substrings they contain. When applied to the currently available protein-protein interaction data for the yeast Saccharomyces cerevisiae, the proposed method delivered a reasonable improvement over existing methods.

Keywords: Protein-Protein Interaction, support vector machine, feature extraction, pairwise alignment, Smith-Waterman score.
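
The abstract and keywords name substring similarity and the Smith-Waterman score but do not spell out how the measure is computed. As a rough illustration only, the sketch below scores the best local alignment between two sequences and then takes the highest score over all pairs of fixed-length substrings. The scoring values (match, mismatch, gap), the window length k, and the function names are assumptions made for this example; they are not taken from the paper, which may use a substitution matrix and a different way of combining substring scores.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b with a linear gap penalty.

    The match/mismatch/gap values are illustrative assumptions, not the
    paper's scoring scheme.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def substrings(seq, k):
    """All contiguous substrings of length k (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def max_substring_similarity(p, q, k=5):
    """Highest local alignment score over all pairs of length-k substrings.

    Hypothetical pair-level feature: one plausible reading of "substring
    similarity", not necessarily the measure used by the authors.
    """
    return max(
        (smith_waterman_score(s, t) for s in substrings(p, k) for t in substrings(q, k)),
        default=0,
    )


if __name__ == "__main__":
    # Toy sequences, not real yeast proteins.
    print(max_substring_similarity("MKTAYIAK", "GGTAYIGG", k=4))
```

A score of this kind could serve as one input feature for a classifier such as the support vector machine listed in the keywords; how the paper builds its feature vectors is not stated in the abstract.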

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1071210
