Improving Protein-Protein Interaction Prediction by Using Encoding Strategies and Random Indices

Authors: Essam Al-Daoud

Abstract:

New features are extracted and compared to improve the prediction of protein-protein interactions. The basic idea is to select and use the best set of features from the tensor matrices that are produced by the frequency vectors of the protein sequences. Three sets of features are compared: the first set is based on the indices that are most common in the interacting proteins, the second set is based on the indices that tend to be common in both the interacting and non-interacting proteins, and the third set is constructed by using random indices. Moreover, three encoding strategies are compared, based on the polarity, structure, and chemical properties of the amino acids. The experimental results indicate that the highest accuracy can be obtained by using random indices with the chemical properties encoding strategy and a support vector machine.

Keywords: protein-protein interactions, random indices, encoding strategies, support vector machine.
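
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the four-class grouping of amino acids (a common textbook split by side-chain chemistry), the k-mer length of 2, the number of selected indices, and all function names are assumptions made here for illustration only. The sketch encodes each sequence into classes, builds a normalized frequency vector, forms the outer-product (tensor) matrix for a protein pair, and reads off the entries at randomly chosen indices.

```python
import random
from itertools import product

# ASSUMPTION: an illustrative grouping by side-chain chemistry,
# not necessarily the encoding used in the paper.
GROUPS = {
    0: "AVLIPFMWG",  # nonpolar
    1: "STCYNQ",     # polar, uncharged
    2: "DE",         # acidic
    3: "KRH",        # basic
}
CLASS_OF = {aa: c for c, members in GROUPS.items() for aa in members}

K = 2  # ASSUMPTION: k-mer length over the class alphabet
KMERS = list(product(range(len(GROUPS)), repeat=K))
KMER_INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def frequency_vector(sequence):
    """Normalized k-mer frequencies of the class-encoded sequence."""
    classes = [CLASS_OF[aa] for aa in sequence.upper() if aa in CLASS_OF]
    counts = [0.0] * len(KMERS)
    total = len(classes) - K + 1
    if total <= 0:
        return counts
    for i in range(total):
        counts[KMER_INDEX[tuple(classes[i:i + K])]] += 1
    return [c / total for c in counts]


def outer_product(u, v):
    """Tensor (outer-product) matrix of two frequency vectors."""
    return [[a * b for b in v] for a in u]


def choose_random_indices(dim, n_features, seed=0):
    """Pick n_features distinct (row, column) positions, fixed by the seed."""
    rng = random.Random(seed)
    all_pairs = [(i, j) for i in range(dim) for j in range(dim)]
    return rng.sample(all_pairs, n_features)


def pair_features(seq_a, seq_b, indices):
    """Feature vector for a protein pair: matrix entries at the given indices."""
    matrix = outer_product(frequency_vector(seq_a), frequency_vector(seq_b))
    return [matrix[i][j] for i, j in indices]


# Hypothetical toy sequences; real inputs would be full protein sequences.
indices = choose_random_indices(len(KMERS), n_features=32, seed=42)
features = pair_features("MKTAYIAKQR", "GDEVLSTCHH", indices)
```

The same `indices` list would have to be reused for every protein pair so that feature positions stay comparable, and the resulting vectors could then be passed to a classifier such as a support vector machine.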

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1330861

