{"title":"Proteins Length and their Phenotypic Potential","authors":"Tom Snir, Eitan Rubin","volume":30,"journal":"International Journal of Biomedical and Biological Engineering","pagesStart":320,"pagesEnd":325,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10682","abstract":"
Mendelian Disease Genes represent a collection of single points of failure for the various systems they constitute. Such genes have been shown, on average, to encode longer proteins than 'non-disease' genes. Existing models suggest that this results from the increased likelihood of longer genes undergoing mutations. Here, we show that in saturation mutagenesis experiments performed on model organisms, where the likelihood of each gene mutating is one, a similar relationship between length and the probability of a gene being lethal is observed. We thus suggest an extended model in which the likelihood that a mutated gene produces a severe phenotype is length-dependent. Using the occurrence of conserved domains, we provide evidence that this dependency results from a correlation between a protein's length and the number of functions it performs. We propose that protein length thus serves as a proxy for protein cardinality in the different networks required for the organism's survival and well-being. We use this example to argue that the collection of Mendelian Disease Genes can, and should, be used to study the rules governing systems vulnerability in living organisms.","references":"[1] Botstein, D. and Risch, N. (2003) Discovering genotypes underlying\r\nhuman phenotypes: past successes for mendelian disease, future\r\napproaches for complex disease, Nat Genet, 33 Suppl, pp 228-237.\r\n[2] Lopez-Bigas, N. and Ouzounis, C.A. (2004) Genome-wide identification\r\nof genes likely to be involved in human genetic disease, Nucleic Acids\r\nRes., 32, pp. 3108-3114.\r\n[3] Kondrashov, F.A., Ogurtsov, A.Y. and Kondrashov, A.S. (2004)\r\nBioinformatical assay of human gene morbidity, Nucleic Acids Res., 32,\r\npp. 1731-1737.\r\n[4] Adie, E.A., Adams, R.R., Evans, K.L., Porteous, D.J. and Pickard, B.S.\r\n(2005) Speeding disease gene discovery by sequence based candidate\r\nprioritization, BMC Bioinformatics, 6, pp. 55-88.\r\n[5] Jimenez-Sanchez, G., Childs, B. 
and Valle, D. (2001) Human disease\r\ngenes, Nature, 409, pp. 853-855.\r\n[6] Oti, M., Snel, B., Huynen, M.A. and Brunner, H.G. (2006) Predicting\r\ndisease genes using protein-protein interactions, J Med Genet. 43, pp.\r\n691-8.\r\n[7] Perez-Iratxeta, C., Bork, P. and Andrade, M.A. (2002) Association of\r\ngenes to genetically inherited diseases using data mining, Nat Genet, 31,\r\npp. 316-319.\r\n[8] Turner, F.S., Clutterbuck, D.R. and Semple, C.A. (2003) POCUS:\r\nmining genomic sequence annotation to predict disease genes, Genome\r\nBiol, 4, pp. R75.\r\n[9] Seringhaus, M., Paccanaro, A., Borneman, A., Snyder, M. and Gerstein,\r\nM. (2006) Predicting essential genes in fungal genomes. Genome Res.,\r\n16, pp 1126-1135\r\n[10] Lopez-Bigas, N., Audit, B., Ouzounis, C., Parra, G. and Guigo, R.\r\n(2005) Are splicing mutations the most frequent cause of hereditary\r\ndisease?, FEBS Lett, 579, pp. 1900-1903.\r\n[11] Hamosh, A., Scott, A.F., Amberger, J.S., Bocchini, C.A. and McKusick,\r\nV.A. (2005) Online Mendelian Inheritance in Man (OMIM), a\r\nknowledgebase of human genes and genetic disorders. Nucleic Acids\r\nRes., pp. D514-517.\r\n[12] Drysdale, R. and The FlyBase Consortium, (2008). FlyBase: a database\r\nfor the Drosophila research community. Methods Molec. Biol. 420, pp\r\n45-59\r\n[13] Chen, N. et al. (2005) WormBase: a comprehensive data resource for\r\nCaenorhabditis biology and genomics, Nucleic Acids Res, 33, pp. D383-\r\n389.\r\n[14] Cherry, J.M., Adler, C., Ball, C., Chervitz, S.A., Dwight, S.S., Hester,\r\nE.T., Jia, Y., Juvik, G., Roe, T., Schroeder, M., Weng, S. and Botstein,\r\nD. (2006) SGD: Saccharomyces Genome Database. Nucleic Acids Res.\r\n26, pp. 73-79.\r\n[15] Ihaka, R. and Gentleman, R. (1996) R: A language for data analysis and\r\ngraphics, Journal of Computational and Graphical Statistics 5, pp. 299-\r\n314.\r\n[16] Karlin, S., Chen, C., Gentles, A.J. and Cleary, M. 
(2002) Associations\r\nbetween human disease genes and overlapping gene groups and multiple\r\namino acid runs. Proc. Nat. Acad Sci 99, pp. 17008-17013\r\n[17] Jeong, H., Mason, S.P., Barab\u00e1si, A.L. and Oltvai, Z.N. (2001) Lethality\r\nand centrality in protein networks. Nature 411, pp. 41-42.\r\n[18] Batada, N.N., Hurst, L.D. and Tyers, M. (2006) Evolutionary and\r\nPhysiological Importance of Hub Proteins. PLoS Comput Biol 2, pp. e88.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 30, 2009"}