Proteins Length and their Phenotypic Potential
Mendelian Disease Genes represent a collection of single points of failure for the various systems they constitute. Such genes have been shown, on average, to encode longer proteins than 'non-disease' genes. Existing models suggest that this results from the increased likelihood of longer genes undergoing mutations. Here, we show that in saturated mutagenesis experiments performed on model organisms, where the likelihood of each gene mutating is one, a similar relationship holds between protein length and the probability of a gene being lethal. We thus suggest an extended model in which the likelihood that a mutated gene produces a severe phenotype is length-dependent. Using the occurrence of conserved domains, we present evidence that this dependency results from a correlation between a protein's length and the number of functions it performs. We propose that protein length thus serves as a proxy for protein cardinality in the different networks required for the organism's survival and well-being. We use this example to argue that the collection of Mendelian Disease Genes can, and should, be used to study the rules governing systems vulnerability in living organisms.
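The argument above can be sketched as a decomposition of probabilities. The notation is an assumption introduced here for illustration and does not appear in the source: $L$ is protein length, $M$ is the event that a gene is mutated, and $S$ is the event that the gene produces a severe (for example, lethal) phenotype.

```latex
% Joint probability that a gene of length L is mutated and yields a severe phenotype
P(S, M \mid L) = P(M \mid L)\, P(S \mid M, L)

% Existing models: the length dependence enters only through P(M \mid L).
% Saturated mutagenesis: every gene is mutated, so P(M \mid L) = 1 and
P(S, M \mid L) = P(S \mid M, L)
```

Under this reading, any length dependence that persists in saturated mutagenesis screens cannot come from $P(M \mid L)$ and must reside in $P(S \mid M, L)$, which is what the extended model proposes.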
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1075705