Eukaryotic Gene Prediction by an Investigation of Nonlinear Dynamical Modeling Techniques on EIIP Coded Sequences
Authors: Mai S. Mabrouk, Nahed H. Solouma, Abou-Bakr M. Youssef, Yasser M. Kadah
Abstract:
Many digital signal processing techniques have been used to automatically distinguish protein-coding regions (exons) from non-coding regions (introns) in DNA sequences. In this work, we characterize these sequences by their nonlinear dynamical features, such as moment invariants, correlation dimension, and largest Lyapunov exponent estimates. We applied our model to a number of real sequences encoded as time series using EIIP sequence indicators. To discriminate between coding and non-coding DNA regions, the phase space trajectory was first reconstructed for each type of region. Nonlinear dynamical features were then extracted from those regions and used to investigate the differences between them. Our results indicate that the nonlinear dynamical characteristics differ significantly between coding regions (CR) and non-coding regions (NCR) in DNA sequences. Finally, the classifier is tested on real genes whose coding and non-coding regions are well known.
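As a rough illustration of the first two steps named in the abstract (EIIP encoding and phase space reconstruction), the sketch below maps a DNA string to its EIIP values and builds a time-delay embedding. The abstract does not state the reconstruction method or its parameters, so the use of delay embedding, the function names `eiip_encode` and `delay_embed`, and the values `dim=3` and `tau=1` are assumptions chosen only for the example. The EIIP values are the commonly tabulated ones for the four nucleotides.

```python
# Commonly tabulated EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(seq):
    """Map a DNA string to a numeric series; unknown symbols raise KeyError."""
    return [EIIP[base] for base in seq.upper()]


def delay_embed(series, dim, tau):
    """Time-delay embedding (assumed method, not confirmed by the abstract).

    Point i of the trajectory is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    """
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this dim and tau")
    return [
        tuple(series[i + k * tau] for k in range(dim))
        for i in range(n_points)
    ]


# Illustrative values only: a toy sequence, dim=3, tau=1.
signal = eiip_encode("ATGCGTAC")
trajectory = delay_embed(signal, dim=3, tau=1)
```

Features such as the correlation dimension or the largest Lyapunov exponent would then be estimated from `trajectory`, separately for coding and non-coding segments; those estimators are not shown here.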
Keywords: Gene prediction, nonlinear dynamics, correlation dimension, Lyapunov exponent.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1072808