Multiple Sequence Alignment Using Optimization Algorithms
Authors: M. F. Omar, R. A. Salam, R. Abdullah, N. A. Rashid
Abstract:
Proteins or genes that have similar sequences are likely to perform the same function. One of the most widely used techniques for sequence comparison is sequence alignment. Sequence alignment allows mismatches and insertion/deletion, which represents biological mutations. Sequence alignment is usually performed only on two sequences. Multiple sequence alignment, is a natural extension of two-sequence alignment. In multiple sequence alignment, the emphasis is to find optimal alignment for a group of sequences. Several applicable techniques were observed in this research, from traditional method such as dynamic programming to the extend of widely used stochastic optimization method such as Genetic Algorithms (GAs) and Simulated Annealing. A framework with combination of Genetic Algorithm and Simulated Annealing is presented to solve Multiple Sequence Alignment problem. The Genetic Algorithm phase will try to find new region of solution while Simulated Annealing can be considered as an alignment improver for any near optimal solution produced by GAs.
Keywords: Simulated annealing, genetic algorithm, sequence alignment, multiple sequence alignment.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1082227
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2407References:
[1] N. M. Luscombe, D. Greebaum, M. Gerstein, (2001). What is bioinformatics? A proposed definition and overview of the field. Department of Molecular Biophysics and Biochemistry, Yale University, USA.
[2] R. Musick, T. Slezak, T. Crithlow, (2001). An Overview of Bioinformatics Research at Lawrence Livermore National Laboratory. Joint Genome Institute / Computing Application Organization .Lawrence Livermore National Laboratory.
[3] Online Lectures on Bioinformatics (September 2003) (Online), Available: http://lectures.molgen.mpg.de/Biol/index.html.
[4] P. C. Turner, A. G. McLennan, A. D Bates, M. R. H. White, (1997). Nucleic Acid Structure. Instant Notes in Molecular Biology C1:31-35. Bios Scientific Publishers Limited, Liverpool.
[5] M. Tompa .(2000). Lecture Notes on Biological Sequence Analysis. Technical Report #2000-06-01.Department of Computer Science and Engineering, University of Washington.
[6] W. Zhong(2003). Using Travel Salesman Problem Algorithms To determine Multiple Sequence Alignment Orders. Master Thesis, University of Georgia: Athens, Georgia.
[7] C. Karostensky, G. Gonnet. Near Optimal Multiple Sequence Alignment using a Traveling Salesman Problem approach. Swiss Federal Institute of Technology, Institute of Scientific Computing. ETH Zurich, Switzerland (2000).
[8] R. Choudry, (1999). Application of Evolutionary Algorithms for Multiple Sequence Alignment. Stanford University.
[9] D. J. Lipman, S. F. Altschul, J. D. Kececioglun, (1989). A tool for Multiple Sequence Alignment. Vol. 86.pp 4412-4415, Biochemistry. Proc. Natl. Acad. Sci. USA.
[10] M. Gribskov, J. Devereux, (1991). Similarity and Homology. Sequence Analysis Primer.3: 89-157. Stockton Press. NY.
[11] C. Notredame, (2002). Recent Progress in Multiple Sequence Alignment: A Survey., Pharmacogenomics, 3(1):131-144.
[12] D. Higgins, (1999). Multiple Sequence Alignment. Genetics Databases.9:165-183 Academic Press, London, UK .
[13] J. D. Thompson, F. Plewniak, and O. Poch. (1999). A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research, 27(13):2682-2690.
[14] Introduction to hidden markov models (June 2003) (Online), Available: www.scs.leeds.ac.uk/scsonly/teachingmaterials/HiddenMarko vModels/html dev/main.html.
[15] Introduction to HMM (April 2003) (Online) Available: http://www.cs.brown.edu/research/ai/dynamics/tutorial/Documents/Hidd enMarkovModels.html.
[16] S. R. Eddy, (1998). Profile Hidden Markov Models. Bioinformatics Review. Vol. 14. 9: 755-763. Oxford University Press.
[17] P. Baldi, S. Brunak, Y. Chauven, A. Krogh (1997). Hidden Markov Model For Human Genes: Periodic Pattern in Exon Sequence. Theoretical and Computational Methods in Genome Research, 2: 15-32. Plenum Press, New York.
[18] S. R. Eddy, (1996). Multiple Alignment Using Hidden Markov Models. Dept. of Genetics, Washington University School of Medicine St. Louis, US.
[19] M. Gen, R. Cheng. (2000). Genetic Algorithms and Engineering Optimization. John Wiley & Sons: Canada.
[20] D. Beasley, D. R. Bull, R. R. Martin, (1993). An Overview of Genetic Algorithms: Part 1, Fundamentals. Inter-University Committee on Computing. 15(2) 58-69.
[21] CCS 501 Neural Network and Genetic Algorithm http:// office1.cs.usm.my.
[21] C. Notredame, and D. G. Higgins, (1996). SAGA: sequence alignment by genetic algorithm, Nuc. Acids Res., 24(8), 1515-1524.
[22] M. Isokawa, M. Wayama, T. Shimizu, (1996). Multiple Sequence Alignment using Genetic Algorithm. Department of Information Science, Faculty of Science, Hirosaki University, Japan.
[23] J. T. Horng, C. N. Lin, B. H. Yang, and C. Y. Kao, (2001). A genetic algorithm for multiple sequence alignment. In Poster in German Conference on Bioinformatics.
[24] S. Kirkpatrick, C. D. J. Gellat, and M. P. Vecchi, (1983). Optimization by Simulated Annealing, Science, 220 pp. 671-680.