Finding Approximate Tandem Repeats with the Burrows-Wheeler Transform

Authors: Agnieszka Danek, Rafał Pokrzywa

Abstract:

Approximate tandem repeats in a genomic sequence are two or more contiguous, similar copies of a pattern of nucleotides. They are used in DNA mapping, the study of molecular evolution mechanisms, forensic analysis, and research on the diagnosis of inherited diseases. Their functions are still under investigation and not well defined, but growing biological databases, together with tools for identifying these repeats, may lead to the discovery of their specific roles or their correlation with particular features. This paper presents a new approach for finding approximate tandem repeats in a given sequence, where the similarity between consecutive repeats is measured using the Hamming distance. It is an enhancement of a method for finding exact tandem repeats in DNA sequences based on the Burrows-Wheeler transform.
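
To make the definition above concrete, the following is a minimal brute-force sketch of what "consecutive copies within a given Hamming distance" means. It is not the Burrows-Wheeler-based method of the paper; the function names, the fixed `period`, and the `max_mismatches` threshold are assumptions introduced only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def approx_tandem_repeats(seq: str, period: int, max_mismatches: int,
                          min_copies: int = 2):
    """Naive scan (hypothetical helper, not the paper's algorithm).

    For every start position, extend a run of adjacent blocks of length
    `period` as long as each block differs from the next one in at most
    `max_mismatches` positions. Returns a list of (start, copies) pairs,
    one for every start position whose run has at least `min_copies` blocks.
    """
    results = []
    n = len(seq)
    for start in range(n - 2 * period + 1):
        copies = 1
        pos = start
        # Compare each block with its immediate right-hand neighbour.
        while (pos + 2 * period <= n and
               hamming(seq[pos:pos + period],
                       seq[pos + period:pos + 2 * period]) <= max_mismatches):
            copies += 1
            pos += period
        if copies >= min_copies:
            results.append((start, copies))
    return results


if __name__ == "__main__":
    seq = "ACGACGACTTT"
    # Exact repeats only: ACG ACG starting at position 0.
    print(approx_tandem_repeats(seq, period=3, max_mismatches=0))
    # One mismatch allowed: ACG ACG ACT now counts as three copies.
    print(approx_tandem_repeats(seq, period=3, max_mismatches=1))
```

The sketch compares each copy only with its immediate neighbour, matching the abstract's wording about consecutive repeats. Its running time grows with sequence length times period, which is the kind of cost that index-based approaches such as the Burrows-Wheeler transform are meant to avoid.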

Keywords: approximate tandem repeats, Burrows-Wheeler transform, Hamming distance, suffix array

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1055359

