Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution
Melters DP, Bradnam KR, Young HA, Telis N, May MR, Ruby JG, Sebra R, Peluso P, Eid J, Rank D, Garcia JF, Derisi JL, Smith T, Tobias C, Ross-Ibarra J, Korf I, Chan SW
Genome Biol., 2013
Abstract: BACKGROUND: Centromeres are essential for chromosome segregation, yet their DNA
sequences evolve rapidly. In most animals and plants that have been studied,
centromeres contain megabase-scale arrays of tandem repeats. Despite their
importance, very little is known about the degree to which centromere tandem
repeats share common properties between different species across different phyla.
We used bioinformatic methods to identify high-copy tandem repeats from 282
species using publicly available genomic sequence and our own data. RESULTS: Our
methods are compatible with all current sequencing technologies. Long Pacific
Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419
bp. We assumed that the most abundant tandem repeat is the centromere DNA, which
was true for most species whose centromeres have been previously characterized,
suggesting this is a general property of genomes. High-copy centromere tandem
repeats were found in almost all animal and plant genomes, but repeat monomers
were highly variable in sequence composition and length. Furthermore,
phylogenetic analysis of sequence homology showed little evidence of sequence
conservation beyond approximately 50 million years of divergence. We found that,
despite an overall lack of sequence conservation, centromere tandem repeats from
diverse species showed similar modes of evolution. CONCLUSIONS: While centromere
position in most eukaryotes is epigenetically determined, our results indicate
that tandem repeats are highly prevalent at centromeres of both animal and plant
genomes. This suggests a functional role for such repeats, perhaps in promoting
concerted evolution of centromere DNA across chromosomes.
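
The abstract's working assumption, that the most abundant tandem repeat in a set of reads is the candidate centromere repeat, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (smallest_period, canonical_rotation, most_abundant_monomer), the 90% self-identity threshold, and the three-copy minimum are all hypothetical choices made for illustration. The sketch uses a naive shifted self-comparison in place of a real tandem repeat finder, and it ignores reverse-complement strands and sequencing errors beyond simple mismatches.

```python
from collections import Counter


def smallest_period(seq, min_period=2, min_copies=3, min_identity=0.9):
    """Return the smallest shift p at which seq matches itself at
    >= min_identity of compared positions, or None if there is none.
    Only periods allowing at least min_copies full copies are tried."""
    n = len(seq)
    max_period = n // min_copies
    for p in range(min_period, max_period + 1):
        matches = sum(1 for i in range(n - p) if seq[i] == seq[i + p])
        if matches / (n - p) >= min_identity:
            return p
    return None


def canonical_rotation(monomer):
    """Lexicographically smallest rotation, so that the same repeat
    sampled at different phases is tallied under a single key."""
    return min(monomer[i:] + monomer[:i] for i in range(len(monomer)))


def most_abundant_monomer(reads):
    """Tally approximate monomer copy counts across reads and return
    (monomer, copies) for the most abundant one, or None."""
    counts = Counter()
    for read in reads:
        p = smallest_period(read)
        if p is None:
            continue  # read shows no tandem structure under these settings
        counts[canonical_rotation(read[:p])] += len(read) // p
    return counts.most_common(1)[0] if counts else None


# Toy reads (made up for illustration): two phases of one 5 bp repeat,
# a short dinucleotide repeat, and a non-repetitive read.
reads = ["ACGTT" * 4, "CGTTA" * 5, "AG" * 4, "GATTACAGGCTTAACGTCCA"]
print(most_abundant_monomer(reads))
```

In practice, the candidate chosen this way would still need to be checked against species whose centromeres have been characterized, as the abstract describes.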