Characteristic enrichment of DNA repeats in different genomes


Using computer programs developed for this purpose, we searched for several classes of repeated sequences, including inverted, direct tandem, and homopurine-homopyrimidine mirror repeats, in various prokaryotes, eukaryotes, and an archaebacterium. Comparison of observed frequencies with expectations revealed that in bacterial genomes and organelles the frequency of different repeats is either random or enriched for inverted and/or direct tandem repeats. By contrast, in all eukaryotic genomes studied, we observed an overrepresentation of all repeats, especially homopurine-homopyrimidine mirror repeats. Analysis of the genomic distribution of all abundant repeats showed that they are virtually excluded from coding sequences. Unexpectedly, the frequencies of abundant repeats normalized for their expectations were almost perfect exponential functions of their size, and for a given repeat this function was indistinguishable between different genomes.
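The comparison of observed with expected repeat frequencies can be illustrated with a minimal Python sketch. It is not the authors' program, which the abstract does not describe. It assumes the simplest case: perfect inverted repeats with no spacer, an uppercase A/C/G/T sequence, and a null model in which bases are drawn independently at their observed frequencies. All function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_inverted_repeats(seq, stem):
    """Count windows of length 2*stem that equal their own reverse complement.

    Assumption: a perfect inverted repeat with no spacer between the arms.
    """
    width = 2 * stem
    return sum(
        1
        for i in range(len(seq) - width + 1)
        if seq[i:i + width] == reverse_complement(seq[i:i + width])
    )


def expected_inverted_repeats(seq, stem):
    """Expected count under an assumed independent-base null model."""
    n = len(seq)
    freq = {base: count / n for base, count in Counter(seq).items()}
    # Probability that two independently drawn bases are complementary.
    p_pair = sum(freq.get(b, 0.0) * freq.get(COMPLEMENT[b], 0.0) for b in "ACGT")
    windows = max(n - 2 * stem + 1, 0)
    # Each of the `stem` mirrored position pairs must be complementary.
    return windows * p_pair ** stem


def enrichment(seq, stem):
    """Observed count divided by expected count (nan if expected is zero)."""
    expected = expected_inverted_repeats(seq, stem)
    if expected == 0:
        return float("nan")
    return count_inverted_repeats(seq, stem) / expected
```

Under these assumptions, an enrichment near 1 corresponds to a random frequency and a value well above 1 to overrepresentation. Computing the ratio for a range of `stem` values gives the kind of size dependence the abstract refers to.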

Publication Title

Proceedings of the National Academy of Sciences of the United States of America