Simple Sequences in Proteins and DNA
|
Simple sequences are regions of low complexity made up of short
sequence repeats (1-6 elements). The repeats may have a tandem
organization or form segments of imperfect repeats, also called
cryptic simple sequences. In DNA, as well as in proteins, regions of
low complexity are extremely abundant. For example, about 71% of
yeast proteins show significant overall simplicity as measured by the
SIMPLE algorithm. This algorithm has been implemented in the SIMPLE
v. 3.0 program, which analyses simplicity in any nucleic acid or
protein sequence and can be accessed online.
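
To make the notion of a tandem repeat of a short unit (1-6 elements) concrete, here is a minimal Python sketch that scans a sequence for perfect tandem repeats. It is not the SIMPLE algorithm, which also scores imperfect (cryptic) repeats; the function name `find_tandem_repeats` and the `min_copies` threshold are illustrative assumptions.

```python
import re

def find_tandem_repeats(seq, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat.

    Hypothetical helper, not part of SIMPLE: it reports only exact
    repeats of units 1..max_unit elements long.
    """
    # Lazy quantifier so the shortest repeating unit is preferred,
    # followed by at least (min_copies - 1) further copies of that unit.
    pattern = re.compile(r"(.{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# A toy protein with a glutamine run and an Ala-Pro dipeptide repeat.
example_hits = find_tandem_repeats("MSQQQQQQAPAPAPAK")
print(example_hits)
```

The same function works on nucleotide strings, although the reported unit then depends on where the scan first meets the repeat (for example `GCA` rather than `CAG`).
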
Many short repeats are believed to have originated by DNA slippage and
misalignment during replication, recombination or repair. We have
studied the codon composition in regions of genes that encode
homopeptides in order to determine whether amino acid repeats
correlate with trinucleotide repeats in the gene. A high correlation
would be consistent with slippage while a mixture of codons could be
indicative of selection acting on the homopeptide region. In mammals,
two populations of glutamine repeats can be clearly differentiated. The
first is encoded by pure trinucleotide tracts (CAG) and the second by
very mixed tracts (CAA/CAG). The latter type tends to be more conserved
between human and mouse than the pure tracts. The results suggest that,
while a subset of polyglutamine segments may have originated recently
by slippage, and may therefore be neutral, others appear to have been
preserved throughout evolution.
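
The comparison between pure and mixed codon tracts can be illustrated with a short Python sketch. It assumes the caller supplies an in-frame coding sequence together with its translation, and it summarises each homopeptide run by a simple "purity" value (the fraction of the run encoded by its most common codon). The function name, the `min_len` threshold and the purity measure are illustrative assumptions, not the method used in the papers listed below.

```python
import re
from collections import Counter

def homopeptide_codon_usage(cds, protein, residue="Q", min_len=5):
    """Summarise codon usage for each run of `residue` in `protein`.

    Assumes `cds` is in frame and `protein` is its translation, so that
    residue i is encoded by cds[3*i : 3*i + 3]. Returns a list of
    (start, length, codon_counts, purity) tuples.
    """
    run_pattern = re.compile("%s{%d,}" % (re.escape(residue), min_len))
    results = []
    for m in run_pattern.finditer(protein):
        codons = [cds[3 * i:3 * i + 3] for i in range(m.start(), m.end())]
        counts = Counter(codons)
        # Purity of 1.0 means a single repeated codon (a pure tract).
        purity = max(counts.values()) / len(codons)
        results.append((m.start(), len(codons), dict(counts), purity))
    return results

# Toy sequences: one pure CAG tract and one mixed CAA/CAG tract.
pure_cds = "ATG" + "CAG" * 6 + "TAA"
mixed_cds = "ATG" + "CAGCAACAGCAACAACAG" + "TAA"
print(homopeptide_codon_usage(pure_cds, "MQQQQQQ"))
print(homopeptide_codon_usage(mixed_cds, "MQQQQQQ"))
```

A purity close to 1.0 corresponds to a pure trinucleotide tract, which is what slippage would be expected to produce, whereas lower values indicate interrupted, mixed tracts.
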
|
|
- M.M. Albà, M.F. Santibáñez-Koref and J.M. Hancock.
"The comparative genomics of polyglutamine repeats: extreme difference in the codon organization of repeat-encoding regions between mammals and Drosophila."
Journal of Molecular Evolution, 52:249-259 (2001).
- M.M. Albà, M.F. Santibáñez-Koref and J.M. Hancock.
"Conservation of polyglutamine tract size between mouse and human depends on codon interruption."
Molecular Biology and Evolution, 16:1641-1644 (1999).
- M.M. Albà, M.F. Santibáñez-Koref and J.M. Hancock.
"Amino acid reiterations in yeast are over-represented in particular classes of proteins and show evidence of a slippage-like mutational process."
Journal of Molecular Evolution, 49:789-797 (1999).