What is a Motif?


Let D={A,C,G,T} be the alphabet of nucleotide sequences. A
motif (pattern, signal...)
is an object denoting a set of sequences over this alphabet, either in a
deterministic or a probabilistic way.
Given a sequence S and a motif m, we will say that the motif m occurs in
S if any of the sequences denoted by m occurs in S.
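
The definition above can be illustrated for the deterministic case with a minimal Python sketch. It assumes the motif is written out explicitly as a set of strings and that "occurs in S" means "appears as a contiguous substring of S". The function name `occurs` and the example sequences are hypothetical, chosen only for illustration.

```python
def occurs(motif, sequence):
    """Return True if at least one word denoted by `motif` is a substring of `sequence`.

    `motif` is assumed to be an explicit collection of strings over {A, C, G, T}.
    """
    return any(word in sequence for word in motif)


# A deterministic motif given as an explicit set of sequences (illustrative).
motif = {"CTAAAAATAA", "TTAAAAATAA"}

print(occurs(motif, "GGCTAAAAATAAGG"))  # True: the first word occurs at offset 2
print(occurs(motif, "GGGGCCCC"))        # False: neither word occurs
```

Enumerating the set explicitly is only practical for small motifs, which is one reason for the more compact descriptors discussed in the next section.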
A Hierarchy of Motif Descriptors


Sequence motifs can be described in a wide variety of ways.
 Exact Word. The description is a specific sequence over the alphabet.
CTTAAAATAA
 Consensus Sequences. The description allows for the
specification of alternative nucleotides occurring at a given position.
YTWWAAATAR (Consensus MEF2 sequence, Yu et al., 1992)
CTAAAAATAA
TTAAAAATAA
TTTAAAATAA
CTATAAATAA
TTATAAATAA
CTTAAAATAG
TTTAAAATAG
..........
 Regular Expressions. The description is built on an
extension of the original alphabet. Among the new symbols of this extended
alphabet, there are symbols denoting the alternative occurrence of a number of
nucleotides at a given position, and symbols denoting that a given
position may not be present.
C..?[STA]..C[STA][^P]C
(ferredoxin, iron-sulfur binding region signature, PROSITE database, Bairoch, 1991)
 Position Weight Matrices. The description includes a
weight (score, probability, likelihood) for each symbol occurring at each
position along the motif.
Follow the link for An Introduction to Position Weight Matrices
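
The descriptors in this hierarchy can be compared on a small Python sketch. It assumes the standard IUPAC nucleotide ambiguity codes for the consensus sequence, uses Python's `re` module for the regular expression, and scores a position weight matrix as a sum of base-2 log-odds against a uniform background. The helper names, the test strings, and above all the values in `TOY_PWM` are hypothetical and serve only as illustration; they are not taken from any real matrix.

```python
import math
import re
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_consensus(consensus):
    """Enumerate every exact word denoted by an IUPAC consensus string."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in consensus))]


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into an equivalent regular expression."""
    parts = []
    for code in consensus:
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def pwm_score(pwm, window, background=0.25):
    """Sum of log2(p / background) over the positions of `window`."""
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    scored = [(pwm_score(pwm, sequence[i:i + width]), i)
              for i in range(len(sequence) - width + 1)]
    score, start = max(scored)
    return start, score


# Consensus sequence: the MEF2 consensus denotes 2 * 2 * 2 * 2 = 16 exact words.
words = expand_consensus("YTWWAAATAR")
print(len(words))                        # 16
print(consensus_to_regex("YTWWAAATAR"))  # [CT]T[AT][AT]AAATA[AG]

# Regular expression: the PROSITE-style pattern applied to a made-up string.
pattern = re.compile(r"C..?[STA]..C[STA][^P]C")
print(bool(pattern.search("CAASAACSAC")))  # True

# Position weight matrix: one probability column per position (made-up values).
TOY_PWM = [
    {"A": 0.10, "C": 0.40, "G": 0.10, "T": 0.40},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.45, "C": 0.05, "G": 0.05, "T": 0.45},
]
print(best_hit(TOY_PWM, "GGCTAGG")[0])   # 2: the window CTA scores highest
```

Unlike the deterministic descriptors, the matrix does not give a yes/no answer: every window receives a score, and deciding whether the motif "occurs" requires choosing a threshold.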
