Precise control of gene transcription initiation is one of the most important steps in the regulation of
gene expression. The information controlling the initiation of RNA synthesis by RNA
polymerase II is mostly contained in the gene promoter, a region usually 200 to 2000 bp long upstream of
the transcription start site (TSS) of the gene. Transcription factors (TFs) interact with
sequence-specific elements or motifs (the TF binding sites, TFBSs) in the promoter regions. The promoter region
can be seen as a linear array of binding motifs that integrates information about the current status of
the cell to alter the rate of gene transcription initiation. One promoter usually contains 10 to 50
TFBSs that bind 5 to 15 different TFs. TFBSs are typically 5-15 bp long. In addition, TFBSs associated with
the same TF are known to tolerate one or more specific substitutions without losing functionality.
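
As a rough illustration of why tolerated substitutions matter when searching for sites, the following minimal sketch scans a sequence for words that differ from a consensus motif by at most a given number of mismatches. The function name `find_sites`, the toy sequence and the TATA-like consensus are hypothetical examples, not part of ABS, and real TFBSs tolerate specific substitutions rather than arbitrary ones.

```python
def find_sites(promoter: str, consensus: str, max_mismatches: int = 1):
    """Return (position, word, mismatches) for each window close to the consensus.

    Positions are 0-based offsets into the promoter string (an assumption of
    this sketch, not a convention taken from ABS).
    """
    promoter = promoter.upper()
    consensus = consensus.upper()
    width = len(consensus)
    hits = []
    for start in range(len(promoter) - width + 1):
        word = promoter[start:start + width]
        # Hamming distance between the window and the consensus motif
        mismatches = sum(1 for a, b in zip(word, consensus) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, word, mismatches))
    return hits


# Hypothetical toy promoter fragment and consensus
for hit in find_sites("ACGTTATAAAGGCTATATAGC", "TATAAA", max_mismatches=1):
    print(hit)
```

Raising `max_mismatches` quickly increases the number of matching windows, which is one way to see why searches on a single sequence tend to produce many false positives.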
Experimental detection of TFBSs is extremely laborious and complex. Thus, computational approaches have
been widely used to overcome the problem. However, computational searches for TFBSs in a promoter sequence
are often of little practical use because of the high probability of predicting false positives. Recently, phylogenetic
footprinting methods, which align several promoters of related genes, have proved useful for
identifying the conserved sites within regulatory sequences. However, training these programs
is difficult due to the lack of abundant experimental data, especially in the case of orthologous genes.
ABS is a public database of experimentally verified orthologous transcription factor binding sites (TFBSs).
Annotations have been collected from the literature and are manually curated. For each gene,
TFBSs conserved in orthologous sequences from at least two different species must be available.
Promoter sequences as well as the original GenBank or RefSeq entries are additionally supplied in case of
future identification conflicts. The final TSS annotation has been refined using the dbTSS
database. Up to this release, the 500 bp upstream of the annotated transcription start site (TSS)
according to RefSeq annotations have always been extracted to form the collection of promoter sequences
from human, mouse, rat and chicken.
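
The extraction of a fixed-length upstream region can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used to build ABS: the helper names `reverse_complement` and `upstream_region` are hypothetical, coordinates are assumed to be 0-based, and the TSS base itself is assumed to be excluded from the promoter.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq: str, tss: int, strand: str = "+", length: int = 500) -> str:
    """Return up to `length` bp upstream of the TSS, read 5' to 3' on the gene strand.

    Assumptions of this sketch: `tss` is a 0-based index into `chrom_seq`,
    and the TSS base is not included in the returned sequence.
    """
    if strand == "+":
        start = max(0, tss - length)  # clip at the start of the sequence
        return chrom_seq[start:tss]
    if strand == "-":
        # On the minus strand, "upstream" lies at higher genomic coordinates
        end = min(len(chrom_seq), tss + 1 + length)
        return reverse_complement(chrom_seq[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Hypothetical toy sequence; a real call would use length=500
print(upstream_region("ACGTACGTAGCT", tss=6, strand="+", length=4))
print(upstream_region("ACGTACGTAGCT", tss=5, strand="-", length=4))
```

Handling the strand explicitly matters because promoters of genes on the minus strand must be reverse complemented before they can be compared with their orthologs.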
For each regulatory site, the position, the motif and the sequence in which the site is present
are available in a very simple format. Cross-references to EntrezGene, PubMed and RefSeq are also
provided for each annotation. Apart from the experimental promoter annotations, predictions from
popular collections of weight matrices are provided for each promoter sequence. In addition, global and
local alignments and graphical dotplots are available.
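
To give an idea of how a weight matrix yields predictions on a promoter, here is a minimal sketch that converts per-position base counts into log-odds scores and reports the windows scoring above a threshold. The counts, the threshold, the pseudocount and the uniform background are hypothetical choices made for illustration; they are not taken from any matrix collection used in ABS.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.get(base, 0) for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(promoter, matrix, threshold):
    """Return (position, word, score) for every window scoring at or above the threshold."""
    promoter = promoter.upper()
    width = len(matrix)
    hits = []
    for start in range(len(promoter) - width + 1):
        word = promoter[start:start + width]
        if any(base not in "ACGT" for base in word):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(word))
        if score >= threshold:
            hits.append((start, word, round(score, 2)))
    return hits


# Hypothetical counts for a 4 bp TATA-like motif
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
toy_matrix = log_odds_matrix(toy_counts)
print(scan("GGTATACC", toy_matrix, threshold=5.0))
```

The choice of threshold trades sensitivity against false positives, which is why predictions of this kind are best read alongside the experimentally verified annotations.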
Glossary
Gene:
Region of DNA that controls a discrete hereditary characteristic
Transcription:
Copying of one strand of DNA into a complementary RNA sequence by the enzyme RNA polymerase
Gene Promoter:
Sequence of DNA upstream of the gene to which RNA polymerase binds to begin transcription
Transcription Factors:
Proteins required to initiate or regulate transcription in eukaryotes
Transcription Factor Binding Site:
Short fragment of DNA in the promoter recognized and bound by a certain transcription factor
Orthologous binding sites:
A binding site conserved and functional in the orthologous copy of a gene in another species
Pattern:
A word on a sequence representing a functional element
Pattern Discovery:
A search of related patterns that are previously unknown in a set of sequences
Copyright © 2005
ABS is licensed under the GNU General Public License.