|geneid documentation: 8. geneid parameter file|
geneid relies on a parameter file to build its predictions. Most of the parameter file describes the probabilistic model on which the predictions are based. At the end it also contains the so-called gene model, the set of rules describing how to chain gene elements (such as exons) into gene predictions. Through the gene model and the O/R options, geneid supports the integration of predictions from multiple sources.
|The GENE MODEL|
The gene model is the list of rules describing the constraints under which predicted gene elements must be joined together in the final output. These constraints refer to the succession of elements in the gene structure and to the range of allowed distances between them.
For instance, a rule can state that elements (exons) of type
Internal or Terminal must be chained immediately after
elements of type First or Internal in the forward strand.
The third column of a rule gives the range of distances at which the
elements can be chained. In this example, the predicted elements must be
at least 40 bp and at most 11000 bp apart.
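As a rough illustration of how such a rule could be applied, the Python sketch below parses a rule and checks whether two elements may be chained. It assumes a three-column layout (upstream types, downstream types, and a `min:max` distance range), with types separated by colons and carrying a `+` or `-` strand suffix. It also assumes that the distance is measured from the end of the upstream element to the start of the downstream one. The rule string, the `Rule` class, and the function names are hypothetical; this is not geneid's own code, and the layout should be checked against an actual parameter file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    upstream: frozenset    # types allowed on the left, e.g. {"First+", "Internal+"}
    downstream: frozenset  # types allowed on the right
    min_dist: int
    max_dist: float        # float so that an unbounded maximum can be represented


def parse_rule(line: str) -> Rule:
    """Parse one rule in the assumed 'UP1:UP2  DOWN1:DOWN2  MIN:MAX' layout."""
    up, down, dist = line.split()[:3]
    lo, hi = dist.split(":")
    return Rule(frozenset(up.split(":")), frozenset(down.split(":")),
                int(lo), float(hi))


def can_chain(rule, up_type, up_end, down_type, down_start):
    """True if the downstream element may directly follow the upstream one."""
    gap = down_start - up_end  # assumption: distance is end-to-start
    return (up_type in rule.upstream
            and down_type in rule.downstream
            and rule.min_dist <= gap <= rule.max_dist)


# Hypothetical rule matching the description in the text.
rule = parse_rule("First+:Internal+  Internal+:Terminal+  40:11000")

print(can_chain(rule, "First+", 1200, "Internal+", 1500))     # gap of 300 bp
print(can_chain(rule, "First+", 1200, "Internal+", 1220))     # gap of 20 bp
print(can_chain(rule, "Terminal+", 1200, "Internal+", 1500))  # wrong upstream type
```

Only the first call satisfies both the type and the distance constraint.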
The equivalent rule for the reverse strand chains elements of type First or
Internal immediately after elements of type Terminal or Internal.
Two further rules specify the constraints governing connections between
genes. The first rule describes the relationship between the end of a gene
in the forward strand and the beginning of another one in the forward
strand or the end of a gene in the reverse strand.
The second rule defines the connections between the beginning of a gene
in the reverse strand and the beginning of a gene in the forward strand
or the end of another gene in the reverse strand. To specify that there is
no maximum distance constraint, the keyword Infinity must be used.
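The effect of the Infinity keyword can be sketched in Python as follows. The `min:max` notation, the value 500, and the names `parse_range` and `within` are assumptions made for this illustration only; they are not taken from a real parameter file.

```python
import math


def parse_range(text: str):
    """Split 'MIN:MAX' into bounds; 'Infinity' means no upper bound."""
    lo, hi = text.split(":")
    upper = math.inf if hi == "Infinity" else int(hi)
    return int(lo), upper


def within(distance: int, bounds) -> bool:
    """True if the distance between two genes falls inside the bounds."""
    lo, hi = bounds
    return lo <= distance <= hi


bounds = parse_range("500:Infinity")  # hypothetical intergenic range

print(within(300, bounds))        # closer than the minimum distance
print(within(2_000_000, bounds))  # any distance above the minimum is accepted
```

Representing the missing maximum as `math.inf` keeps the comparison uniform, so no special case is needed when a rule has no upper limit.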
The present version of geneid predicts elements of types
First, Internal, Terminal and Single. Gene boundaries are
coded within the program: the features First+ and Terminal-
mark the start of a gene, First- and Terminal+ mark its end, and
Single+ and Single- mark both the start and the end of a gene.
Other elements in the additional files provided externally (O/R options)
are ignored when they are not defined in any rule of the gene model.
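The filtering of external elements described here can be pictured with a small Python sketch. The set of known types, the tuple layout `(type, strand, start, end)`, and the `Promoter` feature are all hypothetical choices for the example; geneid's internal representation may differ.

```python
# Element types assumed to appear in at least one rule of the gene model.
MODEL_TYPES = {
    "First+", "Internal+", "Terminal+", "Single+",
    "First-", "Internal-", "Terminal-", "Single-",
}


def keep_known(features, model_types=MODEL_TYPES):
    """Drop external features whose type and strand are not in the gene model."""
    return [f for f in features if f[0] + f[1] in model_types]


# Hypothetical externally supplied elements: (type, strand, start, end).
external = [
    ("Internal", "+", 1500, 1650),
    ("Promoter", "+", 900, 950),    # not named in any rule, so it is ignored
    ("Terminal", "-", 4000, 4200),
]

for feature in keep_known(external):
    print(feature)
```

Under these assumptions, adding a rule that mentions `Promoter+` would be the way to make such a feature take part in the assembly.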
The rules above define which elements are allowed to start and finish the prediction output. Option -F forces geneid to predict a complete gene structure in the input sequence, that is, either First-(Internal)*-Terminal, Terminal-(Internal)*-First, or a single-exon gene.
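The three complete structures can be written as a regular expression. The Python sketch below checks whether a list of exon types, given in sequence order, forms one of them; the function name and the list-of-strings input are assumptions for illustration and do not reflect how geneid enforces -F internally.

```python
import re

# First-(Internal)*-Terminal, Terminal-(Internal)*-First, or a single-exon gene.
COMPLETE_GENE = re.compile(
    r"First(-Internal)*-Terminal|Terminal(-Internal)*-First|Single"
)


def is_complete_gene(exon_types) -> bool:
    """True if the exon types, in sequence order, form a complete gene."""
    return COMPLETE_GENE.fullmatch("-".join(exon_types)) is not None


print(is_complete_gene(["First", "Internal", "Internal", "Terminal"]))  # forward gene
print(is_complete_gene(["Terminal", "Internal", "First"]))              # reverse gene
print(is_complete_gene(["Single"]))                                     # single exon
print(is_complete_gene(["Internal", "Terminal"]))                       # no First exon
```

The last call is rejected because the structure lacks a First exon, which is the kind of partial gene that a complete-structure requirement excludes.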
|Acceptor Splice sites:|
geneid can predict both the putative branch point and the putative polypyrimidine tract for each predicted acceptor. To obtain these predictions, a separate optional profile must be provided in the parameter file for one or both of these additional signals. A basic acceptor profile that detects the core of the signal, AG, is always mandatory.
Then, the structure
of the parameter file in that section would be:
Enrique Blanco Garcia © 2003