geneid documentation: 6. Introducing external evidence: re-annotation
geneid can merge its own predictions with external evidence (annotations), such as previously annotated genes. The external evidence is read from an additional file in GFF format, which is passed to geneid with the command-line option -R filename. This input file must be sorted by start position (column 4 in GFF).
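The sorting requirement can be met before running geneid. The following is a minimal sketch in Python, not part of geneid itself; the function name `sort_gff_by_start` and the example records are hypothetical and only illustrate ordering by column 4.

```python
def sort_gff_by_start(lines):
    """Return GFF lines with records ordered by start position (column 4).

    Hypothetical helper, not part of geneid. Comment lines (starting
    with '#') are kept at the top; blank lines are dropped.
    """
    header, records = [], []
    for line in lines:
        stripped = line.rstrip("\n")
        if not stripped.strip():
            continue  # skip blank lines
        if stripped.startswith("#"):
            header.append(stripped)
        else:
            records.append(stripped)
    # list.sort() is stable: records with equal starts keep their input order.
    records.sort(key=lambda rec: int(rec.split()[3]))
    return header + records


# Made-up records: names, features and coordinates are illustrative only.
example = [
    "seq1\tevidence\tInternal\t900\t1000\t.\t+\t0\tgene_1",
    "seq1\tevidence\tFirst\t100\t250\t.\t+\t0\tgene_1",
    "seq1\tevidence\tInternal\t400\t520\t.\t+\t.\tgene_1",
]
sorted_example = sort_gff_by_start(example)
```

The sorted lines can then be written to the file given to -R filename.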
The options -O filename and -R filename behave differently. With -O,
only the elements read from the file are assembled; with -R, gene
predictions are built from both the file records and geneid's own
predictions.
If the elements in the input file are assigned a score, they will
"compete" with geneid's original predictions (if any) for a place in the final gene
structure. For instance, a record with a numeric score in column 6 competes against the ab initio predictions.
If no score is given for the element (a dot "." in the score column), the element is
treated as mandatory (forced) in the final gene prediction, unless a conflicting
element with no score is also given in the input file. For instance, a record with "." in column 6
will appear in the final prediction.
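The distinction between scored and unscored records can be checked before running geneid. The sketch below is an illustration only, assuming whitespace-separated GFF records with the score in column 6; `split_by_score` and the sample records are hypothetical, and the code does not reproduce geneid's internal logic.

```python
def split_by_score(records):
    """Separate evidence records into forced and competing ones.

    Hypothetical helper: a record whose score column (column 6) is "."
    is treated as forced; any numeric score marks a competing record.
    """
    forced, competing = [], []
    for rec in records:
        score = rec.split()[5]
        if score == ".":
            forced.append(rec)
        else:
            float(score)  # raises ValueError if the score is not numeric
            competing.append(rec)
    return forced, competing


# Made-up records for illustration only.
records = [
    "seq1\tevidence\tFirst\t100\t250\t.\t+\t0\tgene_1",
    "seq1\tevidence\tInternal\t400\t520\t3.25\t+\t1\tgene_1",
]
forced, competing = split_by_score(records)
```

Here the first record would be forced into the prediction and the second would compete with the ab initio exons.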
The frame can either be set in column 8 of the GFF record or
left unspecified with the wildcard "." when it is unknown. In the latter case,
geneid generates 3 equivalent elements (one per possible frame) and
computes the corresponding remainder for each, so that frame/remainder
consistency is still enforced during assembly. For instance, a record with "."
in column 8 is internally expanded into 3 exons, which are added to the set of
candidate exons for the final gene prediction.
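The expansion into three frames can be sketched as follows. This is an illustration under stated assumptions, not geneid's confirmed internal formula: the frame is taken as the number of bases to skip before the first complete codon, and the remainder as the number of bases the next exon must supply to complete the last codon. The names `remainder` and `expand_frames` are hypothetical.

```python
def remainder(start, end, frame):
    """Bases needed from the next exon to complete the last codon.

    Assumed definition (not confirmed against geneid's source):
    frame = number of bases to skip before the first complete codon.
    """
    length = end - start + 1
    return (3 - (length - frame) % 3) % 3


def expand_frames(start, end, frame):
    """Return (frame, remainder) pairs for one evidence exon.

    A frame of "." is expanded into the three possible frames;
    a known frame yields a single pair.
    """
    frames = [0, 1, 2] if frame == "." else [int(frame)]
    return [(f, remainder(start, end, f)) for f in frames]


# A made-up exon spanning positions 1..100 with unknown frame.
candidates = expand_frames(1, 100, ".")
```

Under these assumptions, an upstream exon with a given remainder is compatible with a downstream exon whose frame equals that remainder, which is the consistency the assembly step has to preserve.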
By using the optional group field (column 9 in GFF format), the user
can specify whether an annotated gene (annotation) introduced
in the input file must be preserved as a whole if it is incorporated into the final results,
or whether geneid predictions may be mixed into that annotation.
For instance, when the records of an annotated gene carry a group identifier,
that gene is preserved in the final output. When the same gene is given
without a group identifier, the final output may contain predictions in which
its exons are mixed with geneid's own predictions.
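To see which records of an evidence file would be kept together, the records can be collected by their group field. The sketch below is a hypothetical helper, not geneid code. It assumes that a missing column 9 (or a lone ".") means "no group"; the sample records are made up.

```python
from collections import defaultdict


def group_records(records):
    """Collect evidence records by their group field (column 9).

    Hypothetical helper. Assumption: a record with no ninth column,
    or with "." there, has no group identifier.
    """
    grouped = defaultdict(list)
    ungrouped = []
    for rec in records:
        fields = rec.split(None, 8)  # at most 9 fields; column 9 may contain spaces
        group = fields[8].strip() if len(fields) > 8 else ""
        if group and group != ".":
            grouped[group].append(rec)
        else:
            ungrouped.append(rec)
    return dict(grouped), ungrouped


# Made-up records for illustration only.
records = [
    "seq1\tevidence\tFirst\t100\t250\t.\t+\t0\tgene_1",
    "seq1\tevidence\tTerminal\t400\t520\t.\t+\t1\tgene_1",
    "seq1\tevidence\tInternal\t900\t1000\t.\t+\t0",
]
grouped, ungrouped = group_records(records)
```

In this example the two gene_1 exons form one annotation to be preserved, while the last record has no group and could be combined with predicted exons.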
Enrique Blanco Garcia © 2001