geneid can merge its own predictions with external
evidence (annotations), such as previously annotated genes. The external
evidence is read from an additional file in gff format. geneid
is instructed to read this file with the command line option -R filename.
This input file must be sorted by start position (column 4 in gff).
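
The sorting requirement can be met with a small preprocessing step. The following is a minimal sketch, not part of geneid: the function name `sort_gff_by_start` is hypothetical, and it assumes whitespace-separated columns, a single sequence per file, and that blank lines and lines starting with "#" can be dropped.

```python
def sort_gff_by_start(lines):
    """Return the GFF record lines ordered by start position (column 4).

    Assumption: columns are separated by whitespace and all records
    belong to the same sequence.
    """
    records = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # drop blank lines and comment lines
        fields = stripped.split()
        records.append((int(fields[3]), stripped))  # column 4 = start
    records.sort(key=lambda pair: pair[0])  # stable: ties keep input order
    return [text for _, text in records]


unsorted_records = [
    "AE002566  external First     30732    30775   -1.11 -  0  gene_2",
    "AE002566  external Terminal  21839    22922   18.37 -  1  gene_2",
    "AE002566  external Internal  23679    24029    7.99 -  1  gene_2",
]
sorted_records = sort_gff_by_start(unsorted_records)
for record in sorted_records:
    print(record)
```

The sorted lines can then be written to the file passed with -R filename.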
The options -O filename and -R filename behave differently. With -O, only
the elements read from the file are assembled into genes; with -R, gene
predictions are built from both the file records and geneid's own predictions.
If the elements in the input file are assigned a score, they will
"compete" with geneid's own predictions (if any) for a place in the final gene
structure. For instance, this record will compete against ab initio predictions:
| 
AE002566   lab_XX  Internal    66255    66323    3.27  +  1
 | 
If no score is given for an element (a dot "." in the score column), the element is
treated as mandatory (forced) in the final gene prediction, unless a conflicting
element with no score is also given in the input file. For instance, this record
will be in the final prediction:
| 
AE002566   lab_XX  Internal    66255    66323    .  +  1
 | 
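
The distinction between scored and unscored records can be checked before running geneid. This is only an illustrative sketch: the function `evidence_mode` and the labels "competing" and "forced" are hypothetical names chosen here, and whitespace-separated columns are assumed.

```python
def evidence_mode(record):
    """Classify one GFF evidence record by its score column (column 6).

    Returns "forced" when the score is the dot ".", otherwise "competing".
    The labels are illustrative and are not geneid terminology.
    """
    fields = record.split()
    score = fields[5]  # column 6 = score
    if score == ".":
        return "forced"
    float(score)  # raises ValueError if the score is not numeric
    return "competing"


scored = "AE002566   lab_XX  Internal    66255    66323    3.27  +  1"
unscored = "AE002566   lab_XX  Internal    66255    66323    .  +  1"
print(evidence_mode(scored))    # competing
print(evidence_mode(unscored))  # forced
```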
The frame can either be set in column 8 of the gff format or
replaced by the wildcard "." when it is unknown. In the latter case,
geneid generates 3 equivalent elements (one per possible frame) and
computes the corresponding remainder for each, so that frame/remainder
consistency is still enforced during assembly. For instance, the first record below is
internally expanded into the 3 exons that follow it, which are added to the set of
candidate exons for the final gene prediction:
| 
AE002566   lab_XX  Internal    66255    66323    3.27  +  .
AE002566   lab_XX  Internal    66255    66323    3.27  +  0
AE002566   lab_XX  Internal    66255    66323    3.27  +  1
AE002566   lab_XX  Internal    66255    66323    3.27  +  2
 | 
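
The expansion can be illustrated with a short sketch. It is not geneid's implementation: the function names are hypothetical, and the "leftover" value is computed here as the number of nucleotides remaining after the last complete codon, which is an assumption; geneid's internal remainder convention may differ.

```python
def expand_unknown_frame(record):
    """Return one field list per candidate frame.

    A record whose frame column (column 8) is "." yields three copies,
    with frames 0, 1 and 2; any other record is returned unchanged.
    """
    fields = record.split()
    if fields[7] != ".":
        return [fields]
    return [fields[:7] + [str(frame)] + fields[8:] for frame in (0, 1, 2)]


def leftover_nucleotides(start, end, frame):
    """Nucleotides left after the last complete codon (assumed definition)."""
    length = end - start + 1
    return (length - frame) % 3


record = "AE002566   lab_XX  Internal    66255    66323    3.27  +  ."
candidates = expand_unknown_frame(record)
for fields in candidates:
    frame = int(fields[7])
    leftover = leftover_nucleotides(int(fields[3]), int(fields[4]), frame)
    print(" ".join(fields), "leftover:", leftover)
```

For this 69-nucleotide exon, the three frames 0, 1 and 2 give leftovers of 0, 2 and 1 under the definition assumed above.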
With the optional group field (column 9 in gff format), the user
can specify whether an annotated gene (annotation) supplied
in the input file must be preserved as a whole if it is incorporated in the final results,
or whether geneid predictions may be mixed into that annotation.
For instance, given annotation (1), the gene will be preserved
in the final output. Given annotation (2), which describes the same gene
but without a group identifier, we can obtain predictions such as
those in (3):
| 
(1)
AE002566  external Terminal  21839    22922   18.37 -  1  gene_2
AE002566  external Internal  23679    24029    7.99 -  1  gene_2
AE002566  external First     30732    30775   -1.11 -  0  gene_2
(2)
AE002566  external Terminal  21839    22922   18.37 -  1  
AE002566  external Internal  23679    24029    7.99 -  1  
AE002566  external First     30732    30775   -1.11 -  0
(3)
AE002566     external  Terminal  21839    22922   18.37 -  1  
AE002566     external  Internal  23679    24029    7.99 -  1  
AE002566  geneid_v1.1  Internal  28002    28007    1.14 -  1 
AE002566     external  First     30732    30775   -1.11 -  0
 |
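
To see which records in an evidence file carry a group identifier, they can be separated by column 9. This is a minimal sketch with a hypothetical function name, `split_by_group`, assuming whitespace-separated columns; it only inspects the file and does not reproduce how geneid assembles genes.

```python
from collections import defaultdict


def split_by_group(lines):
    """Separate GFF records into grouped and ungrouped sets.

    Records with a ninth column are collected under that group
    identifier; records without one are returned in a separate list.
    """
    grouped = defaultdict(list)
    ungrouped = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 9:
            grouped[fields[8]].append(line)  # column 9 = group
        else:
            ungrouped.append(line)
    return dict(grouped), ungrouped


annotation = [
    "AE002566  external Terminal  21839    22922   18.37 -  1  gene_2",
    "AE002566  external Internal  23679    24029    7.99 -  1  gene_2",
    "AE002566  external First     30732    30775   -1.11 -  0",
]
groups, loose = split_by_group(annotation)
print(sorted(groups))  # group identifiers found in the file
print(len(loose))      # number of records without a group
```

Records that end up in the ungrouped list are the ones into which geneid predictions may be mixed, as in example (3).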