genamic.c geneid v 1.1 source documentation

genamic assembles the best gene from an input set of exons using dynamic programming. The best gene is the chain of linked exons with the highest sum of scores among all chains that respect the allowed assembly rules (the gene model). The result is guaranteed to be optimal.

Input exons must be sorted by acceptor (left, minor) position. Within genamic, for every assembly rule (class), exons are also sorted by donor (right, major) position. The goal is to assemble the best gene ending at every exon and then return the highest-scoring gene produced. Essentially, every input exon has three options: remain alone, join the gene assembled for the previous input exon, or join the best gene that ends between the previous input exon and the current one. Because of the acceptor and donor orderings, every exon (and the gene ending with it) is used only once, so genamic runs in linear time with respect to the number of input exons.

Annotations or evidence may be included among the predicted exons. The group field is then used either to allow ab initio and evidence exons to be mixed, or, by means of a blocked rule, to force that only exons of the same group can be joined. To block a rule, use the block keyword in the selected rule of the gene model.
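
The chaining idea described above can be illustrated with a minimal sketch. It is not the geneid implementation: it assumes a single assembly class, one strand, no frames, and no distance or group constraints. SketchExon, DonorKey, cmpDonor and sketchGenamic are hypothetical names introduced only for this example. Exons are visited in acceptor order while a second pointer advances through the same exons in donor order, so each exon is consumed once in each ordering.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified exon record (not geneid's exonGFF). */
typedef struct {
    long acceptor;     /* left (minor) position */
    long donor;        /* right (major) position */
    double score;      /* score of the exon alone */
    double geneScore;  /* output: score of the best gene ending with this exon */
    int prev;          /* output: previous exon in that gene, -1 if none */
} SketchExon;

/* Helper used to visit the exons in donor order. */
typedef struct { long donor; int idx; } DonorKey;

static int cmpDonor(const void *a, const void *b) {
    const DonorKey *x = (const DonorKey *)a;
    const DonorKey *y = (const DonorKey *)b;
    return (x->donor > y->donor) - (x->donor < y->donor);
}

/* e[] must already be sorted by acceptor.
   Returns the index of the last exon of the best gene, or -1. */
int sketchGenamic(SketchExon *e, int n) {
    if (n <= 0) return -1;
    DonorKey *d = malloc((size_t)n * sizeof *d);
    if (d == NULL) return -1;
    for (int i = 0; i < n; i++) {
        assert(e[i].acceptor <= e[i].donor);
        d[i].donor = e[i].donor;
        d[i].idx = i;
    }
    qsort(d, (size_t)n, sizeof *d, cmpDonor);

    int j = 0;           /* next exon in donor order not yet released */
    int bestEnded = -1;  /* best gene ending before the current acceptor */
    int best = -1;       /* best gene seen overall */
    for (int i = 0; i < n; i++) {
        /* Release every gene whose last exon ends before exon i starts.
           Such an exon has a smaller acceptor, so its geneScore is final. */
        while (j < n && d[j].donor < e[i].acceptor) {
            int k = d[j].idx;
            if (bestEnded < 0 || e[k].geneScore > e[bestEnded].geneScore)
                bestEnded = k;
            j++;
        }
        /* Either remain alone or join the best gene released so far. */
        e[i].geneScore = e[i].score;
        e[i].prev = -1;
        if (bestEnded >= 0 && e[bestEnded].geneScore > 0.0) {
            e[i].geneScore += e[bestEnded].geneScore;
            e[i].prev = bestEnded;
        }
        if (best < 0 || e[i].geneScore > e[best].geneScore) best = i;
    }
    free(d);
    return best;
}
```

The returned index identifies the last exon of the best gene, and following the prev links from it recovers the whole chain. In this sketch the sort by donor costs O(n log n), while the sweep itself touches each exon once per ordering.
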
void genamic(exonGFF* E,
             long nExons,
             packGenes* pg,
             gparam* gp)
genamic processing:
  • Prepare reverse exons to be used.
  • Sort the input set of exons by donor.
  • For every exon, look up its type identifier in the dictionary.
  • For every class in which this exon is the downstream element, get the current temporary best gene for that class and frame:
    • Check maximum/minimum distances
    • Update the best temporary gene between the previous and current exon
    • Assemble the best temporary gene with the exon, checking groups
    • Update the best gene pointer
  • Restore reverse exons.
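
The distance and group checks in the steps above could look roughly like the following sketch. All names here (SketchRule, SketchSite, canJoin) are hypothetical, and the distance definition (number of bases strictly between the upstream donor and the downstream acceptor) is an assumption made for illustration. The actual geneid structures and conventions may differ.

```c
#include <string.h>

/* Hypothetical assembly rule: distance limits plus a "blocked" flag. */
typedef struct {
    long minDist;  /* minimum allowed distance between the two exons */
    long maxDist;  /* maximum allowed distance between the two exons */
    int blocked;   /* nonzero: only exons of the same group may be joined */
} SketchRule;

/* Hypothetical exon view holding only what the check needs. */
typedef struct {
    long acceptor;      /* left (minor) position */
    long donor;         /* right (major) position */
    const char *group;  /* group label, e.g. an evidence identifier */
} SketchSite;

/* Returns 1 if the upstream exon may be linked to the downstream exon
   under the given rule, 0 otherwise. */
int canJoin(const SketchSite *up, const SketchSite *down,
            const SketchRule *rule) {
    /* Assumed definition: bases strictly between donor and acceptor. */
    long dist = down->acceptor - up->donor - 1;
    if (dist < rule->minDist || dist > rule->maxDist)
        return 0;
    /* A blocked rule only links exons that share the same group. */
    if (rule->blocked && strcmp(up->group, down->group) != 0)
        return 0;
    return 1;
}
```

With a rule that is not blocked, only the distance limits apply, so ab initio and evidence exons can be mixed freely. With a blocked rule, the same pair is rejected unless both exons carry the same group label.
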

Enrique Blanco Garcia © 2001