BuildInternalExons.c geneid v 1.2 source documentation


Description:
Internal exon construction: internal exons are DNA regions that begin at an acceptor splice site (AG) and end at a donor splice site (GT), and that contain no in-frame stop codon. The first and the last codon of an internal exon may be incomplete (frame and remainder). To obtain the true reading frame, the length of the incomplete first codon must therefore be added to the left position of the exon, so that stop codons in frame with this new position can be searched for. Two exons (E, F) may be assembled (E-F) only if remainder(E) + frame(F) = 3, so that together they build a complete codon.
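The frame and remainder bookkeeping described above can be illustrated with a minimal sketch. It is not the geneid implementation: the helper names (is_stop, has_stop_in_frame, exon_remainder, frames_compatible) are hypothetical, positions are assumed to be 0-based and inclusive, and the sequence is assumed to be uppercase. The rule remainder(E) + frame(F) = 3 is taken from the description; treating the case where both values are 0 as compatible is an additional assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of geneid): 1 if the three
   nucleotides starting at s are TAA, TAG or TGA. */
static int is_stop(const char *s)
{
    return strncmp(s, "TAA", 3) == 0 ||
           strncmp(s, "TAG", 3) == 0 ||
           strncmp(s, "TGA", 3) == 0;
}

/* 1 if the region [left, right] contains a complete stop codon when
   read in the given frame, where frame is the number of nucleotides
   (0, 1 or 2) of the incomplete first codon skipped at the left end. */
static int has_stop_in_frame(const char *seq, long left, long right, int frame)
{
    long i;

    assert(frame >= 0 && frame < 3);
    for (i = left + frame; i + 2 <= right; i += 3)
        if (is_stop(seq + i))
            return 1;
    return 0;
}

/* Nucleotides left over after the last complete codon
   (assumed meaning of "remainder"). */
static int exon_remainder(long left, long right, int frame)
{
    return (int)((right - left + 1 - frame) % 3);
}

/* Assembly rule from the description: remainder(E) + frame(F) = 3.
   Assumption: 0 + 0 is accepted too, since no codon is split. */
static int frames_compatible(int remainder_e, int frame_f)
{
    return (remainder_e + frame_f) % 3 == 0;
}
```

For the sequence "CCATGTAACCGT" read from position 0 to 11, frames 0 and 1 contain no stop codon, while frame 2 reads ATG TAA and is therefore closed by a stop.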
Briefing:
long BuildInternalExons(site *Acceptor, long nAcceptors, 
                        site *Donor, long nDonors,
                        site *Stop, long nStops,
                        int MaxDonors,
                        char* Sequence,
                        exonGFF* Exon)
Every list of signals is already sorted by position, as a result of the signal prediction process. For every pair (Acceptor, Donor), up to three exons may be built, one per reading frame, depending on how many frames are closed by the Stops that follow the current Acceptor. The number of exons that can be built is limited (parameter MaxDonors, per frame); they are stored in a local array allocated during processing. A minimum exon length (EXONLENGTH) is also required. If there is no room for more exons beginning at the current Acceptor, the worst donor is rejected. The output list of exons is sorted by Acceptor position. From the nucleotides at the end of each exon, some information is computed to detect possible Stop codons when assembling with other exons.
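The pairing procedure can be sketched as follows, under simplifying assumptions. This is not the geneid code: the type sketch_exon, the function build_internal_exons_sketch and the constant MIN_EXON_LENGTH (a stand-in for EXONLENGTH, whose real value is not given here) are hypothetical. Signals are reduced to plain positions: an acceptor position is taken as the first nucleotide of the exon, a donor position as the last one, and a stop position as the first nucleotide of the stop codon. The sketch stops adding donors once max_donors exons exist for a frame, instead of scoring donors and rejecting the worst one.

```c
#include <assert.h>

#define MIN_EXON_LENGTH 6   /* hypothetical stand-in for EXONLENGTH */

typedef struct {
    long acceptor;    /* first nucleotide of the exon (assumed) */
    long donor;       /* last nucleotide of the exon (assumed)  */
    int  frame;       /* nucleotides of the incomplete first codon */
    int  remainder;   /* nucleotides left after the last full codon */
} sketch_exon;

/* All three position lists must be sorted in ascending order.
   Returns the number of exons written to out (at most out_cap). */
static long build_internal_exons_sketch(const long *acc, long n_acc,
                                        const long *don, long n_don,
                                        const long *stop, long n_stop,
                                        int max_donors,
                                        sketch_exon *out, long out_cap)
{
    long n_out = 0, first_don = 0, first_stop = 0, i, j, k;

    assert(max_donors > 0);
    for (i = 0; i < n_acc; i++) {
        long closing[3] = { -1, -1, -1 };  /* first stop per frame */
        int built[3] = { 0, 0, 0 };        /* exons built per frame */
        int f, found = 0;

        /* Sorted input: signals upstream of this acceptor are also
           upstream of every later acceptor, so skip them for good. */
        while (first_stop < n_stop && stop[first_stop] < acc[i]) first_stop++;
        while (first_don < n_don && don[first_don] < acc[i]) first_don++;

        /* Find the first downstream stop codon in each frame. */
        for (k = first_stop; k < n_stop && found < 3; k++) {
            f = (int)((stop[k] - acc[i]) % 3);
            if (closing[f] < 0) { closing[f] = stop[k]; found++; }
        }

        for (j = first_don; j < n_don; j++) {
            long len = don[j] - acc[i] + 1;
            int open = 0;

            for (f = 0; f < 3; f++) {
                /* Frame is closed if its stop lies fully inside the exon. */
                if (closing[f] >= 0 && closing[f] + 2 <= don[j]) continue;
                open++;
                if (len < MIN_EXON_LENGTH || built[f] >= max_donors ||
                    n_out >= out_cap)
                    continue;
                out[n_out].acceptor = acc[i];
                out[n_out].donor = don[j];
                out[n_out].frame = f;
                out[n_out].remainder = (int)((len - f) % 3);
                n_out++;
                built[f]++;
            }
            if (open == 0) break;  /* every frame closed: stop scanning */
        }
    }
    return n_out;
}
```

With one acceptor at position 0, donors at 8 and 14, and a stop codon starting at position 4, frame 1 is closed for both donors, so the sketch yields four exons (frames 0 and 2 for each donor) when max_donors is 2, and two exons when max_donors is 1. Because the outer loop walks the sorted acceptors in order, the output is sorted by acceptor position, as stated above.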




Enrique Blanco Garcia © 2003