CookingGenes.c geneid v 1.1 source documentation


Description:
Processing of predicted genes in order to extract more information (gene score, number of exons, ...) about every separate gene from a multiple-gene list of linked exons. Depending on the output format, the protein product or the genomic sequence of the exons may be displayed.
Briefing:
void printProt(char* Name,
               long ngen,
               char* prot,
               long nAA,
               int mode)
Display a biological sequence in fasta format (FASTALINE characters per line), preceded by a predefined header line starting with ">".
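The line wrapping described above can be sketched as follows. This is a minimal illustration, not the geneid implementation: the function name format_fasta, the output buffer, and the width of 60 (a stand-in for FASTALINE, whose actual value is not given here) are all assumptions, and the ngen and mode parameters of printProt are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for geneid's FASTALINE; the real value may differ. */
#define FASTA_WIDTH ((size_t)60)

/* Hypothetical helper: writes ">header" followed by seq wrapped at
   FASTA_WIDTH characters per line into out. Returns the number of
   characters written (excluding the final NUL), or -1 if out is too small. */
long format_fasta(const char *header, const char *seq,
                  char *out, size_t outsize)
{
    size_t len, i, pos;
    int n;

    assert(header != NULL && seq != NULL && out != NULL);

    n = snprintf(out, outsize, ">%s\n", header);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    pos = (size_t)n;

    len = strlen(seq);
    for (i = 0; i < len; i += FASTA_WIDTH) {
        size_t chunk = (len - i < FASTA_WIDTH) ? len - i : FASTA_WIDTH;

        /* Room is needed for the chunk, its newline and the final NUL. */
        if (pos + chunk + 2 > outsize)
            return -1;
        memcpy(out + pos, seq + i, chunk);
        pos += chunk;
        out[pos++] = '\n';
    }
    out[pos] = '\0';
    return (long)pos;
}
```

Writing into a caller-supplied buffer rather than straight to stdout is a choice made only to keep the sketch easy to check; printProt itself displays the sequence directly.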
void selectFeatures(char* exonType,
                    char exonStrand,
                    profile** p1,
                    profile** p2,
                    int* type1,
                    int* type2,
                    int* strand,
                    gparam* gp)
Given an exon and its list of properties, this function returns the types of its two signals, together with their profiles, and its strand (useful for printing in extended format).
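The mapping from exon type to signal pair can be sketched as below. A First exon is bounded by a start codon and a donor site, an Internal exon by an acceptor and a donor, a Terminal exon by an acceptor and a stop codon, and a Single exon by a start and a stop codon. Everything else in the sketch is an assumption: the name select_signals, the enum Signal, the exon type strings, and the choice to swap the two signals on the reverse strand so that they are reported in left-to-right sequence order. Profiles and the gparam structure are omitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical signal codes; geneid defines its own constants. */
enum Signal { SIG_START, SIG_ACCEPTOR, SIG_DONOR, SIG_STOP };

/* Hypothetical helper: fills *left and *right with the signals bounding
   an exon of the given type and strand. Returns 0 on success, -1 if the
   type or the strand is not recognised. */
int select_signals(const char *exon_type, char exon_strand,
                   enum Signal *left, enum Signal *right)
{
    enum Signal five, three;   /* 5' and 3' signals of the exon */

    assert(exon_type != NULL && left != NULL && right != NULL);

    if (strcmp(exon_type, "First") == 0) {
        five = SIG_START;    three = SIG_DONOR;
    } else if (strcmp(exon_type, "Internal") == 0) {
        five = SIG_ACCEPTOR; three = SIG_DONOR;
    } else if (strcmp(exon_type, "Terminal") == 0) {
        five = SIG_ACCEPTOR; three = SIG_STOP;
    } else if (strcmp(exon_type, "Single") == 0) {
        five = SIG_START;    three = SIG_STOP;
    } else {
        return -1;
    }

    if (exon_strand == '+') {
        *left = five;  *right = three;
    } else if (exon_strand == '-') {
        /* Assumption: on the reverse strand the 3' signal lies to the left. */
        *left = three; *right = five;
    } else {
        return -1;
    }
    return 0;
}
```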
long CookingInfo(exonGFF* eorig,
                 gene info[],
                 long* nvExons)
Post-processing of the best genes (geneid format). In geneid, genes are lists of exons linked from the end of the gene to the beginning. Starting from the last exon of the last gene and following the links backwards (bottom-up), the predicted genes are recovered and separated. Forward genes are therefore traversed as -- (BOTTOM) Terminal > Internal > ... > First (TOP) -- and reverse genes as -- (BOTTOM) First > Internal > ... > Terminal (TOP) --. During this pointer jumping, the gene score, the first and last exon, and the number of exons of the current gene are recorded. In addition, the number of annotations used in the final gene prediction is returned through nvExons. The last exon found must be the GHOST exon, an artificial exon (marked with strand *) that signals the end of the prediction. Returns the number of genes found in the input list of exons (based on the standard set of features provided together with the gene model).
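The bottom-up walk can be sketched as follows. This is a simplified illustration under stated assumptions, not the geneid code: the Exon and GeneInfo structures, the names split_genes and is_top, and the max_genes bound are hypothetical, a gene is closed only when its TOP exon (First on the forward strand, Terminal on the reverse strand, or a Single exon) has been consumed or the ghost exon is reached, and the nvExons count is not modelled.

```c
#include <assert.h>
#include <stddef.h>

enum ExonType { FIRST, INTERNAL, TERMINAL, SINGLE };

/* Hypothetical, reduced stand-in for exonGFF. */
typedef struct Exon {
    enum ExonType type;
    char strand;              /* '+', '-', or '*' for the ghost exon */
    double score;
    const struct Exon *prev;  /* link towards the beginning of the sequence */
} Exon;

/* Hypothetical, reduced stand-in for the gene record. */
typedef struct {
    const Exon *bottom;       /* first exon met during the walk */
    const Exon *top;          /* last exon met during the walk */
    double score;
    long nexons;
} GeneInfo;

/* TOP of a gene: First (forward), Terminal (reverse) or a Single exon. */
static int is_top(const Exon *e)
{
    if (e->type == SINGLE)
        return 1;
    return (e->strand == '+') ? (e->type == FIRST) : (e->type == TERMINAL);
}

/* Walks from the last exon back to the ghost exon, filling one GeneInfo
   per gene. Returns the number of genes found (at most max_genes). */
long split_genes(const Exon *last, GeneInfo info[], long max_genes)
{
    long ngenes = 0;
    const Exon *e = last;

    assert(last != NULL && info != NULL);

    while (e != NULL && e->strand != '*' && ngenes < max_genes) {
        GeneInfo *g = &info[ngenes];

        g->bottom = e;
        g->score = 0.0;
        g->nexons = 0;
        for (;;) {
            int done = is_top(e);

            g->top = e;
            g->score += e->score;
            g->nexons++;
            e = e->prev;
            if (done || e == NULL || e->strand == '*')
                break;
        }
        ngenes++;
    }
    return ngenes;
}
```

With a chain holding one forward gene (First, Internal, Terminal) followed by one reverse gene (Terminal, First), the walk starting at the last exon recovers the reverse gene first and the forward gene second.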
void PrintGene(exonGFF* start,
               exonGFF* end,
               char Name[],
               char* s,
               gparam* gp,
               dict* dAA,
               long igen,
               long nAA,
               int** tAA,
               int nExon,
               int nExons)
Print the input gene recursively (in reversed order), printing every exon in the selected format. The extended format is implemented here (printing, for every exon, complete information about its signals in addition to the exon itself) by calling the routine selectFeatures to get the types of the signals.
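Printing in reversed order through recursion can be sketched as below: the function first recurses along the links and prints each exon only on the way back, so the exons come out in the opposite order to the one in which they are linked. The ExonNode structure, the name print_gene_reversed, the tab-separated output and the text buffer are assumptions made for illustration; the real routine takes many more parameters and handles several output formats.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced stand-in for exonGFF. */
typedef struct ExonNode {
    long start;
    long end;
    const struct ExonNode *prev;  /* link towards the beginning of the gene */
} ExonNode;

/* Appends one "start<TAB>end" line per exon to out, from top up to e.
   out must hold a NUL-terminated string on entry; output that does not
   fit is silently truncated. Returns the number of exons visited. */
long print_gene_reversed(const ExonNode *e, const ExonNode *top,
                         char *out, size_t outsize)
{
    long printed = 0;
    size_t used;

    assert(e != NULL && top != NULL && out != NULL && outsize > 0);

    /* Descend first: the exon closest to top is printed first. */
    if (e != top && e->prev != NULL)
        printed = print_gene_reversed(e->prev, top, out, outsize);

    used = strlen(out);
    snprintf(out + used, outsize - used, "%ld\t%ld\n", e->start, e->end);
    return printed + 1;
}
```

The caller is expected to set out[0] to '\0' before the first call, since every level of the recursion appends to whatever the buffer already contains.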
void CookingGenes(exonGFF *e,
                  char Name[],
                  char* s,
                  gparam* gp,
                  dict* dAA)
Manages the post-processing of genes for pretty-printing. The geneid and XML formats include the protein and genomic sequences, displayed in fasta format; the gff format does not include any sequence.




Enrique Blanco Garcia © 2001