geneid.h geneid v 1.2 source documentation


Description:
Definitions of constants and data types, and prototypes of the functions used outside the module that contains their implementation.
Briefing:
LENGTHSi
Input sequence is divided into fragments having this length.
OVERLAP
Nucleotides shared between 2 consecutive fragments.
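As a rough illustration of how LENGTHSi and OVERLAP interact, the C sketch below counts the fragments needed to cover a sequence. The numeric values and the helper count_fragments are hypothetical, chosen only for this example. They are not taken from geneid.h, and the actual splitting code in geneid may differ.

```c
#include <assert.h>

/* Illustrative values only; the real ones are defined in geneid.h. */
#define LENGTHSi 1000L
#define OVERLAP  100L

/* Hypothetical helper: number of fragments of LENGTHSi nucleotides
   needed to cover n nucleotides when 2 consecutive fragments share
   OVERLAP nucleotides. */
static long count_fragments(long n)
{
    long step = LENGTHSi - OVERLAP;  /* new nucleotides per fragment */
    long count = 0;
    long start;

    assert(n >= 0 && step > 0);

    for (start = 0; start < n; start += step) {
        count++;
        if (start + LENGTHSi >= n)   /* this fragment reaches the end */
            break;
    }
    return count;
}
```

With these example values, each fragment after the first starts LENGTHSi - OVERLAP nucleotides after the previous one, so the last OVERLAP nucleotides of one fragment are read again at the start of the next.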
RSITES
Estimated signal density: one signal is expected to be predicted per RSITES nucleotides of DNA sequence (used to compute NUMSITES).
REXONS
Estimated exon density: one exon is expected to be predicted per REXONS nucleotides of DNA sequence (used to compute NUMEXONS).
RBSITES
Estimated ratio of sites that must be copied between 2 consecutive fragments: one signal per RBSITES nucleotides of DNA sequence (used to compute BACKUPSITES).
RBEXONS
Estimated ratio of exons that must be copied between 2 consecutive fragments: one exon per RBEXONS nucleotides of DNA sequence (used to compute BACKUPEXONS).
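The four ratios above can be read as simple divisions. The C sketch below assumes that each estimate (NUMSITES, NUMEXONS, BACKUPSITES, BACKUPEXONS) is the number of nucleotides considered divided by the corresponding ratio. This is an assumption made for illustration, as is the helper name estimate_items; the exact expressions used in geneid.h are not shown in this document.

```c
#include <assert.h>

/* Hypothetical helper: if one item (signal or exon) is expected per
   `ratio` nucleotides, estimate how many items fit in `length`
   nucleotides. Integer division rounds the estimate down. */
static long estimate_items(long length, long ratio)
{
    assert(length >= 0 && ratio > 0);
    return length / ratio;
}
```

For example, under this assumption a fragment of 1000 nucleotides with one signal expected every 5 nucleotides would reserve room for estimate_items(1000, 5) signals.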
RFIRST
NUMEXONS / RFIRST initial exons are expected to be predicted in LENGTHSi bases.
RINTER
NUMEXONS / RINTER internal exons are expected to be predicted in LENGTHSi bases.
RTERMI
NUMEXONS / RTERMI terminal exons are expected to be predicted in LENGTHSi bases.
RSINGL
NUMEXONS / RSINGL single genes are expected to be predicted in LENGTHSi bases.
RORF
NUMEXONS / RORF ORFs are expected to be predicted in LENGTHSi bases.
FSORT
FSORT * NUMEXONS is the maximum number of predicted exons (both strands) in LENGTHSi bases.
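The per-type estimates above are plain arithmetic on NUMEXONS. The C sketch below works them out for initial, internal and terminal exons, single genes, and the FSORT bound. The struct exon_estimates and the function estimate_exons are hypothetical names introduced for this example; they are not geneid data types, and the values passed in are up to the caller.

```c
#include <assert.h>

/* Hypothetical container for the estimates; not a geneid data type. */
struct exon_estimates {
    long first;     /* NUMEXONS / RFIRST */
    long internal;  /* NUMEXONS / RINTER */
    long terminal;  /* NUMEXONS / RTERMI */
    long single;    /* NUMEXONS / RSINGL */
    long sorted;    /* FSORT * NUMEXONS  */
};

/* Apply the divisions and the product described in the briefing. */
static struct exon_estimates
estimate_exons(long numexons, long rfirst, long rinter,
               long rtermi, long rsingl, long fsort)
{
    struct exon_estimates e;

    assert(numexons >= 0 && fsort > 0);
    assert(rfirst > 0 && rinter > 0 && rtermi > 0 && rsingl > 0);

    e.first    = numexons / rfirst;
    e.internal = numexons / rinter;
    e.terminal = numexons / rtermi;
    e.single   = numexons / rsingl;
    e.sorted   = fsort * numexons;
    return e;
}
```

A larger ratio therefore reserves fewer slots for that exon type, while FSORT scales the total up to cover both strands.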
NUMEVIDENCES
Maximum number of annotations per locus read from optional file.
MAXHSP
Maximum number of HSPs (protein homology) read from optional file (per locus, strand and frame).
MAXNSEQUENCES
Maximum number of loci in multi-FASTA files.
MAXGENE
Maximum number of predicted genes.
MAXEXONGENE
Maximum number of exons per gene.
MAXAA
Maximum length of predicted proteins (in amino acids).
MAXCDNA
Maximum length of exonic sequence for a gene (cDNA).
MAXISOCHORES
Maximum number of isochores in the parameters file.
EXONLENGTH, SINGLEGENELENGTH, ORFLENGTH
Minimum allowed size for internal exons, single genes and ORFs.
MINEXONLENGTH, MINSCORELENGTH
Minimum exon size required to compute protein coding potential, and the score assigned to exons shorter than that size.
ISOCONTEXT
Size of region around an exon when its G+C content is computed.
NULL_OLIGO_SCORE
Penalty for N's in scoring exons.
LOCUSLENGTH
Maximum number of characters in the name of the input sequence.
OLIGOLENGTH
Maximum length for the oligonucleotides used in Markov chains (scoring sites and exons).
VERSION
Current geneid version.
SITES, EXONS, EVIDENCE
Values of the feature field in the GFF standard.
FRAMES, STRANDS, FORWARD, REVERSE, ACC, DON, STA, STO, FIRST, INTERNAL, TERMINAL, SINGLE, ORF, LENGTHCODON, PERCENT, MEGABYTE, MAXTIMES, PROT, DNA, MINUTE
Numerical constants.
sFORWARD, sREVERSE, sACC, sDON, sSTA, sSTO, sFIRST, sINTERNAL, sTERMINAL, sSINGLE, sORF, sEXON, xmlFORWARD, xmlREVERSE
String constants.
BLOCK, NONBLOCK
Mark gene model rules as blocking or non-blocking. Also used to indicate whether or not the group structure in gene annotations (evidences) is preserved.
COFFSET
Correction needed because arrays in C are indexed from 0 to N-1.
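A minimal C sketch of this kind of correction follows. It assumes that COFFSET is 1 and that it converts between 1-based sequence positions and 0-based array indices. Both the value and the helper functions are assumptions made for illustration; they are not taken from geneid.h.

```c
#include <assert.h>

#define COFFSET 1L  /* assumed value, for illustration only */

/* Hypothetical helper: 1-based sequence position to 0-based array index. */
static long position_to_index(long pos)
{
    assert(pos >= COFFSET);
    return pos - COFFSET;
}

/* Hypothetical helper: 0-based array index to 1-based sequence position. */
static long index_to_position(long idx)
{
    assert(idx >= 0);
    return idx + COFFSET;
}
```

Under these assumptions, the first nucleotide of a sequence (position 1) is stored at array index 0, and the last one (position N) at index N-1.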
MAXLINE
Maximum number of characters read from input line.
MAXSTRING
Maximum allowed length of geneid messages and strings.
MAXENTRY, MAXTYPE, MAXINFO, NOTFOUND
Dictionary (hash table) definitions.
HASHFACTOR
Factor used to compute the size of the hash table that stores the best genes (sites, exons).
PARAMETERFILE
Filename of the default parameters file.
FILENAMELENGTH
Maximum size for filenames.
INFI, sINFI, INF
Representations of the infinity value.
MAXSCORE
Default value for annotations forced to appear in the final prediction.
FASTALINE
Maximum length of FASTA lines.
MESSAGE_FREQ
Frequency at which a message reporting the amount of sequence read is displayed.
NOGROUP
String to represent (ab initio) exons without group (field 9 in gff).




Enrique Blanco Garcia © 2003