GetStopCodons.c geneid v 1.1 source documentation


Description:
Search by signal: stop codons are predicted with a Position Weighted Array (a position weight matrix in which every position is a Markov chain rather than a simple nucleotide distribution function). The score of a predicted stop must be higher than a fixed cutoff score. For every stop, the recorded position is the last coding nucleotide before the core TGA|TAG|TAA.
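The recorded-position convention can be illustrated with a minimal sketch that ignores scoring entirely and only locates the core. The helper names is_stop_core and stop_core_positions are hypothetical, and 0-based indexing is an assumption made for this example; neither is taken from the geneid source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s[i..i+2] is one of the
 * stop cores TGA, TAG or TAA, and 0 otherwise. */
static int is_stop_core(const char *s, long i, long len)
{
    if (i < 0 || i + 3 > len)
        return 0;
    return strncmp(s + i, "TGA", 3) == 0 ||
           strncmp(s + i, "TAG", 3) == 0 ||
           strncmp(s + i, "TAA", 3) == 0;
}

/* Hypothetical helper: stores in pos[] the 0-based index of the
 * nucleotide just before each core (at most max entries) and
 * returns how many were stored. A core at index 0 is skipped in
 * this sketch because no nucleotide precedes it. */
static long stop_core_positions(const char *s, long *pos, long max)
{
    long len, i, n = 0;

    assert(s != NULL && pos != NULL);
    len = (long)strlen(s);
    for (i = 1; i + 3 <= len && n < max; i++)
        if (is_stop_core(s, i, len))
            pos[n++] = i - 1;   /* last nucleotide before the core */
    return n;
}
```

For the sequence ATGTAACCTGA, this sketch reports two cores (TAA starting at index 3 and TGA starting at index 8) and records indices 2 and 7.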
Briefing:
long GetStopCodons(char* s,
                   profile* p,
                   site* sc, 
                   long l1, 
                   long l2) 
Scan the input sequence, applying the PWA to every fragment that is a candidate to contain a true signal (fragment length = profile.dimension). Applying the PWA means: for every position i, look up the probability of finding the oligonucleotide (i-k..i) at that position given that the candidate is a real signal, divided by the probability of finding it given that the candidate is a false signal. The Markov chain is different at every position, and the core is the set of consecutive positions where the bias is complete (k fixed nucleotides with probability 0 or 1). If the order of the Markov chain is 0 or 1, the Markov string is looked up directly, while a loop is required for orders higher than 1 (trinucleotides and so on). Candidate regions that score higher than the cutoff are inserted into the result list (array). Before finishing, stop codons at the end of the sequence are also computed, on regions shorter than the dimension of the profile. Returns the number of predicted stops.
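The scan described above can be sketched as follows. This is a toy illustration under stated assumptions, not the geneid implementation: toy_profile, toy_site, toy_scan and the other names are hypothetical stand-ins for the real types and function, k is assumed to be the Markov order (fixed to 1 here), scores are assumed to be summed log-likelihood ratios, l1 and l2 are assumed to be inclusive 0-based scan bounds, and the profile covers only the three core positions. The handling of regions shorter than the profile at the end of the sequence is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define DIM    3         /* toy profile length: only the 3-nt core   */
#define ORDER  1         /* assumed Markov order k                   */
#define NOLIGO 16        /* 4^(ORDER+1) oligonucleotides per position */
#define NEG    (-1000.0) /* stands in for log(0): forbidden oligo    */

/* Hypothetical stand-ins, NOT geneid's profile and site types. */
typedef struct {
    double llr[DIM][NOLIGO]; /* log(P(oligo|true) / P(oligo|false)) */
    double cutoff;
} toy_profile;

typedef struct {
    long   position; /* nucleotide just before the window, 0-based */
    double score;
} toy_site;

static int nt_code(char c)
{
    switch (c) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Base-4 index of the oligonucleotide s[i-k..i], or -1 if it
 * contains a symbol other than A, C, G, T. The loop is the
 * general lookup that works for any order k. */
static int oligo_index(const char *s, long i, int k)
{
    int idx = 0;
    long j;

    for (j = i - k; j <= i; j++) {
        int c = nt_code(s[j]);
        if (c < 0)
            return -1;
        idx = idx * 4 + c;
    }
    return idx;
}

/* Fill a toy order-1 profile that accepts exactly TAA, TAG, TGA. */
static void toy_stop_profile(toy_profile *p)
{
    int j, idx;

    for (j = 0; j < DIM; j++)
        for (idx = 0; idx < NOLIGO; idx++)
            p->llr[j][idx] = NEG;
    for (idx = 3; idx < NOLIGO; idx += 4) /* any base followed by T */
        p->llr[0][idx] = 1.0;
    p->llr[1][12] = 1.0; /* TA */
    p->llr[1][14] = 1.0; /* TG */
    p->llr[2][0]  = 1.0; /* AA */
    p->llr[2][2]  = 1.0; /* AG */
    p->llr[2][8]  = 1.0; /* GA */
    p->cutoff = 0.0;
}

/* Score every DIM-long window inside s[l1..l2] (l2 must be a valid
 * index of s) and keep those scoring above the cutoff. Returns the
 * number of sites stored in out[]. */
static long toy_scan(const char *s, const toy_profile *p,
                     toy_site *out, long max_sites, long l1, long l2)
{
    long n = 0, w;

    assert(s != NULL && p != NULL && out != NULL);
    /* The first window needs ORDER nucleotides of left context. */
    for (w = (l1 < ORDER ? ORDER : l1);
         w + DIM - 1 <= l2 && n < max_sites; w++) {
        double score = 0.0;
        int j, ok = 1;

        for (j = 0; j < DIM && ok; j++) {
            int idx = oligo_index(s, w + j, ORDER);
            if (idx < 0)
                ok = 0;
            else
                score += p->llr[j][idx];
        }
        if (ok && score > p->cutoff) {
            out[n].position = w - 1; /* window equals the core here */
            out[n].score = score;
            n++;
        }
    }
    return n;
}
```

With the toy profile built by toy_stop_profile, scanning ATGGCTTGGTAACC from index 0 to 13 yields a single site at position 8 (the TAA starts at index 9). The TGG windows are rejected because the dinucleotide GG is forbidden at the last position; an order-0 matrix that allowed both A and G at each of the last two positions could not make that distinction.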




Enrique Blanco Garcia © 2001