GENE MODEL STATISTICS FOR A.dorsata2

A subset of 1992 sequences (randomly chosen from the 2491 gene models) was used for training
The user has selected to use 1992 gene models (80 % of total) for training and to set aside 499 annotations (20 % of total) for evaluation
9 of the gene models translate into proteins with in-frame stops within the training set and 4 in the evaluation set (seqs removed).
There are 60 non-canonical donors as part of the training set
There are 5 non-canonical acceptors as part of the training set
There are 2 non-canonical start sites as part of the training set
These gene models correspond to 3214606 coding bases and 8427530 non-coding bases
Deriving a markov model for the coding potential of order 5
The intronic sequences extracted from the gene models have an average length of 707.542, with 2693.541 of SD
Geneid can predict gene models having introns with a minimum length of 30 nucleotides and a maximum of 50000 bases (boundaries used in gene model)
The minimum (user selected) intergenic distance was set to 200 nucleotides whereas the maximum was set to Infinity (boundaries used in gene model)
The GC content of the exonic and intronic sequences is 0.364 (SD 0.069) and 0.169 (SD 0.078) respectively
The gene models used for training contain 13894 exons
The gene models average 7.007 exons per gene (SD 4.492)
The average length of the exons (non-single) in the training set gene models is 225.556 (SD 247.885)
The training set includes 62 single-exon genes (out of 1992) gene models
The donor site profile chosen by the user spans 8 nucleotides: position 29 to 36
The acceptor site profile chosen by the user spans 26 nucleotides: position 5 to 30
The start site profile chosen by the user spans 10 nucleotides: position 28 to 37
The user chose to optimize the internal parameters of geneid based on an artificial contig and therefore NO 10x cross validation will be performed

eWF range : -4.5 to -2.5
oWF range : 0.25 to 0.50

Best parameter file performance obtained using oWF: 0.40 and eWF: -4.00

Sorted performance results (best to worst) for different values of oWF and eWF:

oWF   eWF    SN    SP    CC    SNe   SPe   SNSP  SNg   SPg   SNSPg raME  raWE
0.40  -4.00  0.96  0.95  0.94  0.82  0.83  0.82  0.38  0.35  0.37  0.07  0.07
0.35  -4.00  0.96  0.95  0.94  0.82  0.83  0.82  0.37  0.36  0.37  0.08  0.07
0.45  -4.00  0.96  0.94  0.94  0.82  0.82  0.82  0.38  0.34  0.36  0.07  0.08
0.40  -4.50  0.95  0.95  0.94  0.80  0.84  0.82  0.35  0.36  0.36  0.09  0.05
0.35  -4.50  0.95  0.96  0.94  0.80  0.85  0.82  0.35  0.37  0.36  0.10  0.05
0.30  -4.00  0.95  0.95  0.94  0.81  0.83  0.82  0.35  0.35  0.35  0.08  0.07
0.45  -4.50  0.95  0.95  0.94  0.80  0.83  0.82  0.36  0.35  0.35  0.09  0.06
0.30  -4.50  0.94  0.96  0.93  0.79  0.84  0.82  0.33  0.36  0.34  0.11  0.05
0.40  -3.50  0.97  0.94  0.94  0.83  0.80  0.81  0.40  0.32  0.36  0.06  0.10
0.35  -3.50  0.96  0.94  0.94  0.83  0.80  0.81  0.39  0.32  0.35  0.06  0.10
0.45  -3.50  0.97  0.93  0.93  0.83  0.79  0.81  0.40  0.31  0.35  0.06  0.11
0.30  -3.50  0.96  0.94  0.94  0.82  0.79  0.81  0.36  0.31  0.34  0.06  0.10
0.50  -4.00  0.96  0.94  0.93  0.82  0.80  0.81  0.37  0.31  0.34  0.07  0.09
0.50  -4.50  0.96  0.94  0.93  0.80  0.82  0.81  0.35  0.33  0.34  0.08  0.06
0.50  -3.50  0.97  0.92  0.93  0.83  0.77  0.80  0.39  0.28  0.34  0.06  0.12
0.25  -4.00  0.94  0.95  0.93  0.79  0.81  0.80  0.32  0.32  0.32  0.09  0.07
0.25  -4.50  0.93  0.96  0.93  0.77  0.84  0.80  0.30  0.34  0.32  0.12  0.05
0.25  -3.50  0.95  0.94  0.93  0.81  0.78  0.80  0.34  0.29  0.31  0.07  0.11
0.40  -3.00  0.97  0.92  0.93  0.84  0.75  0.79  0.40  0.27  0.33  0.04  0.15
0.35  -3.00  0.97  0.92  0.93  0.84  0.75  0.79  0.38  0.27  0.33  0.05  0.15
0.45  -3.00  0.97  0.91  0.92  0.84  0.74  0.79  0.40  0.26  0.33  0.04  0.16
0.30  -3.00  0.96  0.93  0.93  0.83  0.74  0.79  0.36  0.26  0.31  0.05  0.16
0.50  -3.00  0.97  0.91  0.92  0.84  0.73  0.78  0.39  0.24  0.32  0.04  0.17
0.25  -3.00  0.96  0.93  0.92  0.82  0.72  0.77  0.33  0.23  0.28  0.06  0.17
0.35  -2.50  0.97  0.90  0.92  0.84  0.68  0.76  0.37  0.20  0.29  0.04  0.22
0.40  -2.50  0.97  0.90  0.91  0.84  0.68  0.76  0.38  0.20  0.29  0.04  0.22
0.45  -2.50  0.97  0.89  0.91  0.84  0.68  0.76  0.38  0.20  0.29  0.04  0.22
0.50  -2.50  0.97  0.88  0.90  0.84  0.67  0.76  0.38  0.19  0.29  0.04  0.23
0.30  -2.50  0.97  0.91  0.92  0.83  0.67  0.75  0.35  0.19  0.27  0.04  0.23
0.25  -2.50  0.96  0.91  0.91  0.82  0.65  0.73  0.32  0.17  0.25  0.04  0.25

New optimized parameter file named: A.dorsata2.geneid.optimized.param

Performance of new optimized parameter file on test set:

SN    SP    CC    SNe   SPe   SNSP  SNg   SPg   SNSPg raME  raWE
0.96  0.95  0.94  0.81  0.82  0.82  0.43  0.40  0.42  0.07  0.07
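
The "best to worst" ordering of the oWF/eWF results can be reproduced approximately with a short script. The sketch below is illustrative only. It assumes that rows are ranked by SNSP and then by SNSPg, which is inferred from the order of the printed results and is not confirmed to be the criterion geneid's training pipeline actually uses. The `Row` type and `rank` function are hypothetical names, and only five of the thirty rows are copied in.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    """One oWF/eWF combination with the two columns used for ranking."""
    oWF: float
    eWF: float
    SNSP: float
    SNSPg: float


# Five rows copied from the sorted results (oWF = 0.40), deliberately shuffled.
ROWS = [
    Row(0.40, -2.50, 0.76, 0.29),
    Row(0.40, -3.50, 0.81, 0.36),
    Row(0.40, -4.00, 0.82, 0.37),
    Row(0.40, -3.00, 0.79, 0.33),
    Row(0.40, -4.50, 0.82, 0.36),
]


def rank(rows: List[Row]) -> List[Row]:
    """Sort best-first.

    ASSUMPTION: higher SNSP wins, ties broken by higher SNSPg. This key is
    inferred from the log's ordering, not taken from geneid itself.
    """
    return sorted(rows, key=lambda r: (r.SNSP, r.SNSPg), reverse=True)


best = rank(ROWS)[0]
print(f"best oWF={best.oWF:.2f} eWF={best.eWF:.2f}")
```

On these five rows the sketch selects oWF 0.40 and eWF -4.00, matching the best parameter combination reported in the log. Because the printed values are rounded to two decimals, ties among the full thirty rows may not resolve in exactly the order shown in the log.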