A Course on Sequence Analysis. Practical
Gene Identification in
Genomic DNA sequences
by Roderic Guigó 7/6/97
In this practical we will run a gene prediction program on a
genomic DNA sequence, and we will analyze the predicted proteins,
if any, through the GeneQuiz system.
WWW DB Tools
We will use:
- SRS (at EBI or EMBL) to extract a genomic sequence
- ftp to the Sanger Center (http://www.sanger.ac.uk/) to extract a genomic sequence
- GeneMark (at the EBI, http://www2.ebi.ac.uk/genemark/) to locate potential genes in the sequences
- MetaGene (at the Rat Genome Database, http://rgd.mcw.edu/METAGENE/) to locate potential genes in the sequences
- GeneQuiz (at the EBI, http://columba.ebi.ac.uk:8765/ext-genequiz/) to analyze the predicted genes
Step 1. Extract a genomic DNA sequence from a Genome Center
- Go to the Sanger Center Home Page (at http://www.sanger.ac.uk/)
- Follow the links Human Genome Project, Chr. 6, Finished genomic sequence by ftp
- Get sequence cICB2046
- In Netscape, save the sequence as text to a file, e.g. humgenomic.fa
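
Before submitting the sequence to the prediction servers, it can be worth checking that the saved file looks as expected. The following is a minimal Python sketch, assuming the file is in FASTA format (as the .fa extension suggests). The function names read_fasta and summarize are made up for illustration, and the demo record is invented, not real data.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(seq):
    """Return length, G+C fraction and any symbols other than A, C, G, T, N."""
    counts = Counter(seq)
    length = len(seq)
    gc = (counts["G"] + counts["C"]) / length if length else 0.0
    unexpected = sorted(set(seq) - set("ACGTN"))
    return {"length": length, "gc_fraction": gc, "unexpected": unexpected}


# Invented demo record; for the real file one would use
# open("humgenomic.fa").read() instead of this string.
demo = ">demo made-up sequence\nACGTACGTNN\nggcc\n"
for header, seq in read_fasta(demo):
    print(header, summarize(seq))
```

A non-empty "unexpected" list usually means that something other than plain sequence text (for instance HTML markup) was saved into the file.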
Step 2. Run GeneMark to find potential genes
- Load GeneMark (http://www2.ebi.ac.uk/genemark/)
- Select hum_49 at Species Matrix Selection
- Select protein_translations at Selecting ORF's, Selecting Regions, and Selecting Exons.
- Upload the Genomic Sequence
- Run GeneMark
- Click on Region output
- Cut and paste all translated regions into a single long amino
acid sequence, and save it into a file
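
The cut-and-paste step can also be done with a few lines of code. This is a rough Python sketch, assuming the translated regions have already been copied by hand into a list of strings. The exact layout of the GeneMark output is not assumed here, and the names join_regions and to_fasta, the record name "predicted" and the example fragments are all hypothetical.

```python
def join_regions(regions):
    """Concatenate peptide fragments into one sequence.

    Anything that is not a letter (spaces, line breaks, digits,
    '*' stop marks) is dropped, and residues are upper-cased.
    """
    cleaned = []
    for region in regions:
        residues = "".join(ch for ch in region.upper() if ch.isalpha())
        cleaned.append(residues)
    return "".join(cleaned)


def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record with lines of at most `width` residues."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Invented fragments standing in for the pasted translated regions.
fragments = ["MKT LLV\nAAG*", "  qrs tvw 10"]
protein = join_regions(fragments)
print(to_fasta("predicted", protein, width=10))
```

The resulting text can then be written to a file of your choice, for example with open("predicted.fa", "w").write(to_fasta("predicted", protein)), where predicted.fa is only a placeholder name.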
Step 3. Run MetaGene to find potential genes
- Load MetaGene (http://rgd.mcw.edu/METAGENE/)
- Fill in the sequence name and your e-mail address
- Upload the Genomic Sequence
- Submit the query
- Wait for a couple of minutes, monitoring the search process
- Current Search Results
- Annotate the results
- Tools: Statistics summary
- Tools: Analysis
ANALYZE THE RESULTS
Step 4. BLAST the predicted amino acid sequence
- Load a BLAST2 query submission page (at EBI or EMBL).
- Upload the predicted amino acid sequence.
- Check the number of top hits and alignments to be shown: set to 100 or so.
- Start the BLAST search: this should take a few minutes at most.
- Save the output to a new file, so that you do not lose it.
- Examine output, and investigate the detected entries by using the SRS links.
Step 5. Analyze the predicted amino acid sequence through
GeneQuiz
- Load GeneQuiz at the EBI (http://columba.ebi.ac.uk:8765/ext-genequiz/)
- Click on GQserve
- Fill in your e-mail address and a sequence name
- Copy and paste one of the predicted ORFs
- Submit the sequence