Practice on Orthology prediction

1) Copy and paste the following sequence into the most appropriate NCBI BLAST search.


2) Explore the score, identity, E-value, and alignment of the best-scoring hit. What is the probable source (species) of this sequence? Why do you get so many similar hits in the same species?

Explore the "taxonomy reports" and "distance tree" tabs.

3) Repeat the search, restricting it to Drosophila melanogaster [under Choose Search Set, Organism]. Compare the relevant values and the alignment with the result above.

4) Take the best Drosophila hit obtained above and use it as the query for a reverse BLAST search against human sequences. Is it a best bidirectional hit (BBH) of our human protein? What would you conclude about their orthology or paralogy relationship?

You can repeat the same procedure for mouse, chicken, Ciona intestinalis, and Drosophila if you wish. This is quite tedious, so stop once you think you have got the idea.
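
The BBH test can be sketched in a few lines of Python. This is only an illustration, assuming the forward and reverse BLAST results have already been reduced to (query, subject, bit score) tuples. The sequence names and scores below are hypothetical toy values, not real BLAST output.

```python
def best_hits(rows):
    """Map each query to its highest-scoring subject.

    rows: iterable of (query, subject, bit_score) tuples.
    """
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def is_bbh(a, b, forward, reverse):
    """True if b is the best hit of a AND a is the best hit of b."""
    return forward.get(a) == b and reverse.get(b) == a


# Hypothetical toy hits, invented for illustration only.
human_vs_fly = [
    ("human_A", "fly_X", 250.0),
    ("human_A", "fly_Y", 90.0),
    ("human_B", "fly_X", 180.0),
]
fly_vs_human = [
    ("fly_X", "human_A", 245.0),
    ("fly_X", "human_B", 175.0),
    ("fly_Y", "human_A", 88.0),
]

forward = best_hits(human_vs_fly)
reverse = best_hits(fly_vs_human)

print(is_bbh("human_A", "fly_X", forward, reverse))
print(is_bbh("human_B", "fly_X", forward, reverse))
```

In this toy data human_B also finds fly_X as its best hit, but fly_X points back to human_A, so only the first pair passes the test. This is the situation to look out for when several similar hits come from the same species.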

5) Now we will explore the orthology relationships all at once. You could have collected all the BLAST results from the searches above and put them into a multiple-sequence file in FASTA format.

You can find this here
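
If you prefer to assemble such a multiple-sequence FASTA file yourself, a minimal sketch could look like the following. The record names and sequences are hypothetical placeholders, and the 60-character line width is just a common convention, not a requirement.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as multi-FASTA text."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Wrap the sequence into fixed-width lines.
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


# Hypothetical placeholder records, not real proteins.
records = [
    ("human_hit", "ACDEFGHIKL" * 7),
    ("fly_hit", "MNPQRSTVWY" * 3),
]

fasta_text = to_fasta(records)
print(fasta_text)
```

The resulting text can be written to a file and given to the alignment or tree-building program of your choice.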

You can build a tree using this resource here; choosing the "one click" mode will make your life easy. Alternatively, you may want to build the tree on your own console using the program of your choice.

Stare at the tree until you figure out where the duplications are and which sequences are orthologous or paralogous to each other, according to Fitch's original definition.
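
Once you have done this by eye, the same reasoning can be sketched in code. Under Fitch's definition, two genes are orthologs if they diverged at a speciation node and paralogs if they diverged at a duplication node. The sketch below is not part of the exercise; it assumes a rooted binary tree written as nested tuples, leaves named "Species|gene" (a made-up convention), and it uses the species-overlap heuristic to label nodes: a node is called a duplication when its two child subtrees share at least one species. The toy tree is hypothetical.

```python
def leaves(node):
    """Return all leaf names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def species(leaf):
    """Species part of a 'Species|gene' leaf name (assumed convention)."""
    return leaf.split("|")[0]


def classify_pairs(node, result=None):
    """Label every pair of leaves by the event at their last common node."""
    if result is None:
        result = {}
    if isinstance(node, str):
        return result
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    shared = {species(x) for x in left_leaves} & {species(x) for x in right_leaves}
    # Species overlap between the two children suggests a duplication.
    relation = "paralogs" if shared else "orthologs"
    for a in left_leaves:
        for b in right_leaves:
            result[frozenset((a, b))] = relation
    classify_pairs(left, result)
    classify_pairs(right, result)
    return result


# Hypothetical toy tree: a duplication after the split from fly.
tree = ((("Human|g1", "Mouse|g1"), ("Human|g2", "Mouse|g2")), "Fly|g")

pairs = classify_pairs(tree)
for pair, relation in sorted(pairs.items(), key=lambda kv: sorted(kv[0])):
    print(" vs ".join(sorted(pair)), "->", relation)
```

In the toy tree the single fly gene is orthologous to all four vertebrate genes, while Human|g1 and Mouse|g2 are paralogs because they trace back to the duplication node.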

Great, Walter Fitch would be proud of you! (or not)

6) Now we will explore the orthology relationships of human TP53 in mouse, chicken, Ciona intestinalis, and Drosophila in several online databases. You can search for TP53 by name or by BLAST search in the different databases. To help you compare results, you can fill in a table with the type of relationship with the human gene (one-to-one, etc.) in each of the species.

Let us try the following databases and explore the different structures and types of information they provide:

  • InParanoid: an extension of BBH to include in-paralogs
  • eggNOG: a clustering method to derive orthologous groups
  • Ensembl: a clustering + phylogeny-based method
  • PhylomeDB: a phylogeny-based, gene-centric method

    Database      Mouse      Ciona intestinalis      Drosophila melanogaster
    InParanoid
    eggNOG
    Ensembl
    PhylomeDB
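
To fill in the table consistently, it may help to derive the relationship type from simple counts. The helper below is a hypothetical sketch: it assumes you have counted how many human genes and how many genes of the other species a database places in the same orthologous group.

```python
def relationship_type(n_human, n_other):
    """Name the orthology relationship from the number of co-orthologs.

    n_human: human genes in the group; n_other: genes in the other species.
    """
    if n_human < 1 or n_other < 1:
        return "no ortholog"
    left = "one" if n_human == 1 else "many"
    right = "one" if n_other == 1 else "many"
    return f"{left}-to-{right}"


print(relationship_type(1, 1))
print(relationship_type(1, 3))
print(relationship_type(2, 2))
```

The counts here are made up; the values you enter in the table should come from what each database actually reports.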

7) Have a look at the "Quest for Orthologs" initiative and explore some of the many other databases available.

8) Now you are ready to do the orthology exercises using Python.