Comparative Analysis of Virus Sequences

More than 600 complete virus genomes are currently available in the public sequence repository GenBank. Sequence comparison methods allow us to identify regions that have been conserved among proteins that share a common ancestor (homologues). These conserved regions generally represent functionally important domains. However, homology relationships between proteins are not explicitly mapped in primary databases such as GenBank, so more specialised secondary databases are required. We have developed a database that organises virus protein sequences into homologous protein families, automatically identifies conserved sequence domains in the proteins, and uses a consistent system to classify virus functions and taxonomic levels.
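To make the idea of a conserved region concrete, the following minimal Python sketch scans a toy multiple alignment of one homologous protein family and reports runs of columns in which most sequences share the same residue. It is an illustration only, not the procedure used to build the database: the function name `conserved_regions`, the thresholds `min_identity` and `min_length`, and the aligned sequences are all assumptions made for this example.

```python
from collections import Counter


def conserved_regions(alignment, min_identity=0.8, min_length=3):
    """Return (start, end) column ranges, 0-based and end-exclusive, in which
    every column has a most-common residue shared by at least `min_identity`
    of the sequences. Gap characters ('-') never count as a shared residue."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    regions = []
    start = None  # first column of the conserved run currently being extended
    for col in range(width):
        residues = Counter(seq[col] for seq in alignment if seq[col] != "-")
        top = residues.most_common(1)[0][1] if residues else 0
        is_conserved = top / n_seqs >= min_identity
        if is_conserved and start is None:
            start = col
        elif not is_conserved and start is not None:
            if col - start >= min_length:
                regions.append((start, col))
            start = None
    # Close a run that extends to the last column of the alignment.
    if start is not None and width - start >= min_length:
        regions.append((start, width))
    return regions


# Hypothetical aligned fragments from three members of one protein family.
toy_alignment = [
    "MKTAYIAK--LLG",
    "MKTAYLSRQELLG",
    "MKTAYVGKPDLLG",
]
print(conserved_regions(toy_alignment))  # [(0, 5), (10, 13)]
```

Counting gaps as mismatches keeps poorly aligned stretches out of the reported regions. A real pipeline would typically score columns with a substitution matrix rather than strict identity.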

The Virus Database (VIDA) also includes links to other databases such as PDB, CATH and Swiss-Prot. VIDA currently covers all available sequences from the herpesviruses, poxviruses, arteriviruses, coronaviruses and papillomaviruses. The homologous protein families have also been used to reconstruct herpesvirus phylogeny on the basis of gene content and to annotate clusters of genes derived from array-based mRNA quantitation experiments.
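As a rough illustration of how gene content can feed a phylogeny, the sketch below converts the presence or absence of protein families in each genome into pairwise distances that a standard tree-building method could then use. The Jaccard distance is one common choice and is an assumption here, not necessarily the measure used in the work described. The genome names, family identifiers and function names are hypothetical.

```python
from itertools import combinations


def jaccard_distance(families_a, families_b):
    """1 minus the fraction of protein families shared by two genomes."""
    union = families_a | families_b
    if not union:
        return 0.0
    return 1.0 - len(families_a & families_b) / len(union)


def gene_content_distances(family_content):
    """Map each unordered genome pair (sorted by name) to its distance."""
    return {
        (g1, g2): jaccard_distance(family_content[g1], family_content[g2])
        for g1, g2 in combinations(sorted(family_content), 2)
    }


# Hypothetical genomes, each described by the set of protein families it encodes.
toy_content = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam3"},
    "genome_C": {"fam1", "fam5"},
}

distances = gene_content_distances(toy_content)
for pair, dist in distances.items():
    print(pair, round(dist, 2))
```

In this toy data genome_A and genome_B share three of four families and therefore end up closest, which is the kind of signal a distance-based tree would pick up.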

