Genome BioInformatics Research Lab

 
gff2aplot
 
   


PROGRAM DESCRIPTION


gff2aplot
Pair-wise alignment-plots for genomic sequences in PostScript.


Our basic goal is to provide an easy-to-use filter that produces plots useful for genomic analysis, at a quality high enough to include them in your documents without losing resolution. We originally defined a simple GFF-derived format to describe pair-wise alignments, but the program can now also read alignments in standard GFF.

"General Feature Format" (GFF) is described on the Sanger Centre GFF definition page. The extended GFF-like aplot format is described in the "gff2aplot User's Manual".

We would appreciate it if you cited the gff2aplot paper and/or the URL, as follows:

Abril, J.F., Guigó, R. and Wiehe, T.
"gff2aplot: Plotting sequence comparisons."
Bioinformatics, 19(18):2477-2479 (2003).
[Bioinformatics Abstract] [PubMed Abstract]

URL:  http://genome.imim.es/software/gfftools/GFF2APLOT.html

...Thanks in advance for your collaboration.


EXAMPLES

 
SNAPSHOTS   
Following this link you can get some snapshots of gff2aplot output.

The references below show several examples of the flexibility of gff2aplot; each entry gives the figure numbers where the program was used. Figures published before 2004 were made with the old GNUawk/Bash implementation.

  • The gff2aplot URL was cited in the following publications:

    • R. Guigó and T. Wiehe.
      "Gene Prediction Accuracy in Large DNA Sequences."
      Fig 1. In M.Y. Galperin and E.V. Koonin, editors:
          Frontiers in Computational Genomics. Chapter 1, Pp:1-33 (Functional Genomics Series, Volume 3).
          Caister Academic Press, United Kingdom, 2003. [Table of Contents]

    • T. Wiehe et al.
      "SGP-1: prediction and validation of homologous genes based on sequence alignments."
      Figs 2 and 5, Genome Research, 11(9):1574-1583, 2001. [Abstract]

    • K. Reichwald et al.
      "Comparative sequence analysis of the MECP2-locus in human and mouse reveals new transcribed regions."
      Fig 2, Mammalian Genome, 11(3):182-190, 2000. [Abstract]

    • T. Thomson et al.
      "Fusion of the human gene for the polyubiquitination co-effector UEV-1 with kua, a newly identified gene."
      Fig 2B, Genome Research, 10(11):1743-1756, 2000. [Abstract]

  • gff2aplot was used, but not cited (sigh!), in the following publications:

    • G. Parra et al.
      "Comparative Gene Prediction in Human and Mouse."
      Fig 1, Genome Research, 13(1):108-117, 2003. [Abstract]

Do you think we have missed a citation? Do not hesitate to email the corresponding citation to the authors and we will include it here.

HOWTOs


In this section you can find useful tutorials on how to use gff2aplot. It will be regularly updated with new documents. Some knowledge of the Unix command line is required, more specifically of the bash shell.
Most of the snapshot figures in those HTML pages are in PNG format. Our apologies if your browser is not able to handle PNGs. In any case, you will find a link to a PostScript and/or PDF file for the main figures.

  • First contact:
    A general introduction to gff2aplot command-line switches, customization files and GFF input files. It starts by describing how to run the program on a Unix shell with a few command-line switches. It then introduces the customization files and the basics of customizing the program's output, and shows examples of GFF input records. Finally, an example plot is generated in three steps.

  • Plotting WU-BLAST alignments:
    This tutorial shows examples of parseblast output when applied to a WU-BLAST file. parseblast.pl can generate three basic alignment formats from a BLAST file, and all three must produce the same plots in gff2aplot. We will also see the raw output from gff2aplot and how to customize it a little.

  • SIM alignment for a multigenic syntenic region:
    We will visualize the output produced by SIM on the genomic MHC syntenic region between human and mouse. We will use ali2gff, a filter written in C, to convert its output into GFF. Once we have generated the whole genomic region figure, we will zoom into a single gene annotation. This will illustrate how you can expand a given area from your pictures, to highlight single features on multigenic annotation plots for instance.

  • Splice sites comparative analysis:
    We apply colors to different alignment features to visualize relationships among splice sites and the current gene annotation for an orthologous sequence pair between human and mouse. This tutorial also shows how to apply score-dependent color gradients to alignment features. Projecting annotation features into the alignment panels can help to tell the real splice junctions from the rest. The current version of gff2aplot can emulate color blending to enhance the visualization of those projections.

  • Understanding gff2aplot layers:
    Each set of features drawn by gff2aplot is put into one of the pre-defined layers, so it helps to know which layers exist and what can be shown on each of them. This tutorial also shows how gff2aplot handles customization parameters given on the command line and in customization files.

  • Visualizing PostScript output from GFFtools:
    Although this tutorial was written from examples made only with gff2ps, it equally applies to gff2aplot output. As both programs produce PostScript plots, in this howto we will try to provide some help on handling that PostScript output and converting to other formats (including bitmaps and PDF).
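
As a rough illustration of the PostScript handling mentioned in the last tutorial, the sketch below converts a plot to PDF and to a PNG bitmap with Ghostscript. It is not taken from the tutorial itself: it assumes Ghostscript (the gs and ps2pdf commands) is installed, and the function names out_name, ps_to_pdf and ps_to_png are hypothetical helpers introduced only for this example.

```shell
#!/bin/sh
# Sketch only: assumes Ghostscript is installed; file names are placeholders.

# Derive an output name by swapping a trailing ".ps" for a new extension.
out_name() {
    # $1 = PostScript file, $2 = new extension (without the dot)
    printf '%s.%s\n' "${1%.ps}" "$2"
}

# Convert a PostScript plot to PDF with Ghostscript's ps2pdf wrapper.
ps_to_pdf() {
    command -v ps2pdf >/dev/null 2>&1 || { echo "ps2pdf not found" >&2; return 127; }
    ps2pdf "$1" "$(out_name "$1" pdf)"
}

# Render a PostScript plot to a 24-bit PNG bitmap at a given resolution.
ps_to_png() {
    # $1 = PostScript file, $2 = resolution in dpi (default 150)
    command -v gs >/dev/null 2>&1 || { echo "gs not found" >&2; return 127; }
    gs -q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r"${2:-150}" \
       -sOutputFile="$(out_name "$1" png)" "$1"
}
```

With these definitions, "ps_to_pdf plot.ps" would write plot.pdf next to the input, and "ps_to_png plot.ps 300" would write plot.png at 300 dpi. For poster-size pages a lower resolution may be needed to keep the bitmap manageable.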

You are welcome to provide more examples of how you used gff2aplot in your projects, by sending your report files or a link to your own HTML report/howto. Your experience will be valuable for other users, especially newer ones. Send an email to the authors and we will try to include your contribution here as soon as possible.

NEWS

  12 Dec 2003  v2.0  The Applications Note for gff2aplot has been published in Bioinformatics (it was submitted on March 4, 2003; revised on June 11, 2003 and finally accepted on June 20, 2003). See the program description section for a complete reference.
 
  02 Jun 2003  v2.0  Finally, gff2aplot has been fully re-implemented in Perl. Parsing, record sorting and output throughput are much faster. We also expect the new program to be more portable than the previous versions.
The customization process is now more flexible, including gff2ps-like customization files (regular expressions, modularity of settings and multiple files allowed) and the ability to set all the variables via command-line switches.
The output PostScript code has also been improved, making the plots more compact and less prone to interpreter errors.
The old GNUawk/Bash version is no longer maintained, although it is still available.
 
  26 Jan 2000  v1.9  After several bug fixes and unpublished versions, here is the new version of gff2aplot.
This version can work with "partial" custom files (without having to define all the variables) and can also read alignments in standard GFF format.
 
  24 Apr 1999  v1.3  Defaults are now also defined within the main gawk script, so the custom file is no longer mandatory.
  22 Mar 1999  v1.2  Zoom option was defined. Data vectors can be passed to plot functions on the third panel.  
  02 Mar 1999  v1.0  First running version for gff2aplot.
Annotation on axes is defined in GFF-format.
Alignment definition is written in pseudo GFF (APLOT format).
 

DOWNLOADING


Download the latest version of gff2aplot (v2.0) from our ftp server. The latest version of the "gff2aplot User's Manual" is also available via ftp, although it is included in the program's tarball. This manual is not finished yet, but you can still learn how to use gff2aplot from the examples described in the tutorials section. From version 2.0 on we are only developing the Perl implementation. The old GNUawk/Bash version is no longer maintained, but you can still get it from this link to gff2aplot version 1.9.

The download is a gzipped tarball containing the Perl version of gff2aplot, all the supplementary scripts (i.e. WU-BLAST/NCBI-BLAST/SIM/BLAT filters), a README text file and a few examples of what can be done with this tool. You can extract them with:

     gunzip -c gff2aplot-vX_xx.tar.gz | tar xvf -

On a Linux box you can try with:

     tar zxvf gff2aplot-vX_xx.tar.gz

A gff2aplot-vX_xx directory will be created. Move into that directory and take a look at the README and INSTALL files. Once you have read them, just type:

     make

and then:

     make install

By default, the last command will move all the scripts to /usr/local/bin. If you want to place the executables in another directory, just define the new installation path as follows:

     make INSTALLDIR=/your/path/bin install

Alternatively, modify that variable in the Makefile according to your needs.
Sorry, but there is no "make test" yet...
At this point you are ready to run gff2aplot.pl and the related code; see the tutorials section for examples on how to use it.
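
If you installed into a non-default directory, the shell will only find gff2aplot.pl when that directory is on your PATH. The following is a minimal sketch of such a check, not part of the distribution: the on_path helper is hypothetical, and INSTALLDIR here is an ordinary shell variable that you set to the same value you passed to make.

```shell
#!/bin/sh
# Sketch: check whether the directory passed to "make INSTALLDIR=... install"
# is on PATH. /usr/local/bin is the default stated above.

on_path() {
    # $1 = directory to look for; succeeds if it is a PATH component
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

INSTALLDIR=${INSTALLDIR:-/usr/local/bin}
if on_path "$INSTALLDIR"; then
    echo "$INSTALLDIR is on PATH"
else
    echo "add it with: PATH=\"$INSTALLDIR:\$PATH\"; export PATH"
fi
```

Wrapping PATH in colons before matching avoids false hits on directories that merely share a prefix with the one you are looking for.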

REPORTING BUGS


If you find a bug, or something is not plotted properly, you can send a bug report. To help us find what is wrong, attach to that e-mail a tarball containing the custom file you were using when the bug occurred, an example of your input GFF files, the PostScript file generated and a report file that you can get with the "-V" command-line option (type "gff2aplot -h" for further info on that verbose mode switch). We will try to answer as soon as possible.
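
One possible way to assemble that tarball is sketched below. The make_bug_bundle function is a hypothetical helper, and every file name in the usage note is a placeholder; in particular, how you capture the "-V" report into a file is not shown here and depends on how the program emits it.

```shell
#!/bin/sh
# Sketch: bundle the files requested for a bug report into one gzipped tarball.
# All file names passed to this function are placeholders chosen by the user.

make_bug_bundle() {
    # $1 = tarball to create; remaining args = files to include
    bundle=$1
    shift
    # Refuse to build an incomplete bundle if any listed file is unreadable.
    for f in "$@"; do
        [ -r "$f" ] || { echo "missing file: $f" >&2; return 1; }
    done
    # Same portable tar/gzip pairing as the extraction command shown earlier.
    tar cf - "$@" | gzip -c > "$bundle"
}
```

For instance, "make_bug_bundle gff2aplot-bug.tar.gz my_custom.rc input.gff plot.ps report.txt" would pack the four files, assuming they exist in the current directory.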

FEATURE LIST


The following list covers many features of gff2aplot:

  • The most important new feature is that gff2aplot has been fully re-written in Perl and is now faster and more robust. This link leads to a report in which we describe a few aspects of the new implementation: GFF2APLOT: Nuts and Bolts (3.3Mbytes gzipped PDF file).
  • Comprehensive alignment plots for any GFF-feature. Attributes are defined separately, so you can modify only the attributes you need for a given file or share the same customization across different data-sets.
  • All parameters are set by default within the program, but it can also be fully configured via gff2ps-like flexible customization files. The program can handle several such files, merging all the settings before producing the corresponding figure. Moreover, all customization parameters can be set via command-line switches, which allows users to play with those parameters before adding any to a customization file.
  • Source order is taken from the input files: if you swap the file order you can visualize the alignment and its annotation with the new input arrangement.
  • All alignment scores can be visualized in a PiP box below the gff2aplot area, using a grey-color scale, a user-defined color scale or score-dependent gradients.
  • Scalable fonts, which can also be chosen among the basic PostScript default fonts. Feature and group labels can be rotated to improve readability in both annotation axes.
  • The program is still defined as a Unix filter so it can handle data from files, redirections and pipes, writing output to standard-output and warnings to standard error.
  • gff2aplot is able to manage many physical page formats (from A0 to A10, and more -see available page sizes in its manual-), including user-defined ones. This allows, for instance, the generation of poster-size genomic maps, or the use of a plotting device that supports continuous paper, either in portrait or landscape.
  • You can draw different alignments on the same alignment plot and distinguish them by using a different color for each.
  • Shape dictionary has been expanded, so that further feature shapes are now available (see manual).
  • Annotation projections through alignment plots (so called "ribbons") emulate transparencies via complementary color fill patterns. This makes it possible to show color pseudo-blending when horizontal and vertical "ribbons" overlap.
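
The Unix-filter behaviour listed above means the plot and the warnings arrive on separate streams, so they can be captured separately. The sketch below shows that pattern with a hypothetical run_filter helper; the commented invocations assume gff2aplot.pl is on your PATH, use placeholder file names, and should be checked against "gff2aplot -h", since this page does not spell out the exact calling convention.

```shell
#!/bin/sh
# Sketch: run a Unix filter, keeping its standard output (the PostScript plot)
# and its standard error (the warnings) in separate files.

run_filter() {
    # $1 = output file, $2 = warnings log, remaining args = filter command line
    out=$1
    log=$2
    shift 2
    "$@" > "$out" 2> "$log"
}

# Intended use (assumptions: gff2aplot.pl is installed and on PATH,
# alignment.gff is a placeholder input file):
#   run_filter plot.ps warnings.log gff2aplot.pl alignment.gff
#   cat alignment.gff | run_filter plot.ps warnings.log gff2aplot.pl
```

Keeping the warnings out of the plot file matters here because any stray text mixed into the PostScript stream would likely break the interpreter that renders it.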

WISH LIST


Although we have implemented many features, a few ideas are not yet ready for the current version of gff2aplot.
Here is a short list:

  • Finish the "gff2aplot User's Manual", including a few examples from the HTML tutorials too.
  • An extra panel at the page bottom where you can display any feature or function related to the X-axis sequence. The old version is still available, though still undocumented, if you need that feature now.
  • Make this box as flexible as a gff2ps block or allow embedding gff2ps figures.
  • Splitting large page formats (poster-like) into horizontal and/or vertical multiple page-sheets (in a smaller paper size).
  • Display alignment strand and frame.
  • Drawing functions for vector-data to visualize functions, spikes or bar-charts.
  • "Splicing" feature to join elements within a group.

We are open to any helpful suggestion for improving our programs. Do not hesitate to get in touch with us.

AUTHORS

Josep Francesc ABRIL FERRANDO
Roderic GUIGÓ SERRA
Thomas WIEHE

Copyright © 1999 - 2003

gff2aplot is distributed under the GNU General Public License.

 