Oxford University Press, Molecular Biology and Evolution, 1(32), p. 275-286, 2014
Our understanding of genome-wide and comparative sequence information has been broadened considerably by the databases available from the University of California Santa Cruz (UCSC) Genome Bioinformatics Department. In particular, the identification and visualization of genomic sequences that are present in some species but absent in others have led to fundamental insights into gene and genome evolution. However, the UCSC tools currently allow orthologous genomic loci to be visualized across a range of species for only a single locus at a time. For large-scale comparative analyses of such presence/absence patterns, a multi-locus view would be preferable. Such a tool would make it possible to compare thousands of relevant loci simultaneously and to address many different questions about, e.g., phylogeny and specific aspects of genome and gene evolution, such as the gain or loss of exons and introns and the emergence of novel transposed elements, non-protein-coding RNAs, and viral genomic particles. Here, we present the first tool to facilitate the parallel analysis of thousands of genomic loci for cross-species presence/absence patterns based on multi-way genome alignments. This Genome Presence/Absence Compiler (GPAC) takes annotated or other compilations of genomic coordinates and compiles all presence/absence patterns in a flexible, color-coded table linked to the individual UCSC Genome Browser alignments. We provide examples of the versatile information content of such a screening system, especially for 7SL-derived transposed elements, nuclear mitochondrial DNA, DNA transposons and miRNAs in primates (http://www.bioinformatics.uni-muenster.de/tools/gpac
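To make the idea of a multi-locus presence/absence table concrete, the following is a minimal sketch in Python. It is not GPAC's implementation: the function names, the gap-fraction criterion, the 0.8 and 0.2 thresholds, and the toy alignment rows are all assumptions chosen for illustration. It only shows how per-locus alignment rows for several species might be reduced to one "+", "-", or "?" call per species.

```python
from typing import Dict, List


def aligned_fraction(row: str) -> float:
    """Fraction of alignment columns in a row that are not gap characters."""
    if not row:
        return 0.0
    return sum(ch != "-" for ch in row) / len(row)


def classify(fraction: float, present_min: float = 0.8,
             absent_max: float = 0.2) -> str:
    """Map an aligned fraction to a call; both thresholds are assumptions."""
    if fraction >= present_min:
        return "+"   # sequence largely present
    if fraction <= absent_max:
        return "-"   # sequence largely absent
    return "?"       # partial alignment, left unresolved


def build_table(alignments: Dict[str, Dict[str, str]],
                species: List[str]) -> Dict[str, List[str]]:
    """Return one row of calls per locus, ordered like `species`.

    A species with no alignment row for a locus is reported as "?"
    (no data) in this sketch.
    """
    table = {}
    for locus, rows in alignments.items():
        table[locus] = [
            classify(aligned_fraction(rows[sp])) if sp in rows else "?"
            for sp in species
        ]
    return table


# Toy data: hypothetical loci and alignment rows, not real genome coordinates.
example = {
    "locus_A": {"human": "ACGTACGTAC", "chimp": "ACGTACGTAC",
                "macaque": "----------"},
    "locus_B": {"human": "ACGTACGTAC", "chimp": "ACG----TAC"},
}
species_order = ["human", "chimp", "macaque"]
table = build_table(example, species_order)

for locus, calls in table.items():
    print(locus, " ".join(calls))
```

With thousands of loci, each row of such a table could then be color-coded and linked to the corresponding browser alignment, which is the kind of overview the abstract describes.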