Massive Sequence Comparisons as a Help in Annotating Genomic Sequences

Ollivier, Emmanuelle; Risler, Jean-Loup; Louis, Alexandra; Aude, Jean-Christophe

Published in

Cold Spring Harbor Laboratory Press, Genome Research, 11(7), p. 1296-1303, 2001

DOI: 10.1101/gr.177601

DOI: 10.1101/gr.gr-1776r

Journal article published in 2001 by Emmanuelle Ollivier, Jean-Loup Risler, Alexandra Louis, Jean-Christophe Aude

This paper is made freely available by the publisher.

Abstract

All publicly available protein sequences from plants have been compared all-against-all and then clustered. Within each of the 1064 resulting clusters, which contain orthologous as well as paralogous sequences, the sequences have been submitted to a pyramidal classification and their domains delineated by an automated procedure in the manner of PRODOM. This process makes it easy to check a cluster for apparent inconsistencies, for example a sequence that is shorter or longer than the others, or a missing domain. In such cases, aligning the DNA sequence of the gene with the sequence of a closely homologous protein often reveals probable sequencing errors (leading to frameshifts) or probable wrong intron/exon predictions; this was the case in 10% of the clusters. The composition of the clusters, their pyramidal classifications, and domain decomposition, as well as our comments when appropriate, are available from http://chlora.infobiogen.fr:1234/PHYTOPROT.
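
The cluster-then-check idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give the clustering method or any thresholds, so single-linkage grouping of pairwise hits, the 30% length tolerance, and the function names `single_linkage_clusters` and `length_outliers` are all assumptions made for illustration, and the sequence identifiers and lengths are invented toy data.

```python
from collections import defaultdict
from statistics import median


def single_linkage_clusters(ids, hits):
    """Group ids into connected components of the pairwise-hit graph.

    Assumption: single linkage stands in for the paper's clustering step,
    which the abstract does not specify.
    """
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for i in ids:
        groups[find(i)].append(i)
    return sorted(sorted(members) for members in groups.values())


def length_outliers(cluster, lengths, tolerance=0.3):
    """Return members whose length differs from the cluster median
    by more than `tolerance` times the median (hypothetical threshold)."""
    med = median(lengths[s] for s in cluster)
    return [s for s in cluster if abs(lengths[s] - med) > tolerance * med]


# Toy data, invented for illustration only.
lengths = {"P1": 300, "P2": 310, "P3": 150, "P4": 305, "Q1": 500, "Q2": 495}
hits = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("Q1", "Q2")]

clusters = single_linkage_clusters(list(lengths), hits)
flagged = {tuple(c): length_outliers(c, lengths) for c in clusters}
```

In this toy input, `P3` is about half as long as the other members of its cluster, so it is the kind of sequence that would be flagged for a closer look, for instance by aligning its gene's DNA against a close homolog as the abstract describes.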