Solving the Problem: Genome Annotation Standards before the Data Deluge

Klimke, William; O'Donovan, Claire; White, Owen; Brister, J. Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D.; Tatusova, Tatiana

Published in

BioMed Central, Standards in Genomic Sciences, 5(1), p. 168-193, 2011

DOI: 10.4056/sigs.2084864

This paper is made freely available by the publisher.

Abstract

The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high-quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI, in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high-quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is a historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries.
