Differences in sequencing technologies improve the retrieval of anammox bacterial genome from metagenomes

Gori, Fabio; Tringe, Susannah Green; Folino, Gianluigi; van Hijum, Sacha A. F. T.; Op den Camp, Huub J. M.; Jetten, Mike S. M.; Marchiori, Elena

Published in

BioMed Central, BMC Genomics, 1(14), p. 7

DOI: 10.1186/1471-2164-14-7

Journal article published in 2013 by Fabio Gori, Susannah Green Tringe, Gianluigi Folino, Sacha A. F. T. van Hijum, Huub J. M. Op den Camp, Mike S. M. Jetten, Elena Marchiori

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

Background: Sequencing technologies have different biases in both single-genome and metagenomic sequencing; these biases can significantly affect ORF recovery and the apparent population distribution of a metagenome. In this paper we investigate how well different technologies represent information related to an organism of interest in a metagenome, and whether it is beneficial to combine information obtained using different technologies. We comparatively analyze three metagenomic datasets acquired from a sample containing the anammox bacterium Candidatus 'Brocadia fulgida' (B. fulgida). These datasets were obtained using Roche 454 FLX sequencing and Sanger sequencing with two different libraries (shotgun and fosmid).

Results: In each dataset, the abundance of the reads annotated to B. fulgida was much lower than the abundance expected from available cell count information. This was due to the overrepresentation of GC-richer organisms, as shown by the GC-content distribution of the reads. Nevertheless, by considering the union of B. fulgida reads over the three datasets, the number of B. fulgida ORFs recovered for at least 80% of their length was twice the number recovered by the best single technology. Indeed, while the taxonomic distributions of reads in the three datasets were similar, the respective sets of B. fulgida ORFs recovered for a large part of their length were highly different, and the depth-of-coverage patterns of 454 and Sanger were dissimilar.

Conclusions: Precautions should be taken to prevent the overrepresentation of GC-rich microbes in the datasets. This overrepresentation, together with the consistency of the taxonomic distributions of reads obtained with different sequencing technologies, suggests that, in general, abundance biases might be mainly due to other steps of the sequencing protocols. The results show that biases against organisms of interest could be compensated for by combining different sequencing technologies, owing to differences in their genome-level sequencing biases, even if the species was present at similar abundances in the metagenomes.
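
The "recovered for at least 80% of their length" criterion, and the effect of taking the union of reads across datasets, can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the interval representation of read alignments, and the toy data below are all assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation: each read alignment is a half-open
# interval [start, end) in the coordinates of the ORF it maps to.
Interval = Tuple[int, int]


def covered_length(intervals: List[Interval], orf_length: int) -> int:
    """Count ORF positions covered by at least one alignment."""
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end, 0)  # skip positions already counted
        end = min(end, orf_length)          # clip to the ORF
        if end > start:
            covered += end - start
            current_end = end
    return covered


def recovered_orfs(alignments: Dict[str, List[Interval]],
                   orf_lengths: Dict[str, int],
                   threshold: float = 0.8) -> Set[str]:
    """ORFs whose covered fraction is at least `threshold`."""
    return {
        orf for orf, ivs in alignments.items()
        if covered_length(ivs, orf_lengths[orf]) >= threshold * orf_lengths[orf]
    }


def pool(datasets: Iterable[Dict[str, List[Interval]]]) -> Dict[str, List[Interval]]:
    """Union of the alignments from several datasets, per ORF."""
    pooled: Dict[str, List[Interval]] = {}
    for alignments in datasets:
        for orf, ivs in alignments.items():
            pooled.setdefault(orf, []).extend(ivs)
    return pooled


# Toy data (invented, not taken from the paper).
orf_lengths = {"orfA": 100, "orfB": 200}
ds_454 = {"orfA": [(0, 50)], "orfB": [(0, 170)]}
ds_shotgun = {"orfA": [(45, 100)]}
ds_fosmid = {"orfB": [(150, 200)]}

for name, ds in [("454", ds_454), ("shotgun", ds_shotgun), ("fosmid", ds_fosmid)]:
    print(name, sorted(recovered_orfs(ds, orf_lengths)))
print("union", sorted(recovered_orfs(pool([ds_454, ds_shotgun, ds_fosmid]), orf_lengths)))
```

In the toy data, each dataset covers a different part of `orfA`, so no single dataset reaches the 80% threshold for it, while the pooled alignments do. This mirrors, in miniature, how dissimilar coverage patterns can make the union more informative than any one technology.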
