Effect of sequence depth and length in long-read assembly of the maize inbred NC358

Ou, Shujun; Liu, Jianing; Chougule, Kapeel M.; Fungtammasan, Arkarachai; Seetharam, Arun S.; Stein, Joshua C.; Llaca, Victor; Manchanda, Nancy; Gilbert, Amanda M.; Wei, Sharon; Chin, Chen-Shan; Hufnagel, David E.; Pedersen, Sarah; Snodgrass, Samantha J.; Fengler, Kevin; Woodhouse, Margaret; Walenz, Brian P.; Koren, Sergey; Phillippy, Adam M.; Hannigan, Brett T.; Dawe, R. Kelly; Hirsch, Candice N.; Hufford, Matthew B.; Ware, Doreen

Published in

Nature Research, Nature Communications, 1(11), 2020

DOI: 10.1038/s41467-020-16037-7

Journal article published in 2020 by Shujun Ou, Jianing Liu, Kapeel M. Chougule, Arkarachai Fungtammasan, Arun S. Seetharam, Joshua C. Stein, Victor Llaca, Nancy Manchanda, Amanda M. Gilbert, Sharon Wei, Chen-Shan Chin, David E. Hufnagel, Sarah Pedersen, Samantha J. Snodgrass, Kevin Fengler and other authors.

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving forbidden

Published version: archiving allowed

Abstract

Improvements in long-read data and scaffolding technologies have enabled rapid generation of reference-quality assemblies for complex genomes. Still, an assessment of critical sequence depth and read length is important for allocating limited resources. To this end, we have generated eight assemblies for the complex genome of the maize inbred line NC358 using PacBio datasets ranging from 20× to 75× genomic depth and with N50 subread lengths of 11–21 kb. Assemblies with ≤30× depth and N50 subread length of 11 kb are highly fragmented, with even low-copy genic regions showing degradation at 20× depth. Distinct sequence-quality thresholds are observed for complete assembly of genes, transposable elements, and highly repetitive genomic features such as telomeres, heterochromatic knobs, and centromeres. In addition, we show high-quality optical maps can dramatically improve contiguity in even our most fragmented base assembly. This study provides a useful resource allocation reference to the community as long-read technologies continue to mature.
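
The two quantities the abstract varies, genomic depth and N50 subread length, can both be derived from a list of read lengths. The following is a minimal sketch, assuming the commonly used definitions: depth is total sequenced bases divided by genome size, and N50 is the length at which reads of that length or longer contain at least half of all bases. The function names and the toy values are illustrative and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Assumed definition: the largest length L such that reads of
    length >= L together contain at least half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def genomic_depth(lengths, genome_size):
    """Return average depth: total sequenced bases / genome size."""
    return sum(lengths) / genome_size


# Toy example with hypothetical values (not data from the study).
toy_reads = [2, 3, 4, 5, 6]          # read lengths in bases
toy_genome_size = 10                 # genome size in bases
print(n50(toy_reads))                # 5
print(genomic_depth(toy_reads, toy_genome_size))  # 2.0
```

Note that two datasets can share the same depth while differing in N50, which is why the study treats depth and read length as separate variables.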
