Published in

MDPI, Proceedings of the Royal Society of Victoria, 1(50), p. 117, 2020

DOI: 10.3390/proceedings2020050117

VANIR—NextFlow Pipeline for Viral Variant Calling and de Novo Assembly of Nanopore and Illumina Reads for High-Quality dsDNA Viral Genomes

Journal article published in 2020 by Joan Martí-Carreras and Piet Maes
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Human cytomegalovirus (HCMV), like other herpesviruses and dsDNA viruses, possesses unique properties derived from its genome architecture. The HCMV genome is composed of two unique domains: long (L) and short (S). Each domain contains a central unique region (U; thus, UL and US, respectively) flanked by two repeated regions (thus, TRL/IRL and TRS/IRS). Recombination between the repeated regions is possible, yielding four possible genomic isomers, which are found in equimolar proportions in any infectious viral population. Frequent recombination and an altered selective landscape can give rise to the persistence, if not fixation, of diverse variants in cultured HCMV isolates. This phenomenon has already been described in the AD169 and Towne strains, two commonly used viral strains that carry a 10 kbp deletion (ΔUL/b’). Other dsDNA viruses are also known for their structural rearrangements and frequent recombination. VANIR (viral variant calling and de novo assembly using nanopore and Illumina reads) is a novel analysis pipeline that combines short-read (Illumina) and long-read (Oxford Nanopore Technologies Ltd.) sequencing technologies to assemble high-quality dsDNA viral genomes and detect variants. Illumina and nanopore sequencing provide complementary information for assembly and variant discovery. Assembly contiguity and the calling of structural variants and repeats are greatly improved by nanopore read length, whereas base-calling and base confidence are improved by the lower error rate and higher yield of Illumina reads. This specialized bioinformatic analysis pipeline is implemented in the NextFlow pipeline manager and containerized in a Singularity image. This set-up allows for improved traceability, reproducibility, portability, and speed. Through VANIR, novel point mutations and structural genome rearrangements are called from sequencing data, benefiting diversity research on attenuated lab strains and wild-type viruses.
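
The four genomic isomers mentioned in the abstract can be illustrated with a minimal Python sketch. It is not part of VANIR. It assumes that each isomer corresponds to the L and S domains each being present in either forward or inverted (reverse-complemented) orientation. The function names, the labels, and the short toy sequences are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Dict

# Translation table for complementing DNA bases (illustrative helper, not from VANIR).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def genomic_isomers(long_domain: str, short_domain: str) -> Dict[str, str]:
    """Enumerate the four L/S orientation combinations.

    Assumption: an isomer is modelled as the L domain followed by the
    S domain, each either in forward or inverted orientation.
    """
    isomers = {}
    for invert_l, invert_s in product((False, True), repeat=2):
        l_part = reverse_complement(long_domain) if invert_l else long_domain
        s_part = reverse_complement(short_domain) if invert_s else short_domain
        label = f"L{'inv' if invert_l else 'fwd'}-S{'inv' if invert_s else 'fwd'}"
        isomers[label] = l_part + s_part
    return isomers


# Toy domains: a unique region flanked by inverted repeats.
# The repeats are reverse complements of each other, so they stay in place
# when the domain is inverted, while the unique region flips.
toy_l = "AAC" + "GGGTTT" + "GTT"  # TRL + UL + IRL (hypothetical sequences)
toy_s = "CAT" + "CCGA" + "ATG"    # IRS + US + TRS (hypothetical sequences)

isomers = genomic_isomers(toy_l, toy_s)
for label, sequence in isomers.items():
    print(label, sequence)
```

Because the flanking repeats in the toy data are inverted copies of each other, inverting a domain changes only its unique region. This is also why such isomers are hard to resolve with short reads alone: only reads that span a repeat reveal the orientation of the unique region.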