Ultrafast and scalable variant annotation and prioritization with big functional genomics data

Huang, Dandan; Yi, Xianfu; Zhou, Yao; Yao, Hongcheng; Xu, Hang; Wang, Jianhua; Zhang, Shijie; Nong, Wenyan; Wang, Panwen; Shi, Lei; Xuan, Chenghao; Li, Miaoxin; Wang, Junwen; Li, Weidong; Kwan, Hoi Shan; Sham, Pak Chung; Wang, Kai; Li, Mulin Jun

Published in

Cold Spring Harbor Laboratory Press, Genome Research, 30(12), p. 1789-1801, 2020

DOI: 10.1101/gr.267997.120

Abstract

The advances of large-scale genomics studies have enabled compilation of cell type–specific, genome-wide DNA functional elements at high resolution. With the growing volume of functional annotation data and sequencing variants, existing variant annotation algorithms lack the efficiency and scalability to process big genomic data, particularly when annotating whole-genome sequencing variants against a huge database with billions of genomic features. Here, we develop VarNote to rapidly annotate genome-scale variants in large and complex functional annotation resources. Equipped with a novel index system and a parallel random-sweep searching algorithm, VarNote shows substantial performance improvements (two to three orders of magnitude) over existing algorithms at different scales. It supports both region-based and allele-specific annotations and introduces advanced functions for the flexible extraction of annotations. By integrating massive base-wise and context-dependent annotations in the VarNote framework, we introduce three efficient and accurate pipelines to prioritize the causal regulatory variants for common diseases, Mendelian disorders, and cancers.
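
To make the distinction between region-based and allele-specific annotation concrete, the following is a minimal Python sketch. It is not VarNote's index or its random-sweep search, neither of which is described in this record; it uses a plain sorted list with binary search as a stand-in. The `Feature` type, the function names, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Feature(NamedTuple):
    """Hypothetical annotation record on a single chromosome (1-based, inclusive)."""
    start: int
    end: int
    label: str
    ref: Optional[str] = None  # set only for allele-specific records
    alt: Optional[str] = None


def region_hits(features: List[Feature], starts: List[int], pos: int) -> List[Feature]:
    """Region-based lookup: every feature whose interval covers `pos`.

    `features` must be sorted by start, and `starts` must hold those starts.
    """
    hi = bisect_right(starts, pos)  # features[:hi] all have start <= pos
    return [f for f in features[:hi] if f.end >= pos]


def allele_hits(features: List[Feature], starts: List[int],
                pos: int, ref: str, alt: str) -> List[Feature]:
    """Allele-specific lookup: same position and matching ref/alt alleles."""
    return [
        f for f in region_hits(features, starts, pos)
        if f.start == f.end == pos and f.ref == ref and f.alt == alt
    ]


# Toy data (assumed, not taken from the paper).
features = sorted(
    [
        Feature(100, 200, "enhancer"),
        Feature(150, 150, "score=0.9", ref="A", alt="G"),
        Feature(150, 150, "score=0.1", ref="A", alt="T"),
        Feature(300, 400, "promoter"),
    ],
    key=lambda f: f.start,
)
starts = [f.start for f in features]

region_labels = [f.label for f in region_hits(features, starts, 150)]
allele_labels = [f.label for f in allele_hits(features, starts, 150, "A", "G")]
```

In this toy example a variant at position 150 overlaps three records under region-based matching, but only the record whose alleles are A>G under allele-specific matching. The scan over `features[:hi]` is linear in the number of earlier features, so this sketch illustrates the two matching modes only and says nothing about the performance the abstract reports.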