Published in

Nature Research, Nature Methods, 1(13), p. 63-65, 2015

DOI: 10.1038/nmeth.3654

Efficient genotype compression and analysis of large genetic-variation data sets

Journal article published in 2015 by Ryan M. Layer, Neil Kindlon, Konrad J. Karczewski, Aaron R. Quinlan
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Genotype Query Tools (GQT) is a new indexing strategy that expedites analyses of genome-variation data sets in VCF format based on sample genotypes, phenotypes and relationships. GQT’s compressed genotype index minimizes decompression for analysis, and performance relative to existing methods improves with cohort size. We show substantial (up to 443-fold) performance gains over existing methods and demonstrate GQT’s utility for exploring massive data sets involving thousands to millions of genomes.
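
The abstract does not describe the index layout, so the following is only a minimal sketch of the general idea of a sample-oriented genotype index, not GQT's implementation. It assumes one bitmap per sample and genotype, stored as a plain Python integer with no compression; the genotype codes and the names `build_index` and `variants_shared` are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical genotype codes used only in this sketch.
HOM_REF, HET, HOM_ALT = 0, 1, 2


def build_index(genotypes: Sequence[Sequence[int]]) -> Dict[int, List[int]]:
    """Build per-sample bitmaps from a variant-by-sample genotype matrix.

    genotypes[v][s] is the genotype code of sample s at variant v.
    The result maps each genotype code to a list with one integer per
    sample; bit v of that integer is set when the sample carries the
    genotype at variant v.
    """
    n_samples = len(genotypes[0]) if genotypes else 0
    index = {g: [0] * n_samples for g in (HOM_REF, HET, HOM_ALT)}
    for v, row in enumerate(genotypes):
        for s, g in enumerate(row):
            index[g][s] |= 1 << v
    return index


def variants_shared(index: Dict[int, List[int]], genotype: int,
                    samples: Sequence[int], n_variants: int) -> List[int]:
    """Return the variants at which every listed sample has `genotype`."""
    mask = (1 << n_variants) - 1  # start with all variants selected
    for s in samples:
        mask &= index[genotype][s]  # one bitwise AND per sample
    return [v for v in range(n_variants) if (mask >> v) & 1]


if __name__ == "__main__":
    # Four variants (rows) by three samples (columns), made-up data.
    demo = [
        [HET, HET, HOM_REF],
        [HET, HOM_REF, HOM_ALT],
        [HET, HET, HET],
        [HOM_REF, HOM_ALT, HOM_ALT],
    ]
    idx = build_index(demo)
    print(variants_shared(idx, HET, [0, 1], len(demo)))
```

In this layout a query over a subset of samples touches only those samples' bitmaps and combines them with bitwise operations, which is one way a genotype-centric query can avoid scanning every variant record.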