Estimating Genome-Wide Significance for Whole-Genome Sequencing Studies: Genome-Wide Significance for Rare Variants

Xu, ChangJiang; Tachmazidou, Ioanna; Walter, Klaudia; Ciampi, Antonio; Zeggini, Eleftheria; Greenwood, Celia M. T.

Published in

Wiley, Genetic Epidemiology, 38(4), p. 281-290, 2014

DOI: 10.1002/gepi.21797


Journal article published in 2014 by ChangJiang Xu, Ioanna Tachmazidou, Klaudia Walter, Antonio Ciampi, Eleftheria Zeggini, Celia M. T. Greenwood

This paper is made freely available by the publisher.


Abstract

Although a standard genome-wide significance level has been accepted for testing association between common genetic variants and disease, the era of whole-genome sequencing (WGS) requires a new threshold. The allele frequency spectrum of sequence-identified variants is very different from that of common variants, and the identified rare genetic variation is usually analyzed jointly in a series of genomic windows or regions. In nearby or overlapping windows, the test statistics will be correlated, and the degree of correlation is likely to depend on the choice of window size, overlap, and test statistic. Furthermore, multiple analyses may be performed using different windows or test statistics. Here we propose an empirical approach for estimating genome-wide significance thresholds for data arising from WGS studies, and we demonstrate that the empirical threshold can be efficiently estimated by extrapolating from calculations performed on a small genomic region. Because analysis of WGS data may need to be repeated with different choices of test statistics or windows, this prediction approach makes it computationally feasible to estimate genome-wide significance thresholds for different analysis choices. Based on UK10K whole-genome sequence data, we derive genome-wide significance thresholds ranging between 2.5 × 10⁻⁸ and 8 × 10⁻⁸ for our analytic choices in window-based testing, and thresholds of 0.6 × 10⁻⁸ to 1.5 × 10⁻⁸ for a combined analytic strategy of testing common variants using single-SNP tests together with rare variants analyzed with our sliding-window test strategy.
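
The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of one generic way an empirical threshold for correlated window tests can be obtained: simulate null replicates, record the smallest p-value in each, and take the lower alpha-quantile of those minima. It is not the authors' method. The function names, the toy moving-sum model for overlapping windows, and the family-wise error rate of 0.05 are all assumptions made for illustration.

```python
import math
import random


def min_p_per_replicate(n_windows, overlap, rng):
    """Smallest p-value over correlated window tests in one null replicate."""
    # Toy assumption: each window statistic is the scaled sum of `overlap`
    # adjacent independent N(0, 1) draws, so neighbouring windows share
    # draws and their statistics are correlated.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_windows + overlap - 1)]
    min_p = 1.0
    for i in range(n_windows):
        z = sum(noise[i:i + overlap]) / math.sqrt(overlap)
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        min_p = min(min_p, p)
    return min_p


def empirical_threshold(min_pvalues, alpha=0.05):
    """Lower alpha-quantile of the null distribution of the minimum p-value."""
    ordered = sorted(min_pvalues)
    k = max(int(math.floor(alpha * len(ordered))) - 1, 0)
    return ordered[k]


def effective_tests(threshold, alpha=0.05):
    """Number of independent tests implied by a threshold at level alpha."""
    return alpha / threshold


rng = random.Random(0)  # fixed seed so the sketch is reproducible
min_ps = [min_p_per_replicate(200, 5, rng) for _ in range(2000)]
threshold = empirical_threshold(min_ps)
print("empirical threshold for the toy region:", threshold)
print("implied effective number of tests:", effective_tests(threshold))
```

Under the same assumed error rate of 0.05, the reported window-based thresholds of 2.5 × 10⁻⁸ and 8 × 10⁻⁸ would correspond to roughly 2,000,000 and 625,000 effective independent tests, respectively. This conversion is a Bonferroni-style reading of the thresholds and is offered only as an aid to interpretation.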
