Effect of Population Stratification on Case-Control Association Studies

Gorroochurn, Prakash; Zhang, Junying; Hodge, Susan E.; Heiman, Gary A.; Greenberg, David A.

Published in

Karger Publishers, Human Heredity, 1(58), p. 30-39, 2004

DOI: 10.1159/000081454

Karger Publishers, Human Heredity, 1(58), p. 40-48

DOI: 10.1159/000081455


Journal article published in 2004 by Prakash Gorroochurn, Junying Zhang, Susan E. Hodge, Gary A. Heiman, David A. Greenberg

This paper is made freely available by the publisher.


Abstract

Objectives: This is the first of two articles discussing the effect of population stratification (PS) on the type I error rate (i.e., the false positive rate). This paper focuses on the confounding risk ratio (CRR). It is accepted that PS can produce false positive results in case-control genetic association studies. However, it is unknown which values of the population parameters lead to an increase in the type I error rate. Some believe that PS does not represent a serious concern [1, 2], whereas others believe that PS may contribute to contradictory findings in genetic association studies [3]. We used computer simulations to estimate the effect of PS on the type I error rate over a wide range of disease frequencies and marker allele frequencies, and we compared the observed type I error rate to the magnitude of the CRR.

Methods: We simulated two populations and mixed them to produce a combined population, specifying 160 different combinations of input parameters (disease prevalences and marker allele frequencies in the two populations). From the combined populations, we selected 5000 case-control datasets, each with 50, 100, or 300 cases and controls, and determined the type I error rate. In all simulations, the marker allele and disease were independent (i.e., no association).

Results: The type I error rate is not substantially affected by changes in the disease prevalence per se. We found that the CRR is a relatively poor indicator of the magnitude of the increase in type I error rate. We also derived a simple mathematical quantity, Δ, that is highly correlated with the type I error rate. In the companion article (part II, in this issue) [4], we extend this work to multiple subpopulations and unequal sampling proportions.

Conclusion: Based on these results, realistic combinations of disease prevalences and marker allele frequencies can substantially increase the probability of finding false evidence of marker-disease associations. Furthermore, the CRR does not indicate when this will occur.
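
The simulation design described under Methods can be illustrated with a minimal sketch. It is not the authors' code. Several choices here are assumptions not stated in the abstract: individuals are sampled directly from the conditional distributions rather than from an explicit finite population, each individual carries two marker alleles drawn independently within a subpopulation, the test is a 1-df chi-square on a 2x2 allele-count table at the 5% level, and all names (type1_error_rate, k1, k2, p1, p2, m) are hypothetical. Within each subpopulation the marker is independent of disease, so every rejection is a false positive.

```python
import random

CHI2_CRIT_1DF = 3.841  # 5% critical value of chi-square with 1 df (assumed test level)


def prob_pop1(m, k1, k2, affected):
    """P(individual comes from subpopulation 1 | disease status).

    m is the mixing proportion of subpopulation 1; k1, k2 are disease prevalences.
    """
    a = m * (k1 if affected else 1.0 - k1)
    b = (1.0 - m) * (k2 if affected else 1.0 - k2)
    return a / (a + b)


def allele_count(n, w1, p1, p2, rng):
    """Count marker alleles among n individuals (two alleles each).

    Each individual comes from subpopulation 1 with probability w1;
    alleles are drawn with that subpopulation's frequency (p1 or p2).
    """
    count = 0
    for _ in range(n):
        p = p1 if rng.random() < w1 else p2
        count += (rng.random() < p) + (rng.random() < p)
    return count


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def type1_error_rate(k1, k2, p1, p2, n=100, m=0.5, reps=2000, seed=1):
    """Fraction of simulated case-control datasets that reject 'no association'."""
    rng = random.Random(seed)
    w_case = prob_pop1(m, k1, k2, affected=True)
    w_ctrl = prob_pop1(m, k1, k2, affected=False)
    rejections = 0
    for _ in range(reps):
        a = allele_count(n, w_case, p1, p2, rng)  # marker alleles in cases
        c = allele_count(n, w_ctrl, p1, p2, rng)  # marker alleles in controls
        if chi2_2x2(a, 2 * n - a, c, 2 * n - c) > CHI2_CRIT_1DF:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    print("no stratification:", type1_error_rate(0.05, 0.05, 0.3, 0.3))
    print("stratified:       ", type1_error_rate(0.01, 0.10, 0.1, 0.5))
```

When the two subpopulations share the same prevalence and allele frequency, the rejection rate should stay near the nominal 5%. When both differ, cases and controls are drawn from the subpopulations in different proportions, and the rejection rate rises even though no true association exists.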
