Published in

Cell Press, American Journal of Human Genetics, 97(4), p. 576-592

DOI: 10.1016/j.ajhg.2015.09.001




Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores

Journal article published in 2015 by Schizophrenia Working Group of the Psychiatric Genomics Consortium, Ripke S., BM Neale, Bjarni J. Vilhjálmsson, Jian Yang, Hilary Kiyo Finucane, Alexander Gusev, Sara Lindström, Stephan Ripke, Giulio Genovese, JT Walters, Po-Ru Loh, KH Farh, Gaurav Bhatia, Ron Do, Tristan Hayeck, and other authors.
This paper is made freely available by the publisher.



Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
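
As an illustration of the contrast drawn in the abstract, the sketch below places the two ideas side by side on summary statistics. It is not the LDpred software. The function names, the thresholds `r2_max` and `p_max`, and the toy numbers are all hypothetical. The first function is one common variant of pruning followed by thresholding (greedy, ordered by p value). The second assumes standardized genotypes and a Gaussian (infinitesimal) prior on effect sizes, under which the posterior mean is approximately a ridge-like solve against the LD matrix. The abstract only states that a prior on effect sizes is used, so the Gaussian form is a simplifying assumption here.

```python
import numpy as np


def prune_and_threshold(beta_hat, p_values, ld, r2_max=0.2, p_max=0.05):
    """Greedy LD pruning followed by p value thresholding (hypothetical helper).

    Markers are visited from most to least significant. A marker is kept only
    if its p value passes the threshold and its squared correlation with every
    marker already kept is below r2_max. All other weights are set to zero.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    kept = []
    for j in np.argsort(p_values):
        if p_values[j] > p_max:
            break  # remaining markers are even less significant
        if all(ld[j, k] ** 2 < r2_max for k in kept):
            kept.append(j)
    weights = np.zeros_like(beta_hat)
    kept = np.array(kept, dtype=int)
    weights[kept] = beta_hat[kept]
    return weights


def infinitesimal_posterior_mean(beta_hat, ld, n, h2):
    """Approximate posterior mean effects under a Gaussian prior (assumption).

    With M markers, training sample size n, and heritability h2, the marginal
    estimates are adjusted jointly by solving (ld + M / (n * h2) * I) x = beta_hat.
    Every marker keeps a (shrunken) weight; none are discarded.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    m = len(beta_hat)
    shrink = m / (n * h2)
    return np.linalg.solve(ld + shrink * np.eye(m), beta_hat)


# Toy data (made up): markers 0 and 1 are in strong LD, marker 2 is independent.
ld = np.array([[1.0, 0.9, 0.0],
               [0.9, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
beta_hat = np.array([0.10, 0.09, 0.02])
p_values = np.array([1e-6, 1e-5, 0.3])

pt_weights = prune_and_threshold(beta_hat, p_values, ld)
pm_weights = infinitesimal_posterior_mean(beta_hat, ld, n=1000, h2=0.5)
```

In this toy case the pruning route keeps a single marker and zeroes the other two, which is the loss of information the abstract refers to, while the posterior-mean route assigns a weight to every marker after accounting for their correlation.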