Cell Press, Trends in Genetics, 30(9), p. 390-400
DOI: 10.1016/j.tig.2014.07.004
Gene set analysis (GSA) is a promising tool for uncovering the polygenic effects associated with complex diseases. However, the available techniques reflect a wide variety of hypotheses about how genetic effects interact to contribute to disease susceptibility. The lack of consensus about the best way to perform GSA has led to confusion in the field and has made it difficult to compare results across methods. A clear understanding of the various choices made during GSA—such as how gene sets are defined, how SNPs are assigned to genes, and how individual SNP-level effects are aggregated to produce gene- or pathway-level effects—will improve the interpretability and comparability of results across methods and studies. In this review, we provide an overview of the various data sources used to construct gene sets and the statistical methods used to test for gene set association, as well as provide guidelines for ensuring the comparability of results.