Oxford University Press (OUP), Briefings in Functional Genomics, 9(5-6), pp. 385-390
DOI: 10.1093/bfgp/elq021
Gene Set Enrichment (GSE) is a computational technique that determines whether an a priori defined set of genes shows statistically significant differential expression between two phenotypes. Currently, the gene sets used for GSE are derived from annotation or pathway databases, which often contain computationally predicted and unrepresentative data. Here, we propose a novel approach for generating comprehensive, biologically derived gene sets by applying machine learning techniques to gene expression data. These gene sets can be produced for specific tissues, developmental stages or environments. They provide a powerful and functionally meaningful way to mine genome-wide association and next-generation sequencing data in order to identify disease-associated variants and pathways.
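
To make the enrichment question concrete, the following is a minimal sketch of one common way to test a gene set: a one-sided hypergeometric (over-representation) test. It is not the method proposed in this article, and GSE implementations differ in the statistic they use. The gene identifiers, the counts and the function name `hypergeom_enrichment_p` are assumptions invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_set, n_de, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_universe: number of genes measured
    n_set:      number of genes in the a priori defined gene set
    n_de:       number of genes called differentially expressed
    n_overlap:  number of differentially expressed genes inside the set
    """
    max_overlap = min(n_set, n_de)
    total = comb(n_universe, n_de)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy data, hypothetical and for illustration only.
universe = [f"g{i}" for i in range(1, 21)]             # 20 measured genes
gene_set = {"g1", "g2", "g3", "g4", "g5"}              # candidate gene set
de_genes = {"g1", "g2", "g3", "g4", "g10", "g11"}      # differentially expressed

overlap = len(gene_set & de_genes)
p_value = hypergeom_enrichment_p(
    len(universe), len(gene_set), len(de_genes), overlap
)
print(f"overlap={overlap}, p={p_value:.4f}")
```

A small p-value suggests that the set contains more differentially expressed genes than expected by chance. In practice the test would be repeated across many gene sets and corrected for multiple testing.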