Full text: Download
Large amounts of gene expression data from several different technologies are becoming available to the scientific community. A common practice is to use these data to calculate global gene coexpression for validation or integration of other "omic" data. To assess the utility of publicly available datasets for this purpose we have analyzed Homo sapiens data from 1202 cDNA microarray experiments, 242 SAGE libraries, and 667 Affymetrix oligonucleotide microarray experiments. The three datasets compared demonstrate significant but low levels of global concordance (rc<0.11). Assessment against Gene Ontology (GO) revealed that all three platforms identify more coexpressed gene pairs with common biological processes than expected by chance. As the Pearson correlation for a gene pair increased it was more likely to be confirmed by GO. The Affymetrix dataset performed best individually with gene pairs of correlation 0.9-1.0 confirmed by GO in 74% of cases. However, in all cases, gene pairs confirmed by multiple platforms were more likely to be confirmed by GO. We show that combining results from different expression platforms increases reliability of coexpression. A comparison with other recently published coexpression studies found similar results in terms of performance against GO but with each method producing distinctly different gene pair lists.