Elsevier, Journal of Statistical Planning and Inference, 12(137), p. 3975-3989
DOI: 10.1016/j.jspi.2007.04.015
The increasing availability of high-throughput data, that is, massive quantities of molecular biology data arising from experiments such as gene expression or protein microarrays, creates a need for methods that summarize the available information. As annotation quality improves, it is becoming common to rely on biological annotation databases, such as the Gene Ontology (GO), to build functional profiles, which characterize a set of genes or proteins by the distribution of their annotations in the database. In this work we describe a statistical model for such profiles, provide methods to compare profiles, and develop inferential procedures to assess the comparison. An R package implementing the methods will be available at publication time.
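
As a rough illustration of what a functional profile is, the following minimal Python sketch computes, for a set of genes, the fraction annotated in each category, and then compares two such profiles. It is not the paper's R package or its statistical model: the function names, the toy gene identifiers, and the category labels are all hypothetical, and the squared Euclidean distance is used only as one plausible way to compare profiles, since the abstract does not name a specific measure.

```python
from collections import Counter


def functional_profile(annotations, categories):
    """Return the fraction of genes annotated in each category.

    annotations: dict mapping gene id -> set of category labels (toy input).
    categories: ordered list of category labels that defines the profile.
    """
    n_genes = len(annotations)
    if n_genes == 0:
        raise ValueError("need at least one gene")
    # Count each gene at most once per category.
    counts = Counter(cat for cats in annotations.values() for cat in set(cats))
    return [counts[cat] / n_genes for cat in categories]


def squared_euclidean(p, q):
    """Squared Euclidean distance between two profiles (illustrative choice)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical categories and gene sets, for illustration only.
categories = ["metabolism", "transport", "signaling"]
set_a = {
    "g1": {"metabolism"},
    "g2": {"metabolism", "transport"},
    "g3": {"signaling"},
    "g4": {"transport"},
}
set_b = {"g5": {"metabolism"}, "g6": {"signaling"}}

profile_a = functional_profile(set_a, categories)
profile_b = functional_profile(set_b, categories)
distance = squared_euclidean(profile_a, profile_b)
```

Because a gene can carry several annotations, the entries of a profile computed this way need not sum to one. Assessing whether an observed distance reflects a real difference between gene sets is the inferential question the paper addresses, and this sketch does not attempt it.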