Nature Research, Nature Reviews Genetics, 7(13), p. 469-483, 2012
DOI: 10.1038/nrg3242
Differential gene expression is the fundamental mechanism underlying animal development and cell differentiation. However, it is a challenge to identify comprehensively and accurately the DNA sequences required to regulate gene expression, called cis-regulatory modules (CRMs). Three major features (singly or in combination) are used to predict CRMs: clusters of transcription-factor binding-site motifs, noncoding DNA under evolutionary constraint, and biochemical marks associated with CRMs, such as histone modifications and protein occupancy. The validation rates for predictions indicate that identifying diagnostic biochemical marks is the most reliable method, and understanding is enhanced by analysis of motifs and conservation patterns within those predicted CRMs.
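The first of the three features, clustering of transcription-factor binding-site motifs, can be illustrated with a minimal sketch. It is not the method of any particular tool: it assumes motifs are given as exact strings (real predictors typically score position weight matrices), and the function names, the window size, and the hit threshold are all hypothetical choices made for illustration.

```python
import re


def find_motif_hits(sequence, motifs):
    """Return sorted start positions of every exact motif match (overlaps included)."""
    hits = []
    for motif in motifs:
        # A zero-width lookahead reports overlapping occurrences as well.
        for match in re.finditer(f"(?={re.escape(motif)})", sequence):
            hits.append(match.start())
    return sorted(hits)


def find_motif_clusters(sequence, motifs, window=200, min_hits=3):
    """Return (first_hit, last_hit, max_hits_in_window) for each motif-dense region.

    A region qualifies when at least `min_hits` motif starts fall within
    `window` bases of each other; overlapping qualifying regions are merged.
    """
    hits = find_motif_hits(sequence, motifs)
    clusters = []
    left = 0
    for right in range(len(hits)):
        # Shrink the window from the left until it spans fewer than `window` bases.
        while hits[right] - hits[left] >= window:
            left += 1
        n_hits = right - left + 1
        if n_hits >= min_hits:
            start, end = hits[left], hits[right]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous region: extend it instead of adding a new one.
                prev_start, _, prev_n = clusters[-1]
                clusters[-1] = (prev_start, end, max(prev_n, n_hits))
            else:
                clusters.append((start, end, n_hits))
    return clusters


# Toy sequence: three closely spaced GATA sites, then one isolated site.
toy_sequence = "GATA" + "A" * 5 + "GATA" + "A" * 5 + "GATA" + "C" * 300 + "GATA"
toy_clusters = find_motif_clusters(toy_sequence, ["GATA"], window=50, min_hits=3)
print(toy_clusters)
```

On the toy sequence, only the three closely spaced sites form a candidate region; the isolated fourth site does not. In practice such motif-based candidates would be combined with conservation and biochemical evidence, as the abstract describes.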