Published in

Wiley, Human Mutation: Variation, Informatics and Disease, 4(31), p. 414-420, 2010

DOI: 10.1002/humu.21199

An Expectation-Maximization Program for Determining Allelic Spectrum from CNV Data (CoNVEM): Insights into Population Allelic Architecture and Its Mutational History

Journal article published in 2010 by Tom R. Gaunt, Santiago Rodriguez, Philip A. I. Guthrie, Ian N. M. Day
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Copy number variations (CNVs) are a common form of genetic variation in which the allelic population contains a distribution of copy numbers of a particular gene (or other large sequence/region). The simplest forms describe deletion (0 vs. 1 copy) or duplication (1 vs. 2) events. However, some CNV loci contain a much wider range of copy numbers, such as that seen for the CCL3L1 locus. CNV classification methods typically only describe the total (diploid) copy number, leaving the underlying genotypic and allelic frequency distribution unknown. We have developed an expectation-maximization approach for the analysis of data from tandem CNVs that enables estimation of both the allelic copy number frequency distribution and the expected copy number genotype and class distribution under the Hardy-Weinberg equilibrium (HWE). The CNV expectation-maximization algorithm is available in a Web-tool (CoNVEM, http://apps.biocompute.org.uk/convem/), which graphically and numerically presents CNV allele and genotype distributions. We have applied this approach to the analysis of salivary amylase (AMY1A, B, and C), CCL3L1, and SULT1A1 CNVs using published data, and present inferences about the evolutionary history of these loci based on CoNVEM results.
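
The kind of estimation described in the abstract can be illustrated with a minimal expectation-maximization sketch in Python. This is not the CoNVEM implementation: the function names, the uniform starting frequencies, the convergence tolerance, and the input format (a mapping from total diploid copy number to the number of individuals) are all assumptions made for illustration. The only ideas taken from the abstract are that the observed data are diploid copy numbers, that allele copy number frequencies are the unknowns, and that genotype frequencies follow Hardy-Weinberg equilibrium.

```python
def em_allele_frequencies(diploid_counts, max_allele, n_iter=1000, tol=1e-12):
    """Estimate allele copy number frequencies from diploid copy number counts.

    diploid_counts: dict {total diploid copy number: number of individuals}
    max_allele: largest copy number allowed on one chromosome (an assumption
                the caller must supply; it is not inferred from the data)
    Returns a list p where p[i] is the estimated frequency of the i-copy allele.
    """
    k = max_allele + 1
    total = sum(diploid_counts.values())
    p = [1.0 / k] * k  # hypothetical starting point: uniform frequencies

    for _ in range(n_iter):
        expected = [0.0] * k  # expected number of chromosomes per allele
        for c, n in diploid_counts.items():
            # Ordered allele pairs (i, j) that sum to the observed total c.
            pairs = [(i, c - i) for i in range(k) if 0 <= c - i < k]
            weights = [p[i] * p[j] for i, j in pairs]  # HWE probabilities
            z = sum(weights)
            if z == 0.0:
                raise ValueError(f"copy number {c} is impossible under the model")
            # E-step: share the n individuals among compatible genotypes.
            for (i, j), w in zip(pairs, weights):
                expected[i] += n * w / z
                expected[j] += n * w / z
        # M-step: each individual contributes two chromosomes.
        new_p = [e / (2.0 * total) for e in expected]
        delta = max(abs(a - b) for a, b in zip(p, new_p))
        p = new_p
        if delta < tol:
            break
    return p


def expected_class_distribution(p):
    """Expected frequency of each diploid copy number class under HWE."""
    dist = [0.0] * (2 * len(p) - 1)
    for i, pi in enumerate(p):
        for j, pj in enumerate(p):
            dist[i + j] += pi * pj
    return dist


if __name__ == "__main__":
    # Made-up counts for a simple deletion locus (alleles with 0 or 1 copy).
    counts = {0: 4, 1: 32, 2: 64}
    freqs = em_allele_frequencies(counts, max_allele=1)
    print(freqs)
    print(expected_class_distribution(freqs))
```

The counts in the usage example are invented and do not come from the paper. For loci with a wider range of copy numbers, several genotypes can produce the same diploid total, which is where the iterative E-step matters; in that setting the result can depend on the starting frequencies and on the chosen `max_allele`, so a sketch like this should be treated as a teaching aid rather than a replacement for the published tool.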