Published in

National Academy of Sciences, Proceedings of the National Academy of Sciences, 96(26), p. 15173-15177, 1999

DOI: 10.1073/pnas.96.26.15173

Genetic epidemiology of single-nucleotide polymorphisms

Journal article published in 1999 by A. Collins, C. Lonjou, N. E. Morton
This paper is made freely available by the publisher.

Abstract

On the causal hypothesis, most genetic determinants of disease are single-nucleotide polymorphisms (SNPs) that are likely to be selected as markers for positional cloning. On the proximity hypothesis, most disease determinants will not be included among markers but may be detected through linkage disequilibrium with other SNPs. In that event, allelic association among SNPs is an essential factor in positional cloning. Recent simulation based on monotonic population expansion suggests that useful association does not usually extend beyond 3 kb. This is contradicted by significant disequilibrium at much greater distances, with corresponding reduction in the number of SNPs required for a cost-effective genome scan. A plausible explanation is that cyclical expansions follow population bottlenecks that establish new disequilibria. Data on more than 1,000 locus pairs indicate that most disequilibria trace to the Neolithic, with no apparent difference between haplotypes that are random or selected through a major disease gene. Short duration may be characteristic of alleles contributing to disease susceptibility and haplotypes characteristic of particular ethnic groups. Alleles that are highly polymorphic in all ethnic groups may be older, neutral, or advantageous, in weak disequilibrium with nearby markers, and therefore less useful for positional cloning of disease genes. Significant disequilibrium at large distance makes the number of suitably chosen SNPs required for genome screening as small as 30,000, or 1 per 100 kb, with greater density (including less common SNPs) reserved for candidate regions.
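
The closing estimate of 30,000 SNPs at 1 per 100 kb can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a genome length of roughly 3,000,000 kb, a figure the abstract does not state, and it uses a hypothetical helper, `snps_for_scan`, that assumes evenly spaced markers.

```python
# Assumed human genome length in kilobases (about 3 Gb).
# This value is an assumption for illustration; the abstract does not give it.
GENOME_KB = 3_000_000


def snps_for_scan(spacing_kb: float, genome_kb: float = GENOME_KB) -> int:
    """Return the number of evenly spaced SNPs needed for one marker per `spacing_kb`."""
    if spacing_kb <= 0:
        raise ValueError("spacing_kb must be positive")
    return round(genome_kb / spacing_kb)


# One SNP per 100 kb, the density proposed in the abstract.
print(snps_for_scan(100))  # 30000

# One SNP per 3 kb, the range suggested by the monotonic-expansion simulation.
print(snps_for_scan(3))  # 1000000
```

Under these assumptions, useful disequilibrium at 100 kb rather than 3 kb lowers the marker count by a factor of about 33.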