Published in

Wiley, Journal of the Royal Statistical Society: Series B, 64(4), pp. 695-715, 2002

DOI: 10.1111/1467-9868.00357

Assessing population differentiation and isolation from single-nucleotide polymorphism data

This paper is made freely available by the publisher.

Abstract

Summary. We introduce a new hierarchical model for single-nucleotide polymorphism allele frequencies in a structured population, which is naturally fitted via Markov chain Monte Carlo methods. There is one parameter for each population, closely analogous to a population-specific version of Wright's F_ST, which can be interpreted as measuring how isolated the relevant population has been. Our model includes the effects of single-nucleotide polymorphism ascertainment and is motivated by population genetics considerations, explicitly in the transient setting after divergence of populations, rather than as the equilibrium of a stochastic model, as is traditionally the case. For the sizes of data set that we consider, the method provides good parameter estimates and considerably outperforms estimation methods analogous to those currently used in practice. We apply the method to one new and one existing human data set, each with rather different characteristics: the first consists of three rather close European populations, the second of four populations taken from across the globe. A novelty of our framework is that the fit of the underlying model can be assessed easily, and these results are encouraging for both data sets analysed. Our analysis suggests that Iceland is more differentiated than the other two European populations (France and Utah), a finding which is consistent with the historical record, but not obvious from comparisons of simple summary statistics.
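
The idea of one F_ST-like parameter per population can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's method: the abstract does not give the form of the model, so the sketch assumes a normal approximation to drift in which each population's allele frequency scatters around a shared ancestral frequency pi with variance c * pi * (1 - pi). The names c, simulate_frequencies and moment_estimate are hypothetical, frequencies are clipped to [0, 1] rather than properly truncated, and a simple moment estimate stands in for the Markov chain Monte Carlo fit. Sampling noise and ascertainment are ignored.

```python
import random


def simulate_frequencies(ancestral, c, rng):
    """Draw one population's allele frequencies around the ancestral ones.

    Assumed model (not taken from the abstract): normal scatter with
    variance c * pi * (1 - pi), clipped to the unit interval.
    """
    freqs = []
    for pi in ancestral:
        sd = (c * pi * (1.0 - pi)) ** 0.5
        alpha = rng.gauss(pi, sd)
        freqs.append(min(1.0, max(0.0, alpha)))
    return freqs


def moment_estimate(ancestral, freqs):
    """Average squared deviation from the ancestral frequency, scaled by pi * (1 - pi)."""
    scaled = [
        (alpha - pi) ** 2 / (pi * (1.0 - pi))
        for pi, alpha in zip(ancestral, freqs)
    ]
    return sum(scaled) / len(scaled)


rng = random.Random(0)

# Hypothetical ancestral frequencies for 5000 SNPs, kept away from 0 and 1
# so that clipping rarely matters.
ancestral = [rng.uniform(0.2, 0.8) for _ in range(5000)]

# Illustrative values only: population "B" is assumed to be more isolated than "A".
true_c = {"A": 0.01, "B": 0.04}

estimates = {
    name: moment_estimate(ancestral, simulate_frequencies(ancestral, c, rng))
    for name, c in true_c.items()
}

for name in true_c:
    print(name, "true c:", true_c[name], "estimate:", round(estimates[name], 4))
```

In this toy setting the ancestral frequencies are known, which is never true of real data. A hierarchical model treats them as unknowns to be inferred jointly with the per-population parameters, which is one reason a sampling-based fit is natural.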