Dissemin is shutting down on January 1st, 2025

Published in

Elsevier, Molecular Phylogenetics and Evolution, 2(57), p. 888-898, 2010

DOI: 10.1016/j.ympev.2010.08.006

Links

Tools

Export citation

Search in Google Scholar

Rapid and accurate species tree estimation for phylogeographic investigations using replicated subsampling

Journal article published in 2010 by Sarah Hird ORCID, Laura Kubatko, Bryan Carstens
This paper is available in a repository.
This paper is available in a repository.

Full text: Download

Green circle
Preprint: archiving allowed
Orange circle
Postprint: archiving restricted
Red circle
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

We describe a method for estimating species trees that relies on replicated subsampling of large data matrices. One application of this method is phylogeographic research, which has long depended on large datasets that sample intensively from the geographic range of the focal species; these datasets allow systematicists to identify cryptic diversity and understand how contemporary and historical landscape forces influence genetic diversity. However, analyzing any large dataset can be computationally difficult, particularly when newly developed methods for species tree estimation are used. Here we explore the use of replicated subsampling, a potential solution to the problem posed by large datasets, with both a simulation study and an empirical analysis. In the simulations, we sample different numbers of alleles and loci, estimate species trees using STEM, and compare the estimated to the actual species tree. Our results indicate that subsampling three alleles per species for eight loci nearly always results in an accurate species tree topology, even in cases where the species tree was characterized by extremely rapid divergence. Even more modest subsampling effort, for example one allele per species and two loci, was more likely than not (>50%) to identify the correct species tree topology, indicating that in nearly all cases, computing the majority-rule consensus tree from replicated subsampling provides a good estimate of topology. These results were supported by estimating the correct species tree topology and reasonable branch lengths for an empirical 10-locus great ape dataset.