Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics

Tao, Qiqing; Barba-Montoya, Jose; Huuki, Louise A.; Durnan, Mary Kathleen; Kumar, Sudhir

Published in

Oxford University Press, Molecular Biology and Evolution, 6(37), p. 1819-1831, 2020

DOI: 10.1093/molbev/msaa049

Tools

Export citation

Search in Google Scholar

Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics

Journal article published in 2020 by Qiqing Tao

, Jose Barba-Montoya, Louise A. Huuki

, Mary Kathleen Durnan, Sudhir Kumar

This paper is made freely available by the publisher.

Full text: Download

Preprint: archiving allowed

Upload

Postprint: archiving allowed

Upload

Published version: archiving allowed

Upload

Policy details

Data provided by

Abstract

Abstract The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is yet to be quantified for contemporary data sets that frequently contain sequences from many species and genes. In a reanalysis of many large multispecies alignments from diverse groups of taxa, we found that the use of the simplest models can produce divergence time estimates and credibility intervals similar to those obtained from the complex models applied in the original studies. This result is surprising because the use of simple models underestimates sequence divergence for all the data sets analyzed. We found three fundamental reasons for the observed robustness of time estimates to model complexity in many practical data sets. First, the estimates of branch lengths and node-to-tip distances under the simplest model show an approximately linear relationship with those produced by using the most complex models applied on data sets with many sequences. Second, relaxed clock methods automatically adjust rates on branches that experience considerable underestimation of sequence divergences, resulting in time estimates that are similar to those from complex models. And, third, the inclusion of even a few good calibrations in an analysis can reduce the difference in time estimates from simple and complex models. The robustness of time estimates to model complexity in these empirical data analyses is encouraging, because all phylogenomics studies use statistical models that are oversimplified descriptions of actual evolutionary substitution processes.

Published in

Links

Tools

Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics

Abstract