Published in

Oxford University Press, Systematic Biology, 45(4), p. 516-523, 1996

DOI: 10.1093/sysbio/45.4.516

Oxford University Press (OUP), Systematic Biology, 45(4), p. 516

DOI: 10.2307/2413528


Accuracy of Neighbor Joining for n-Taxon Trees

Journal article published in 1996 by Korbinian Strimmer, Arndt von Haeseler
This paper is made freely available by the publisher.


Abstract

A Monte Carlo approach was used to estimate the accuracy of a given tree reconstruction method for any number of taxa. In this procedure, we sampled randomly over all possible bifurcating trees, assigning substitution rates (branch lengths) to each edge from an exponential distribution to obtain a biologically sensible maximal observed distance. Three different sets of trees were studied: the unrestricted tree space, the biologically meaningful tree space as introduced by Nei et al. (1995, Science 267:253–254), and the population data tree space. We used this technique to elucidate the performance of neighbor joining as a function of the number of taxa, assuming that distances are uncorrected and sequences evolve according to the Jukes–Cantor model. The accuracy of neighbor joining decreases almost exponentially with the number of taxa. However, the rate of decrease depends on the tree space studied. Although the accuracy decreases towards zero, the similarity, i.e., the number of partitions that are identical between the model tree and the reconstructed tree, is in all cases studied much higher than the value expected for two randomly chosen trees. Although the probability of recovering the true tree is dramatically influenced by sequence length, the average similarity does not decrease substantially if branch lengths are not too short.
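
The sampling and scoring steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions. Unrooted bifurcating topologies are drawn uniformly by inserting taxa one at a time on a randomly chosen edge. Branch lengths come from an exponential distribution whose mean (`mean_length`) is an arbitrary placeholder, not a value from the paper. Similarity is counted as the number of shared non-trivial partitions. All function names are hypothetical. The sketch covers only the unrestricted tree space and does not simulate sequences or run neighbor joining.

```python
import math
import random


def random_unrooted_tree(n_taxa, mean_length=0.1, rng=random):
    """Sample an unrooted bifurcating tree as {(u, v): branch_length}.

    Leaves are numbered 0..n_taxa-1, internal nodes from n_taxa upward.
    Adding each new leaf to a uniformly chosen edge gives every labeled
    topology the same probability.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    edges = [(n_taxa, leaf) for leaf in range(3)]  # three-taxon star
    next_internal = n_taxa + 1
    for leaf in range(3, n_taxa):
        u, v = edges.pop(rng.randrange(len(edges)))  # edge to subdivide
        w = next_internal
        next_internal += 1
        edges += [(u, w), (w, v), (w, leaf)]
    # Exponential branch lengths; mean_length is an assumed placeholder.
    return {edge: rng.expovariate(1.0 / mean_length) for edge in edges}


def nontrivial_splits(tree, n_taxa):
    """Return the partitions induced by internal edges.

    Each partition is stored as the frozenset of leaves on the side
    that does not contain leaf 0.
    """
    adj = {}
    for u, v in tree:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    found = set()

    def leaves_below(node, parent):
        if node < n_taxa:  # leaf
            return frozenset([node])
        below = frozenset().union(
            *(leaves_below(child, node) for child in adj[node] if child != parent)
        )
        if 2 <= len(below) <= n_taxa - 2:
            found.add(below)
        return below

    leaves_below(adj[0][0], 0)  # treat leaf 0 as the root
    return found


def similarity(tree_a, tree_b, n_taxa):
    """Number of non-trivial partitions shared by two trees."""
    return len(nontrivial_splits(tree_a, n_taxa) & nontrivial_splits(tree_b, n_taxa))


def expected_p_distance(d):
    """Expected uncorrected distance under Jukes-Cantor for d substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


if __name__ == "__main__":
    rng = random.Random(1)
    n = 10
    model = random_unrooted_tree(n, rng=rng)
    other = random_unrooted_tree(n, rng=rng)
    print("shared partitions with itself:", similarity(model, model, n))
    print("shared partitions with a random tree:", similarity(model, other, n))
```

A tree with n taxa has n - 3 non-trivial partitions, so a reconstruction counts as fully accurate only when the similarity reaches n - 3. To estimate accuracy as described in the abstract, one would repeat the sampling many times, simulate sequences along each model tree, reconstruct a tree from the uncorrected distances, and average the two scores.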