Published in

Springer (part of Springer Nature), Journal of Molecular Evolution, 41(6)

DOI: 10.1007/bf00173194

A method for determining the position and size of optimal sequence regions for phylogenetic analysis

This paper is available in a repository.

Abstract

The availability of fast and accurate sequencing procedures, along with the use of PCR, has led to a proliferation of studies of molecular variability in populations. Nevertheless, it is often impractical to examine long genomic stretches and a large number of individuals at the same time. To optimize this kind of study, we suggest a heuristic procedure for detecting the shortest region whose information content can be considered sufficient for significant phylogenetic reconstruction. The method uses a simple index to compare the pairwise genetic distances obtained from a set of reference sequences with those obtained for windows of variable size and position. We also present an approach for testing whether the information content of the stretches selected in this way differs significantly from that of the larger genomic regions used as reference. Applying this test to the VP1 protein gene of foot-and-mouth disease virus type C allowed us to define optimal stretches whose information content is not significantly different from that of the complete VP1 sequence. The predictions made for type C sequences also hold for type O sequences, indicating that the results of the procedure are consistent.
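The window comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure or the index, so this example makes two assumptions: distances are plain p-distances (the proportion of differing sites), and the index is the Pearson correlation between full-length and window distances. The function names and the 0.95 threshold are hypothetical, and the significance test mentioned in the abstract is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs, start=0, end=None):
    """Pairwise p-distances over alignment columns [start, end)."""
    return [
        p_distance(a[start:end], b[start:end])
        for a, b in combinations(seqs, 2)
    ]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def shortest_informative_window(seqs, threshold=0.95, min_size=1):
    """Return (start, size, score) for the shortest window whose distances
    track the full-length distances with score >= threshold, or None.

    The score is an assumed stand-in index (Pearson correlation),
    not the index defined in the paper.
    """
    length = len(seqs[0])
    reference = pairwise_distances(seqs)
    for size in range(min_size, length + 1):
        # Score every position of a window of this size; keep the best.
        score, start = max(
            (
                (pearson(reference, pairwise_distances(seqs, s, s + size)), s)
                for s in range(length - size + 1)
            ),
            key=lambda pair: pair[0],
        )
        if score >= threshold:
            return start, size, score
    return None
```

On a toy alignment whose only variable sites are columns 4 to 6, for example `["AAAAAAAAAAAA", "AAAACAAAAAAA", "AAAACGAAAAAA", "AAAACGTAAAAA"]`, the search settles on the three-column window starting at column 4, because no shorter window reproduces the full-length distance pattern under this index.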