Published in

The Royal Society, Interface, 184(18), 2021

DOI: 10.1098/rsif.2021.0389

Links

Tools

Export citation

Search in Google Scholar

Ancestral sequences of a large promiscuous enzyme family correspond to bridges in sequence space in a network representation

This paper was not found in any repository, but could be made available legally by the author.
This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Red circle
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Evolutionary relationships of protein families can be characterized either by networks or by trees. Whereas trees allow for hierarchical grouping and reconstruction of the most likely ancestral sequences, networks lack a time axis but allow for thresholds of pairwise sequence identity to be chosen and, therefore, the clustering of family members with presumably more similar functions. Here, we use the large family of arylsulfatases and phosphonate monoester hydrolases to investigate similarities, strengths and weaknesses in tree and network representations. For varying thresholds of pairwise sequence identity, values of betweenness centrality and clustering coefficients were derived for nodes of the reconstructed ancestors to measure the propensity to act as a bridge in a network. Based on these properties, ancestral protein sequences emerge as bridges in protein sequence networks. Interestingly, many ancestral protein sequences appear close to extant sequences. Therefore, reconstructed ancestor sequences might also be interpreted as yet-to-be-identified homologues. The concept of ancestor reconstruction is compared to consensus sequences, too. It was found that hub sequences in a network, e.g. reconstructed ancestral sequences that are connected to many neighbouring sequences, share closer similarity with derived consensus sequences. Therefore, some reconstructed ancestor sequences can also be interpreted as consensus sequences.