Domain size distributions can predict domain boundaries

Wheelan, S. J.; Marchler-Bauer, Aron; Bryant, Stephen H.

Published in

Oxford University Press (OUP), Bioinformatics, 16(7), p. 613-618

DOI: 10.1093/bioinformatics/16.7.613

Journal article published in 2000 by S. J. Wheelan, Aron Marchler-Bauer, Stephen H. Bryant

This paper is made freely available by the publisher.

Abstract

Motivation: The sizes of protein domains observed in the 3D-structure database follow a surprisingly narrow distribution. Structural domains are furthermore formed from a single-chain continuous segment in over 80% of instances. These observations imply that some choices of domain boundaries on an otherwise uncharacterized sequence are more likely than others, based solely on the size and segment number of predicted domains. This property might be used to guess the locations of protein domain boundaries.

Results: To test this possibility we enumerate putative domain boundaries and calculate their relative likelihood under a probability model that considers only the size and segment number of predicted domains. We ask, in a cross-validated test using sequences with known 3D structure, whether the most likely guesses agree with the observed domain structure. We find that domain boundary predictions are surprisingly successful for sequences up to 400 residues long and that guessing domain boundaries in this way can improve the sensitivity of threading analysis.

Availability: The DGS algorithm, for 'Domain Guess by Size', is available as a web service at http://www.ncbi.nlm.nih.gov/dgs. This site also provides the DGS source code.
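The idea of enumerating boundaries and ranking them by size alone can be illustrated with a minimal sketch. This is not the DGS implementation. It assumes a Gaussian placeholder for the domain size distribution (the `mean` and `sd` values are illustrative, not taken from the paper), considers only two-domain splits in which each domain is a single continuous segment, and uses hypothetical function names throughout.

```python
import math


def log_size_density(size, mean=100.0, sd=50.0):
    """Log-density of a domain size under a placeholder Gaussian.

    ASSUMPTION: the real domain size distribution is empirical; the
    Gaussian and its parameters here are purely illustrative.
    """
    z = (size - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))


def rank_two_domain_splits(length, min_size=30):
    """Enumerate every single cut that splits a sequence of `length`
    residues into two continuous domains, and return a list of
    (cut_position, relative_likelihood) pairs, most likely first.

    The relative likelihoods are normalized over the enumerated cuts.
    """
    cuts = list(range(min_size, length - min_size + 1))
    if not cuts:
        return []  # sequence too short for two domains of min_size

    # Score each cut by the sizes of the two domains it implies.
    scores = [log_size_density(c) + log_size_density(length - c) for c in cuts]

    # Normalize in a numerically stable way (subtract the maximum first).
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)

    ranked = [(c, w / total) for c, w in zip(cuts, weights)]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Show the five most likely boundary guesses for a 200-residue sequence.
    for cut, prob in rank_two_domain_splits(200)[:5]:
        print(cut, round(prob, 4))
```

With a symmetric placeholder distribution like this one, the best two-domain cut always falls at the midpoint of the sequence. A model in the spirit of the abstract would replace the Gaussian with the observed size distribution and would also weigh the number of domains and the number of segments per domain.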
