Scaling features of noncoding DNA

Stanley, H. E.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C.-K.; Simons, M.

Published in

Physica A: Statistical Mechanics and its Applications (Elsevier), vol. 273, no. 1-2, pp. 1-18, 1999

DOI: 10.1016/s0378-4371(99)00407-0

Journal article published in 1999 by H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C.-K. Peng, M. Simons

Abstract

We review evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene, and utilize this fact to build a Coding Sequence Finder Algorithm, which uses statistical ideas to locate the coding regions of an unknown DNA sequence. Finally, we describe briefly some recent work adapting to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and reporting that noncoding regions in eukaryotes display a larger redundancy than coding regions. Specifically, we consider the possibility that this result is solely a consequence of nucleotide concentration differences as first noted by Bonhoeffer and his collaborators. We find that cytosine-guanine (CG) concentration does have a strong "background" effect on redundancy. However, we find that for the purine-pyrimidine binary mapping rule, which is not affected by the difference in CG concentration, the Shannon redundancy for the set of analyzed sequences is larger for noncoding regions compared to coding regions.
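
As an illustration of the last point, the following is a minimal sketch of how a Shannon redundancy could be estimated under a purine-pyrimidine binary mapping. It is not the authors' implementation: the abstract does not give the estimator, so this sketch assumes a finite block length n, overlapping n-tuples, and the definition R(n) = 1 - H(n) / (n log2 k) for an alphabet of k symbols. The function names and the choice of 1 for purines (A, G) and 0 for pyrimidines (C, T) are hypothetical.

```python
import math
from collections import Counter

PURINES = set("AG")
PYRIMIDINES = set("CT")


def purine_pyrimidine_map(seq):
    """Map A/G to 1 and C/T to 0; any other symbol (e.g. N) is skipped.

    The 1/0 assignment is an arbitrary choice made for this sketch.
    """
    mapped = []
    for base in seq.upper():
        if base in PURINES:
            mapped.append(1)
        elif base in PYRIMIDINES:
            mapped.append(0)
    return mapped


def block_entropy(symbols, n):
    """Shannon entropy, in bits, of the overlapping n-tuples in `symbols`."""
    if n < 1 or len(symbols) < n:
        raise ValueError("need n >= 1 and at least n symbols")
    counts = Counter(
        tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)
    )
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(symbols, n, alphabet_size=2):
    """Assumed finite-n redundancy: 1 - H(n) / (n * log2(alphabet_size))."""
    max_entropy = n * math.log2(alphabet_size)
    return 1.0 - block_entropy(symbols, n) / max_entropy
```

Because the mapping collapses C and G into different classes only by purine or pyrimidine type, the binary sequence has the same composition regardless of CG concentration under Chargaff-like balance, which is why the abstract singles out this mapping. A meaningful comparison between coding and noncoding regions would need long sequences and a correction for finite-sample bias in the entropy estimate, neither of which this sketch attempts.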
