Genome-wide evolutionary analysis of the noncoding RNA genes and noncoding DNA of Paramecium tetraurelia

Chen, Chun-Long; Zhou, Hui; Liao, Jian-You; Qu, Liang-Hu; Amar, Laurence

Published in

Cold Spring Harbor Laboratory Press, RNA, 15(4), p. 503-514, 2009

DOI: 10.1261/rna.1306009

Journal article published in 2009 by Chun-Long Chen, Hui Zhou, Jian-You Liao, Liang-Hu Qu, Laurence Amar

This paper is made freely available by the publisher.

Abstract

The compact genome of the unicellular eukaryote Paramecium tetraurelia contains noncoding DNA (ncDNA) distributed among >39,000 intergenic sequences and >90,000 introns, which average 390 base pairs (bp) and 25 bp, respectively. Here we analyzed the molecular features of the ncRNA genes, introns, and intergenic sequences of this genome. We relied mainly on computational programs and on comparative genomics, which is possible because the P. tetraurelia genome was shaped by whole-genome duplications (WGDs). We characterized 417 putative 5S rRNA, snRNA, snoRNA, SRP RNA, and tRNA genes, 415 of which map within intergenic sequences and two within introns. The evolution of these ncRNA genes appears to have mainly involved purifying selection and gene deletion. We then compared the introns that interrupt the protein-coding gene duplicates that arose from the recent WGD and identified a population of a few thousand introns that have evolved under the most stringent constraints (>95% identity). We also showed that low nucleotide substitution levels characterize the 50 bp flanking the stop codons and the 80–115 bp flanking the start codons of the protein-coding genes. Substitution levels are lower still in the base pairs flanking highly transcribed genes, and in those flanking the start codons of genes belonging to sets with a high number of WGD-related sequences. Finally, adjacent to protein-coding genes, we characterized 32 DNA motifs able to encode stable and evolutionarily conserved RNA secondary structures, which define putative expression-controlling elements. Fourteen DNA motifs with similar properties map far from protein-coding genes and may encode regulatory ncRNAs.
