BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles

Videm, Pavankumar; Rose, Dominic; Costa, Fabrizio; Backofen, Rolf

Published in

Oxford University Press (OUP), Bioinformatics, 30(12), p. i274-i282

DOI: 10.1093/bioinformatics/btu270

Journal article published in 2014 by Pavankumar Videm, Dominic Rose, Fabrizio Costa, Rolf Backofen

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving forbidden

Abstract

Summary: Non-coding RNAs (ncRNAs) play a vital role in many cellular processes such as RNA splicing, translation and gene regulation. However, the vast majority of ncRNAs still have no functional annotation. One prominent approach for putative function assignment is clustering of transcripts according to sequence and secondary structure. However, sequence information is changed by post-transcriptional modifications, and secondary structure is only a proxy for the true 3D conformation of the RNA polymer. A different type of information that does not suffer from these issues, and that can be used for the detection of RNA classes, is the pattern of processing and its traces in small RNA-seq read data. Here we introduce BlockClust, an efficient approach to detect transcripts with similar processing patterns. We propose a novel way to encode expression profiles in compact discrete structures, which can then be processed using fast graph-kernel techniques. We perform unsupervised clustering and develop family-specific discriminative models; finally, we show that the proposed approach is scalable, accurate and robust across different organisms, tissues and cell lines.
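
To make the idea of encoding an expression profile as a discrete structure more concrete, here is a minimal Python sketch. It is not BlockClust's actual encoding or kernel: the block representation `(start, end, read_count)`, the bin thresholds, the label scheme and all function names are assumptions made only for illustration. Each profile becomes a chain of discretely labeled blocks, and two profiles are compared by counting shared label paths, which is a very simple stand-in for a graph kernel.

```python
from collections import Counter
from math import sqrt


def discretize(value, thresholds):
    """Return the index of the first threshold that value does not exceed."""
    for i, t in enumerate(thresholds):
        if value <= t:
            return i
    return len(thresholds)


def encode_profile(blocks, len_bins=(18, 24), expr_bins=(0.2, 0.6)):
    """Map (start, end, read_count) blocks to discrete labels, ordered by start.

    The bins are arbitrary illustrative values, not taken from the paper.
    """
    total = sum(count for _, _, count in blocks) or 1
    labels = []
    for start, end, count in sorted(blocks):
        length_label = discretize(end - start, len_bins)
        expr_label = discretize(count / total, expr_bins)
        labels.append(f"L{length_label}E{expr_label}")
    return labels


def path_features(labels, max_len=2):
    """Count contiguous label paths of up to max_len nodes along the chain."""
    feats = Counter()
    for k in range(1, max_len + 1):
        for i in range(len(labels) - k + 1):
            feats[tuple(labels[i:i + k])] += 1
    return feats


def kernel(a, b):
    """Cosine-normalized dot product between two path-count feature maps."""
    dot = sum(v * b[k] for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy profiles: a and b share a two-block pattern, c does not.
profile_a = [(0, 22, 85), (30, 52, 15)]
profile_b = [(5, 27, 90), (36, 57, 10)]
profile_c = [(0, 30, 10), (10, 28, 10), (40, 70, 10)]

feats_a = path_features(encode_profile(profile_a))
feats_b = path_features(encode_profile(profile_b))
feats_c = path_features(encode_profile(profile_c))

sim_ab = kernel(feats_a, feats_b)
sim_ac = kernel(feats_a, feats_c)
```

In this toy setting the similarity between `profile_a` and `profile_b` is higher than between `profile_a` and `profile_c`, because the first two discretize to the same label chain. A pairwise similarity matrix built this way could be passed to any standard clustering or classification method; which methods the authors use is not stated in the abstract.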
