STEME: A Robust, Accurate Motif Finder for Large Data Sets

Reid, John E.; Wernisch, Lorenz

Published in

Public Library of Science, PLoS ONE, 9(3), p. e90735, 2014

DOI: 10.1371/journal.pone.0090735

Journal article published in 2014 by John E. Reid, Lorenz Wernisch

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

Motif finding is a difficult problem that has been studied for over 20 years. Some older popular motif finders are not suitable for analysis of the large data sets generated by next-generation sequencing. We recently published an efficient approximation (STEME) to the EM algorithm that is at the core of many motif finders such as MEME. This approximation allows the EM algorithm to be applied to large data sets. In this work we describe several efficient extensions to STEME that are based on the MEME algorithm. Together with the original STEME EM approximation, these extensions make STEME a fully-fledged motif finder with similar properties to MEME. We discuss the difficulty of objectively comparing motif finders. We show that STEME performs comparably to existing prominent discriminative motif finders, DREME and Trawler, on 13 sets of transcription factor binding data in mouse ES cells. We demonstrate the ability of STEME to find long degenerate motifs which these discriminative motif finders do not find. As part of our method, we extend an earlier method due to Nagarajan et al. for the efficient calculation of motif E-values. STEME's source code is available under an open source license and STEME is available via a web interface.
