Clustering Millions of Tandem Mass Spectra

Frank, Ari M.; Bandeira, Nuno; Shen, Zhouxin; Tanner, Stephen; Briggs, Steven P.; Smith, Richard D.; Pevzner, Pavel A.

Published in

American Chemical Society, Journal of Proteome Research, 1(7), p. 113-122, 2007

DOI: 10.1021/pr070361e




Abstract

Tandem mass spectrometry (MS/MS) experiments often generate redundant datasets containing multiple spectra of the same peptides. Clustering of MS/MS spectra takes advantage of this redundancy by identifying multiple spectra of the same peptide and replacing them with a single representative spectrum. Analyzing only the representative spectra results in a significant speed-up of MS/MS database searches. We present an efficient clustering approach for analyzing large MS/MS datasets (over ten million spectra) that can reduce the number of spectra submitted for further analysis by an order of magnitude. An MS/MS database search of clustered spectra results in fewer spurious hits to the database and increases the number of peptide identifications compared to regular non-clustered searches. Our open-source software, MS-Clustering, is available for download at http://peptide.ucsd.edu or can be run online at http://proteomics.bioprojects.org/MassSpec.
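
The idea of replacing redundant spectra with one representative can be illustrated with a small sketch. This is not the MS-Clustering algorithm itself, whose details the abstract does not give. It is a generic greedy scheme under stated assumptions: spectra are binned into 1 Da bins with square-root intensity scaling, similarity is the cosine of the binned vectors, candidates must agree in precursor m/z within a tolerance, and the representative is the renormalized sum of the members. All function names, the tolerance `mz_tol`, and the threshold `min_sim` are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict of bin -> value) to unit length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {b: v / norm for b, v in vec.items()} if norm else {}


def bin_spectrum(peaks, bin_width=1.0):
    """Convert (m/z, intensity) peaks into a unit-length sparse vector.

    Assumption: fixed-width bins and square-root intensity scaling.
    """
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += math.sqrt(intensity)
    return normalize(vec)


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def cluster_spectra(spectra, mz_tol=2.0, min_sim=0.7):
    """Greedily assign each spectrum to the most similar existing cluster.

    spectra: list of (precursor_mz, peaks) tuples.
    Returns a list of dicts with keys "mz" (mean precursor m/z),
    "members" (input indices) and "representative" (consensus vector).
    """
    clusters = []
    for idx, (precursor_mz, peaks) in enumerate(spectra):
        vec = bin_spectrum(peaks)
        best, best_sim = None, min_sim
        for c in clusters:
            # Only compare spectra whose precursor m/z values are close.
            if abs(c["mz"] - precursor_mz) > mz_tol:
                continue
            sim = cosine(vec, c["representative"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"mz": precursor_mz, "members": [idx],
                             "sum": dict(vec), "representative": vec})
        else:
            n = len(best["members"])
            best["mz"] = (best["mz"] * n + precursor_mz) / (n + 1)
            best["members"].append(idx)
            for b, v in vec.items():
                best["sum"][b] = best["sum"].get(b, 0.0) + v
            # Consensus spectrum: renormalized sum of member vectors.
            best["representative"] = normalize(best["sum"])
    return clusters


if __name__ == "__main__":
    demo = [
        (500.0, [(100.0, 10), (200.0, 20), (300.0, 5)]),
        (500.3, [(100.1, 12), (200.05, 18), (300.2, 6)]),  # near-duplicate
        (500.1, [(150.0, 10), (250.0, 20), (350.0, 5)]),   # different peaks
    ]
    for c in cluster_spectra(demo):
        print(round(c["mz"], 2), c["members"])
```

In this sketch only the representative of each cluster would be passed on to a database search, which is where the reduction in search time described in the abstract comes from. A dataset of millions of spectra would need indexing by precursor m/z rather than the linear scan over clusters used here.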
