Elsevier, Molecular and Cellular Proteomics, 1(8), p. 53-69, 2009
DOI: 10.1074/mcp.m800103-mcp200
Full text: Download
Database search tools identify peptides by matching tandem mass spectra against a protein database. We study an alternative approach when all plausible de novo interpretations of a spectrum (spectral dictionary) are generated and then quickly matched against the database. We present a new MS-Dictionary algorithm for efficiently generating spectral dictionaries and demonstrate that MS-Dictionary can identify spectra that are missed in the database search. We argue that MS-Dictionary enables proteogenomics searches in six-frame translation of genomic sequences that may be prohibitively time-consuming for existing database search approaches. We show that such searches allow one to correct sequencing errors and find programmed frameshifts.