Dissemin is shutting down on January 1st, 2025

Published in

MDPI, Proteomes, 2(9), p. 26, 2021

DOI: 10.3390/proteomes9020026

Links

Tools

Export citation

Search in Google Scholar

A Systematic Evaluation of Semispecific Peptide Search Parameter Enables Identification of Previously Undescribed N-Terminal Peptides and Conserved Proteolytic Processing in Cancer Cell Lines

This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) has become the most commonly used technique in explorative proteomic research. A variety of open-source tools for peptide-spectrum matching have become available. Most analyses of explorative MS data are performed using conventional settings, such as fully specific enzymatic constraints. Here we evaluated the impact of the fragment mass tolerance in combination with the enzymatic constraints on the performance of three search engines. Three open-source search engines (Myrimatch, X! Tandem, and MSGF+) were evaluated concerning the suitability in semi- and unspecific searches as well as the importance of accurate fragment mass spectra in non-specific peptide searches. We then performed a semispecific reanalysis of the published NCI-60 deep proteome data applying the most suited parameters. Semi- and unspecific LC-MS/MS data analyses particularly benefit from accurate fragment mass spectra while this effect is less pronounced for conventional, fully specific peptide-spectrum matching. Search speed differed notably between the three search engines for semi- and non-specific peptide-spectrum matching. Semispecific reanalysis of NCI-60 proteome data revealed hundreds of previously undescribed N-terminal peptides, including cases of proteolytic processing or likely alternative translation start sites, some of which were ubiquitously present in all cell lines of the reanalyzed panel. Highly accurate MS2 fragment data in combination with modern open-source search algorithms enable the confident identification of semispecific peptides from large proteomic datasets. The identification of previously undescribed N-terminal peptides in published studies highlights the potential of future reanalysis and data mining in proteomic datasets.