Published in

Oxford University Press, Bioinformatics, 35(24), p. 5243-5248, 2019

DOI: 10.1093/bioinformatics/btz383

Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions

Journal article published in 2019 by Ana S. C. Silva, Robbin Bouwmeester, Lennart Martens, Sven Degroeve
This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Motivation: Post-processing tools are widely used in the proteomics community to maximize the information gained from a search engine. The most notable example is Percolator, a semi-supervised machine learning model that learns a new scoring function for a given dataset. Such tools are, however, bound to the search engine's scoring scheme, which does not always make full use of the intensity information present in a spectrum. We aim to show how Percolator can be applied in a way that maximizes the use of spectrum intensity information by leveraging another machine learning-based tool, MS2PIP, which predicts fragment ion peak intensities.

Results: We show that comparing predicted intensities to annotated experimental spectra by calculating direct similarity metrics provides enough information for a tool such as Percolator to accurately separate two classes of peptide-to-spectrum matches. This approach uses more of the information in the data than simpler intensity-based metrics, such as peak counting or summing explained intensities, while maintaining control of statistics such as the false discovery rate.

Availability and implementation: All of the code is available online at https://github.com/compomics/ms2rescore.

Supplementary information: Supplementary data are available at Bioinformatics online.
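As a rough illustration of what "direct similarity metrics" between predicted and observed fragment intensities can look like, the sketch below computes three common ones in plain Python. It is not the feature set or code used in ms2rescore. The function name `similarity_features`, the choice of metrics, and the assumption that both intensity lists are already aligned ion by ion (same fragment ions, same order) are all hypothetical and chosen only for the example.

```python
import math


def similarity_features(predicted, observed):
    """Compare two aligned fragment-intensity vectors.

    Hypothetical helper: assumes predicted[i] and observed[i] refer to the
    same fragment ion. Returns a dict of simple similarity metrics that
    could serve as rescoring features.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("intensity vectors must be non-empty and of equal length")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Pearson correlation; defined as 0.0 here when either vector is constant.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    pearson = cov / math.sqrt(var_p * var_o) if var_p > 0 and var_o > 0 else 0.0

    # Cosine similarity; defined as 0.0 here when either vector is all zeros.
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    cosine = dot / (norm_p * norm_o) if norm_p > 0 and norm_o > 0 else 0.0

    # Mean absolute difference between predicted and observed intensities.
    mean_abs_error = sum(abs(p - o) for p, o in zip(predicted, observed)) / n

    return {"pearson": pearson, "cosine": cosine, "mean_abs_error": mean_abs_error}
```

In a rescoring setup of the kind the abstract describes, values like these would be computed for each peptide-to-spectrum match and passed as feature columns to a tool such as Percolator, rather than relying only on the search engine's own score.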