iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates

Shteynberg, David; Deutsch, Eric W.; Lam, Henry; Eng, Jimmy K.; Sun, Zhi; Tasman, Natalie; Mendoza, Luis; Moritz, Robert L.; Aebersold, Ruedi; Nesvizhskii, Alexey I.

Tools

Export citation

Search in Google Scholar

iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates

Journal article published in 2011 by David Shteynberg, Eric W. Deutsch

, Henry Lam

, Jimmy K. Eng, Zhi Sun, Natalie Tasman, Luis Mendoza, Robert L. Moritz, Ruedi Aebersold, Alexey I. Nesvizhskii

This paper is available in a repository.

Full text: Download

Preprint: policy unknown

Upload

Postprint: policy unknown

Upload

Published version: policy unknown

Upload

Abstract

The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms. The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.

Links

Tools

iProphet: multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates

Abstract