Summarization vs Peptide-Based Models in Label-Free Quantitative Proteomics: Performance, Pitfalls, and Data Analysis Guidelines

Goeminne, Ludger Jan Elzue; Argentini, Andrea; Martens, Lennart; Clement, Lieven

Published in

American Chemical Society, Journal of Proteome Research, 14(6), p. 2457-2465, 2015

DOI: 10.1021/pr501223t

Journal article published in 2015 by Ludger Jan Elzue Goeminne, Andrea Argentini, Lennart Martens, Lieven Clement

Abstract

Quantitative label-free mass spectrometry is increasingly used to analyze the proteomes of complex biological samples. However, the choice of appropriate data analysis methods remains a major challenge. We therefore provide a rigorous comparison between peptide-based models and peptide summarization-based pipelines. We show that peptide-based methods outperform summarization-based pipelines in terms of sensitivity, specificity, accuracy and precision. We also demonstrate that predefined FDR cut-offs for the detection of differentially regulated proteins can become problematic when differentially expressed proteins are highly abundant in one or more samples. Care should therefore be taken when data is interpreted from samples with spiked-in internal controls, and from samples that contain a few very highly abundant proteins. We do, however, show that specific diagnostic plots can be used for assessing differentially expressed proteins and the overall quality of the obtained fold-change estimates. Finally, our study also illustrates that imputation under the missing by low abundance assumption is beneficial for detecting differential expression in low abundant proteins, but that it negatively affects moderately to highly abundant proteins. Hence, imputation strategies that are commonly implemented in standard proteomics software should be used with care.
