Published in

Nature Precedings

DOI: 10.1038/npre.2010.5010

Nature Precedings

DOI: 10.1038/npre.2010.5010.1

Towards reproducible MSMS data preprocessing, quality control and quantification

Journal article published in 2010 by Laurent Gatto, Kathryn S. Lilley
This paper is made freely available by the publisher.

Abstract

MSnbase aims to provide researchers working with labelled quantitative proteomics data with a transparent, portable, extensible and open-source collaborative framework for manipulating and analysing MS2-level raw tandem mass spectrometry data. Its implementation in R gives users and developers a wide variety of powerful tools that can be used in a controlled and reproducible way. Furthermore, MSnbase has been developed following an object-oriented programming paradigm: all information manipulated by the user is encapsulated in ad hoc data containers that hide its underlying complexity. We illustrate the usage and achievements of our software using a published spiked-in data set in which varying quantities of test proteins have been labelled with four different iTRAQ tags. In addition to providing raw MSMS data, MSnbase stores meta-data and logs processing steps in the data object itself for optimal traceability. We provide graphics showing how to inspect precursor data for quality control and how individual or merged MSMS spectra can subsequently be processed, plotted and extracted using a variety of methods. We also demonstrate how reporter ions (or any peaks of interest defined by the user) can easily be quantified and normalised using several built-in alternative strategies, and how the effect of each transformation can be recorded, examined and reproduced. MSnbase constitutes a unique, versatile working and development environment for processing labelled MSMS data and in turn provides important feedback for optimising data acquisition. We conclude by presenting future extensions of MSnbase and highlighting its use in reproducible proteomics research.
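
The idea of quantifying reporter ions, normalising them and recording each step in the data object can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the MSnbase API: the names `QuantResult`, `quantify_reporters` and `normalise_to_sum`, the ±0.05 m/z window, the "maximum intensity" rule and the approximate iTRAQ 4-plex reporter m/z values are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Approximate m/z values of the four iTRAQ 4-plex reporter ions
# (illustrative assumption, not taken from the abstract).
ITRAQ4_MZ = {"114": 114.11, "115": 115.11, "116": 116.11, "117": 117.11}


@dataclass
class QuantResult:
    """Hypothetical container: reporter intensities plus a processing log."""
    intensities: dict
    log: list = field(default_factory=list)


def quantify_reporters(peaks, reporters=ITRAQ4_MZ, width=0.05):
    """Quantify each reporter as the maximum intensity within +/- width m/z.

    `peaks` is a list of (mz, intensity) pairs from one MS2 spectrum.
    A reporter with no peak in its window is reported as 0.0.
    """
    intensities = {}
    for name, mz in reporters.items():
        window = [inten for m, inten in peaks if abs(m - mz) <= width]
        intensities[name] = max(window) if window else 0.0
    entry = f"quantified {len(reporters)} reporters (max, width={width})"
    return QuantResult(intensities, [entry])


def normalise_to_sum(result):
    """Scale reporter intensities so that they sum to 1, and log the step."""
    total = sum(result.intensities.values())
    if total == 0:
        raise ValueError("cannot normalise: all reporter intensities are zero")
    scaled = {name: value / total for name, value in result.intensities.items()}
    return QuantResult(scaled, result.log + ["normalised to unit sum"])


# Toy spectrum: four reporter peaks, one shoulder peak and one unrelated peak.
peaks = [(114.11, 100.0), (114.13, 50.0), (115.10, 200.0),
         (116.12, 300.0), (117.11, 400.0), (200.0, 999.0)]
raw = quantify_reporters(peaks)
norm = normalise_to_sum(raw)
```

Because each function returns a new object carrying the accumulated log, the sequence of transformations applied to the data can be inspected and replayed later, which is the traceability property the abstract describes.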