Published in

American Chemical Society, Analytical Chemistry, 5(86), p. 2320-2325, 2014

DOI: 10.1021/ac403702p

Peak Aggregation as an Innovative Strategy for Improving the Predictive Power of LC-MS Metabolomic Profiles

This paper is available in a repository.

Preprint: archiving allowed
  • Must obtain written permission from Editor
  • Must not violate ACS ethical Guidelines
Postprint: archiving restricted
  • Must obtain written permission from Editor
  • Must not violate ACS ethical Guidelines
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Liquid chromatography-mass spectrometry (LC-MS)-based metabolomic datasets consist of different features, including (de)protonated ions, fragments, adducts and isotopes, that may be highly correlated and therefore highly collinear. Several sources of these high correlation patterns in metabolomic datasets have been described. Among them, the high correlation between features arising from the same metabolite deserves particular attention. It is well known that soft ionisation methods (such as electrospray) produce several mass features from a single compound (i.e., the metabolite spectrum). Typically, the statistical methods used in metabolomics treat spectral peaks as variables. However, it has been reported that high collinearity between variables may be responsible for high uncertainty in the predictors of a regression. In this context, this technical note proposes a new strategy based on so-called peak aggregation methods (NMF Reduction, PCA Decomposition, Maximum Peak and Spectrum Mean) that exploits, and thereby resolves, the high collinearity among variables. A set of real samples obtained after a human nutritional intervention with placebo or polyphenol-rich beverages was used to test this methodology. The results showed that applying any peak aggregation method (especially NMF and PCA) improves the statistical power for predicting class membership, regardless of the type of classifier (linear PLS-DA or non-linear SVM). Overall, this new approach reduced the dimensionality of the data and, in addition, significantly increased its overall predictive power.
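
The idea of collapsing all features of one metabolite into a single variable can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`aggregate_peaks`, `_rank1_nmf`), the assumption that each feature already carries a metabolite label, the reading of "Maximum Peak" as the feature with the highest mean intensity, and the use of a rank-1 factorisation for the NMF and PCA variants are all assumptions made here for illustration.

```python
import numpy as np


def _rank1_nmf(block, n_iter=50, eps=1e-12):
    """Rank-1 non-negative factorisation block ~ outer(w, h) by alternating updates.

    Assumes `block` holds non-negative intensities. For rank 1, the standard
    multiplicative NMF updates reduce to the closed-form steps used below.
    """
    w = block.mean(axis=1) + eps              # initial per-sample abundance
    for _ in range(n_iter):
        h = block.T @ w / (w @ w + eps)       # feature loadings
        w = block @ h / (h @ h + eps)         # per-sample abundance
    return w * np.linalg.norm(h)              # fix the scale ambiguity


def aggregate_peaks(X, groups, method="mean"):
    """Collapse the features of each metabolite into one variable.

    X      : (n_samples, n_features) intensity matrix
    groups : one metabolite label per feature (hypothetical annotation)
    method : "mean", "max", "pca" or "nmf"
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = sorted(set(groups.tolist()))
    out = np.empty((X.shape[0], len(labels)))
    for j, lab in enumerate(labels):
        block = X[:, groups == lab]
        if method == "mean":
            # Spectrum Mean: average intensity over the metabolite's features
            out[:, j] = block.mean(axis=1)
        elif method == "max":
            # Maximum Peak (assumed meaning): keep the most intense feature
            out[:, j] = block[:, block.mean(axis=0).argmax()]
        elif method == "pca":
            # PCA: score on the first principal component of the block
            centered = block - block.mean(axis=0)
            u, s, _ = np.linalg.svd(centered, full_matrices=False)
            score = u[:, 0] * s[0]
            # SVD signs are arbitrary; orient the score along the block sum
            if score @ centered.sum(axis=1) < 0:
                score = -score
            out[:, j] = score
        elif method == "nmf":
            out[:, j] = _rank1_nmf(block)
        else:
            raise ValueError(f"unknown method: {method!r}")
    return out, labels


# Toy data: 4 samples, 5 features belonging to 2 hypothetical metabolites
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([4.0, 1.0, 3.0, 2.0])
X_demo = np.column_stack([a, 2 * a, 0.5 * a, b, 3 * b])
groups_demo = ["A", "A", "A", "B", "B"]
X_mean, metabolites = aggregate_peaks(X_demo, groups_demo, method="mean")
```

The aggregated matrix has one column per metabolite instead of one per peak, so it can be passed to any classifier (for example PLS-DA or an SVM) in place of the original feature matrix.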