Sparse multi-block PLSR for biomarker discovery when integrating data from LC–MS and NMR metabolomics

Karaman, İbrahim; Nørskov, Natalja P.; Yde, Christian Clement; Hedemann, Mette Skou; Bach Knudsen, Knud Erik; Kohler, Achim

Published in

Springer Verlag, Metabolomics, 2(11), p. 367-379

DOI: 10.1007/s11306-014-0698-y

Journal article published in 2014 by İbrahim Karaman, Natalja P. Nørskov, Christian Clement Yde, Mette Skou Hedemann, Knud Erik Bach Knudsen, Achim Kohler

Abstract

The objective of this study was to implement a multivariate method that simultaneously analyzes multi-block metabolomics data and performs variable selection in order to discover potential biomarkers. We call this method sparse multi-block partial least squares regression (Sparse MBPLSR). To develop it, we first defined a nonlinear iterative partial least squares (NIPALS) algorithm for Sparse PLSR and then extended it to Sparse MBPLSR. Since over-fitting is an issue when variable selection is involved, we implemented cross model validation (CMV) to assess the reliability and stability of the selected variables. The performance of the method was evaluated using a simulated data set and a multi-block data set from a dietary intervention study with pigs used as a model for humans. The objective of that study was to investigate the biochemical effects in plasma after dietary intervention with breads varying in types of dietary fiber and to identify potential biomarkers. By introducing Sparse MBPLSR, we aimed to identify biomarkers by analyzing data from LC–MS and NMR instruments simultaneously and, in addition, to explore the relationships among the measurement variables of this multi-block data set. The results showed that Sparse MBPLSR with CMV is a useful tool for analyzing multi-block metabolomics data with good prediction and for identifying potential biomarkers.
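
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: one component of a multi-block PLS model for a single response, with sparsity imposed on each block's weight vector. Soft-thresholding as the sparsity mechanism, the `sparsity` tuning parameter, and the function names are assumptions made for illustration, not the authors' implementation. Cross model validation is not shown.

```python
import numpy as np

def soft_threshold(w, lam):
    """Shrink weights toward zero; entries with |w| <= lam become exactly 0."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def sparse_mbpls_component(blocks, y, sparsity=0.5):
    """One component of a sparse multi-block PLS sketch (single response).

    blocks   : list of (n_samples, p_b) arrays, assumed column-centered
    y        : (n_samples,) array, assumed centered
    sparsity : hypothetical tuning knob in [0, 1); the threshold for each
               block is sparsity * max|w_b| (an assumption, not from the paper)
    """
    block_weights, block_scores = [], []
    for Xb in blocks:
        w = Xb.T @ y                                  # covariance of each variable with y
        w = soft_threshold(w, sparsity * np.abs(w).max())
        w = w / np.linalg.norm(w)                     # unit-length sparse block weights
        block_weights.append(w)
        block_scores.append(Xb @ w)                   # block score t_b
    T = np.column_stack(block_scores)                 # one column of scores per block
    w_super = T.T @ y                                 # how strongly each block relates to y
    w_super = w_super / np.linalg.norm(w_super)
    t = T @ w_super                                   # super score combining all blocks
    q = (y @ t) / (t @ t)                             # regression of y on the super score
    return block_weights, w_super, t, q
```

In this sketch, variables whose block weight is exactly zero are the ones dropped from the component; further components would require deflating the blocks and repeating the step. The selected variables should be checked with a validation scheme such as the CMV described in the abstract before being read as candidate biomarkers.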
