Published in

Elsevier, Chemometrics and Intelligent Laboratory Systems, (122), p. 65-77, 2013

DOI: 10.1016/j.chemolab.2012.12.005

Comparison of Sparse and Jack-knife partial least squares regression methods for variable selection

This paper is available in a repository.

Abstract

The objective of this study was to compare two variable selection techniques, Sparse PLSR and Jack-knife PLSR, with respect to their predictive ability and their ability to identify relevant variables. Sparse PLSR is frequently used in genomics, whereas Jack-knife PLSR is often used by chemometricians. To evaluate the predictive ability of both methods, cross model validation was implemented. The performance of both methods was assessed using FTIR spectroscopic data, on the one hand, and a set of simulated data, on the other. The stability of the variable selection procedures was highlighted by the frequency with which each variable was selected across the cross model validation segments. Computationally, Jack-knife PLSR was much faster than Sparse PLSR. While both methods were found to have more or less the same predictive ability, Sparse PLSR turned out to be generally very stable in selecting the relevant variables, whereas Jack-knife PLSR was very prone to also selecting uninformative variables. To remedy this drawback, a strategy of analysis that consists of adding a perturbation parameter to the uncertainty variances obtained by means of Jack-knife PLSR is demonstrated.
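
The remedy described in the last sentence can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the function names, the NIPALS-style PLS1 fit, the (M-1)/M scaling of the jack-knife variance, the fixed cut-off `t_crit` used in place of a formal significance test, and the reading of the perturbation parameter as a constant `delta` added to each variance before the t-like ratio is formed.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Coefficients of a single-response PLS model (NIPALS), on centered data."""
    Xk = X - X.mean(axis=0)
    yc = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc                 # weight vector: covariance of X with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        W.append(w)
        P.append(p)
        q.append(yc @ t / tt)         # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def jackknife_select(X, y, n_components=2, n_segments=10, delta=0.0, t_crit=2.0):
    """Select variables whose jack-knife t-like ratio exceeds t_crit.

    delta is the (assumed additive) perturbation of the uncertainty variances;
    t_crit is a hypothetical fixed threshold, not a value from the paper.
    """
    b_full = pls1_coefficients(X, y, n_components)
    n = X.shape[0]
    sq_dev = np.zeros_like(b_full)
    for m in range(n_segments):
        keep = np.ones(n, dtype=bool)
        keep[m::n_segments] = False   # leave out every n_segments-th sample
        b_m = pls1_coefficients(X[keep], y[keep], n_components)
        sq_dev += (b_full - b_m) ** 2
    variance = sq_dev * (n_segments - 1) / n_segments
    t_values = b_full / np.sqrt(variance + delta)
    return np.flatnonzero(np.abs(t_values) > t_crit), t_values


# Simulated example: 3 relevant variables out of 20 (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = 2.0 * X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=60)
selected_plain, _ = jackknife_select(X, y, delta=0.0)
selected_perturbed, _ = jackknife_select(X, y, delta=0.05)
```

Under these assumptions, a positive `delta` can only shrink the absolute t-like ratios. Variables with small coefficients and very small jack-knife variances, which would otherwise pass the threshold, are therefore the first to be dropped, and the perturbed selection is always a subset of the unperturbed one.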