Taxon selection using statistical learning techniques to improve transfer function prediction

Juggins, Steve; Simpson, Gavin L.; Telford, Richard J.

Published in

SAGE Publications, Holocene, 1(25), p. 130-136, 2014

DOI: 10.1177/0959683614556388

Journal article published in 2014 by Steve Juggins, Gavin L. Simpson, Richard J. Telford

Abstract

Transfer functions are widely used in palaeoecology to provide quantitative environmental reconstructions using biological proxies. Most models use all but the rarest taxa present in the training set, even though many may be unrelated to the environmental variable of interest. We hypothesise that retaining such non-informative taxa reduces model robustness, and we present a method for variable selection motivated by the statistical learning algorithm in random forests. We apply our species-pruning algorithm to weighted averaging (WA) and maximum likelihood calibration of response curves (MLRCs), and compare the results with those of boosted regression trees (BRTs) using artificial and real datasets. Results from the artificial data show that WA is particularly sensitive to the influence of both non-informative taxa and secondary environmental variables in the training set or fossil assemblage, and that BRTs are relatively immune to these effects. Furthermore, species-pruned WA and MLRCs offer substantial improvements over all-species models when the training set includes non-informative taxa, but they do not guard against confounding effects when species have bi- or multivariate responses to the primary and one or more secondary variables. Tests with a limited set of examples of real data indicate that BRTs, MLRCs and species-pruned models have no apparent advantage over WA. We discuss possible reasons for this contradiction and suggest that more tests are needed to properly evaluate BRTs and species-pruned models.
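As a rough illustration of why a non-informative taxon can degrade a WA reconstruction, the sketch below implements textbook weighted averaging without a deshrinking step: each taxon's optimum is the abundance-weighted mean of the environmental variable, and a sample's inferred value is the abundance-weighted mean of the optima. This is not the authors' species-pruning algorithm. The function names `wa_fit` and `wa_predict`, the toy data and the hand-chosen `keep` mask are all assumptions made for this example.

```python
import numpy as np

def wa_fit(Y, x):
    """Estimate taxon optima as abundance-weighted means of x.

    Y: (n_sites, n_taxa) abundances; x: (n_sites,) environmental values.
    Assumes every taxon occurs in at least one training site.
    """
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    return (Y * x[:, None]).sum(axis=0) / Y.sum(axis=0)

def wa_predict(Y, optima):
    """Infer x for each sample as the abundance-weighted mean of the optima.

    Assumes every sample contains at least one of the taxa (no deshrinking).
    """
    Y = np.asarray(Y, dtype=float)
    return (Y @ optima) / Y.sum(axis=1)

# Toy training set: taxa A, B, C each track the gradient;
# taxon D occurs equally everywhere and carries no information.
x = np.array([1.0, 2.0, 3.0])
Y = np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
])

optima = wa_fit(Y, x)
pred_all = wa_predict(Y, optima)

# Hypothetical pruning decision: drop taxon D (chosen by hand here,
# not by the selection procedure described in the paper).
keep = np.array([True, True, True, False])
pred_pruned = wa_predict(Y[:, keep], optima[keep])

print("all taxa:", pred_all)
print("pruned:  ", pred_pruned)
```

In this toy case the ubiquitous taxon D has an optimum at the gradient mean, so including it pulls every inferred value toward the centre, while the pruned model recovers the training values. Real training sets are noisier, and a selection step would normally be assessed under cross-validation rather than on apparent error.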
