Published in

Wiley Open Access, Ecography, 12(36), p. 1291-1298, 2013

DOI: 10.1111/j.1600-0587.2013.00127.x

Community-level vs species-specific approaches to model selection

Journal article published in 2013 by Bénédicte Madon, David I. Warton, Miguel B. Araújo
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

A topic of particular current interest is community-level approaches to species distribution modelling (SDM), i.e. approaches that simultaneously analyse distributional data for multiple species. Previous studies have looked at the advantages of community-level approaches for parameter estimation, but not for model selection – the process of choosing which model (and in particular, which subset of environmental variables) to fit to data. We compared the predictive performance of models using the same modelling method (generalised linear models) but choosing the subset of variables to include in the model either simultaneously across all species (community-level model selection) or separately for each species (species-specific model selection). Our results across two large presence/absence tree community datasets were inconclusive as to whether there was an overall difference in predictive performance between models fitted via species-specific vs community-level model selection. However, we found some evidence that a community approach was best suited to modelling rare species, and its performance decayed with increasing prevalence. That is, when data were sparse there was more opportunity for gains from ‘borrowing strength’ across species via a community-level approach. Interestingly, we also found that the community-level approach tended to work better when the model selection problem was more difficult, and more reliably detected ‘noise’ variables that should be excluded from the model.
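
The distinction between the two selection strategies can be illustrated with a minimal sketch. It assumes that each candidate subset of environmental variables has already been scored for each species (lower is better, as with an information criterion), and that the community-level choice is the subset with the lowest score summed over species. The abstract does not state the selection criterion used in the paper, so the summing rule, the function names, the species names, and every score below are hypothetical.

```python
from typing import Dict, FrozenSet

Subset = FrozenSet[str]
# species -> candidate variable subset -> score (lower is better).
Scores = Dict[str, Dict[Subset, float]]


def species_specific_selection(scores: Scores) -> Dict[str, Subset]:
    """Pick the best-scoring variable subset separately for each species."""
    return {
        species: min(by_subset, key=lambda s: (by_subset[s], sorted(s)))
        for species, by_subset in scores.items()
    }


def community_level_selection(scores: Scores) -> Subset:
    """Pick one subset for all species by minimising the summed score.

    Summing scores across species is an assumption made for illustration,
    not necessarily the criterion used in the paper.
    """
    # Only subsets that were scored for every species can be compared.
    shared = set.intersection(*(set(by_subset) for by_subset in scores.values()))
    return min(
        shared,
        key=lambda s: (sum(scores[sp][s] for sp in scores), sorted(s)),
    )


# Hypothetical scores for three tree species and two candidate subsets.
TEMP = frozenset({"temp"})
TEMP_RAIN = frozenset({"temp", "rain"})
example_scores: Scores = {
    "common_oak": {TEMP: 210.0, TEMP_RAIN: 200.0},
    "rare_elm": {TEMP: 41.0, TEMP_RAIN: 40.5},
    "rare_yew": {TEMP: 30.0, TEMP_RAIN: 33.0},
}

per_species = species_specific_selection(example_scores)
community = community_level_selection(example_scores)
```

In this toy input the species-specific choice differs between species, while the community-level choice imposes a single subset on all of them. The rare species contribute little to the summed score, so the common species dominates the shared choice; this is one simple way to picture how sparse data can 'borrow strength' from better-sampled species.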