Published in

Wiley, Ecology, 2(100), p. e02568, 2018

DOI: 10.1002/ecy.2568

Zenodo, 2018

DOI: 10.5281/zenodo.1256704

Zenodo, 2018

DOI: 10.5281/zenodo.1256705

Comparison of large-scale citizen science data and long-term study data for phenology modeling

This paper is made freely available by the publisher.

Preprint: archiving forbidden
Postprint: archiving forbidden
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Codebase and data files for this study.
GitHub repo: https://github.com/sdtaylor/phenology_dataset_study
Manuscript preprint: https://doi.org/10.1101/335802

Large-scale observational data from citizen science efforts are becoming increasingly common in ecology, and researchers often choose between these and data from intensive local-scale studies for their analyses. This choice has potential trade-offs related to spatial scale, observer variance, and inter-annual variability. Here we explored this issue with phenology by comparing models built using data from the large-scale, citizen science National Phenology Network (NPN) effort with models built using data from more intensive studies at Long Term Ecological Research (LTER) sites. We built process-based phenology models for species common to each dataset. From these models we compared parameter estimates, estimates of phenological events, and out-of-sample errors between models derived from both NPN and LTER data. We found that model parameter estimates for the same species were most similar between the two datasets when using simple models, but parameter estimates varied widely as model complexity increased. Despite this, estimates for the date of phenological events and out-of-sample errors were similar, regardless of the model chosen. Predictions for NPN data had the lowest error when using models built from the NPN data, while LTER predictions were best made using LTER-derived models, confirming that models perform best when applied at the same scale at which they were built. Accordingly, the choice of dataset depends on the research question. Inferences about species-specific phenological requirements are best made with LTER data, and if NPN or similar data are all that is available, then analyses should be limited to simple models. Large-scale predictive modeling is best done with the larger-scale NPN data, which have high spatial representation and a large regional species pool. LTER datasets, on the other hand, have high site fidelity and thus characterize inter-annual variability extremely well. Future research aimed at forecasting phenology events for particular species over larger scales should develop models that integrate the strengths of both datasets.
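
To make the terms "process-based phenology model" and "out-of-sample error" concrete, the sketch below shows one common simple form of such a model, a thermal-time (growing degree-day) model, together with a root-mean-square error calculation. It is an illustration only and is not taken from the study's codebase: the abstract does not say which models were used, and the function names and the parameters `t_base`, `f_star`, and `start_day` are assumptions chosen for this example.

```python
import math


def gdd_event_day(daily_mean_temps, t_base, f_star, start_day=0):
    """Predict the day index of a phenological event with a thermal-time model.

    Forcing accumulates as the daily mean temperature in excess of t_base,
    starting at start_day. The event is predicted on the first day the
    accumulated forcing reaches f_star. Returns None if it is never reached.
    All parameter names are hypothetical and used for illustration.
    """
    forcing = 0.0
    for day, temp in enumerate(daily_mean_temps):
        if day < start_day:
            continue
        forcing += max(temp - t_base, 0.0)
        if forcing >= f_star:
            return day
    return None


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted event days."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy temperature series (degrees C), not real data.
temps = [0, 2, 6, 8, 10, 12]
predicted_day = gdd_event_day(temps, t_base=5, f_star=10)
```

In a comparison like the one described above, the parameters would be fitted separately to each dataset, and the error would be computed on observations withheld from fitting, both within the same dataset and across datasets.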