Published in

Elsevier, Geoderma, (237-238), p. 237-245, 2015

DOI: 10.1016/j.geoderma.2014.09.006

Comparing data mining and deterministic pedology to assess the frequency of WRB reference soil groups in the legend of small scale maps

This paper is available in a repository.

Abstract

The assessment of class frequency in soil map legends is affected by uncertainty, especially at small scales, where generalization is greater. The aim of this study was to test the hypothesis that data mining techniques estimate class frequency better than traditional deterministic pedology in a national soil map. In the 1:5,000,000 map of Italian soil regions, the soil classes are the WRB reference soil groups (RSGs). Several data mining techniques, namely neural networks, random forests, boosted trees, classification and regression trees, and support vector machines (SVM), were tested; the SVM gave the best RSG predictions using selected auxiliary variables and 22,015 classified soil profiles. The five most frequent RSGs resulting from the two approaches were compared. The outcomes were validated with a Bayesian approach applied to a subset of 10% of geographically representative profiles, which were held out before data processing. The validation provided the values of both positive and negative prediction abilities. The two methods predicted the same most frequent classes but differed in their predictions of the other classes. The Bayesian validation indicated that the SVM method was more reliable than the deterministic pedological approach and that both approaches were more confident in predicting the absence than the presence of a soil type.
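
As an illustration of the positive and negative prediction abilities mentioned in the abstract, the following minimal Python sketch computes one-vs-rest positive and negative predictive values for a single RSG on a held-out set. It is not the authors' procedure, which the abstract does not detail: the function name `predictive_values` and the eight sample profiles are hypothetical, and the values are simple conditional frequencies (the share of "present" predictions that were correct, and the share of "absent" predictions that were correct).

```python
def predictive_values(observed, predicted, target):
    """Return (PPV, NPV) for one class, treating it as present/absent.

    PPV: fraction of profiles predicted as `target` that really are `target`.
    NPV: fraction of profiles predicted as not `target` that really are not.
    Returns None for a value whose denominator is zero.
    """
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == target:
            if obs == target:
                tp += 1  # predicted present, actually present
            else:
                fp += 1  # predicted present, actually absent
        else:
            if obs == target:
                fn += 1  # predicted absent, actually present
            else:
                tn += 1  # predicted absent, actually absent
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical held-out profiles (invented for illustration, not study data).
observed = ["Cambisols", "Cambisols", "Luvisols", "Regosols",
            "Cambisols", "Luvisols", "Regosols", "Calcisols"]
predicted = ["Cambisols", "Luvisols", "Luvisols", "Regosols",
             "Cambisols", "Cambisols", "Regosols", "Regosols"]

ppv, npv = predictive_values(observed, predicted, "Cambisols")
print(f"Cambisols: PPV={ppv:.2f}, NPV={npv:.2f}")
```

In this toy sample the negative predictive value exceeds the positive one, which is the kind of asymmetry the abstract reports: a class covering a minority of profiles is easier to rule out than to confirm.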