This publication revisits the deteriorated performance of field-calibrated low-cost sensor systems after spatial and temporal relocation, which is often reported for air quality monitoring devices that use machine learning models as part of their software to compensate for cross-sensitivities or interferences with environmental parameters. The cause of this relocation problem and its relationship to the chosen algorithm are elucidated using published experimental data in combination with techniques from data science. The origin is thereby traced back to insufficient sampling of the data used for calibration, followed by the incorporation of bias into the models. Biases often stem from non-representative data; they are a common problem in machine learning, and in artificial intelligence more generally, and as such a rising concern. Finally, bias is believed to be partly reducible in this specific application by using balanced data sets generated in well-controlled laboratory experiments, although this is not trivial because it requires infrastructure and professional competence.
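The sampling argument can be illustrated with a small synthetic sketch. It is not the method or the data of this publication: the quadratic temperature interference, the temperature ranges, and the names `interference`, `fit_line`, and `mean_abs_error` are all assumptions chosen for illustration, and a simple least-squares line stands in for the machine learning model. The correction is fitted on a narrow temperature range, as might occur in a short field calibration, and then evaluated at a warmer site.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interference(temp_c):
    # Hypothetical nonlinear temperature cross-sensitivity (an assumption,
    # not a measured sensor characteristic).
    return 0.05 * (temp_c - 20.0) ** 2


def mean_abs_error(temps, slope, intercept):
    """Mean absolute error of the fitted correction over the given temperatures."""
    errors = [abs(interference(t) - (slope * t + intercept)) for t in temps]
    return sum(errors) / len(errors)


# Assumed temperature coverage in degrees Celsius.
calibration_temps = [10.0 + 0.5 * i for i in range(21)]  # field calibration: 10 to 20
relocated_temps = [25.0 + 0.5 * i for i in range(21)]    # after relocation: 25 to 35
balanced_temps = [10.0 + 1.25 * i for i in range(21)]    # wide, evenly spaced: 10 to 35

# Correction fitted on the narrow field data versus on the wide, evenly spaced data.
narrow = fit_line(calibration_temps, [interference(t) for t in calibration_temps])
balanced = fit_line(balanced_temps, [interference(t) for t in balanced_temps])

mae_calibration_site = mean_abs_error(calibration_temps, *narrow)
mae_relocated_narrow = mean_abs_error(relocated_temps, *narrow)
mae_relocated_balanced = mean_abs_error(relocated_temps, *balanced)

print(f"narrow fit, calibration site: {mae_calibration_site:.2f}")
print(f"narrow fit, relocated site:   {mae_relocated_narrow:.2f}")
print(f"balanced fit, relocated site: {mae_relocated_balanced:.2f}")
```

In this toy setting the narrowly sampled fit looks accurate at the calibration site but degrades sharply at the relocated site, whereas the fit on evenly spaced data covering both ranges degrades less. The evenly spaced grid is only a stand-in for a balanced laboratory data set.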