Oxford University Press, PNAS Nexus, 3(3), 2024
DOI: 10.1093/pnasnexus/pgae088
Abstract: High-resolution estimates of historical pollution levels are essential for assessing the health effects of ambient air pollution in the large Indian population. The diversity of geography, weather patterns, and progressive urbanization, combined with a sparse ground monitoring network, makes it challenging to accurately capture the spatiotemporal patterns of ambient fine particulate matter (PM2.5) pollution in India. We developed a model for daily average ambient PM2.5 between 2008 and 2020 based on monitoring data, meteorology, land use, satellite observations, and emissions inventories. Daily average predictions from each component learner at each 1 km × 1 km grid cell were ensembled using a Gaussian process regression with anisotropic smoothing over spatial coordinates, and regression calibration was used to account for exposure error. In cross-validation with monitors left out, the ensemble model had an R² of 0.86 at the daily level in the validation data and outperformed each component learner (by 5–18%). Annual average levels in different zones ranged between 39.7 μg/m³ (interquartile range: 29.8–46.8) in 2008 and 30.4 μg/m³ (interquartile range: 22.7–37.2) in 2020, with a cross-validated (CV) R² of 0.94 at the annual level. Overall daily mean absolute errors (MAE) across the 13 years were between 14.4 and 25.4 μg/m³. We obtained high spatial accuracy, with spatial R² greater than 90% and spatial MAE between 7.3 and 16.5 μg/m³, and relatively better performance in urban areas at low and moderate elevation. We have developed a validated resource for studying PM2.5 at a very fine spatiotemporal resolution, which allows us to study the health effects of PM2.5 across India and to identify areas with exceedingly high levels.
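The leave-monitors-out validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, an ordinary least-squares fit stands in for the ensemble, and the names `grouped_folds` and `leave_monitors_out_cv` are hypothetical. The only point shown is that all days from a given monitor are held out together, so R² and MAE are computed at locations the model has not seen.

```python
import numpy as np

def grouped_folds(groups, n_folds, seed=0):
    """Yield (train_idx, test_idx) so each monitor is in exactly one test fold."""
    rng = np.random.default_rng(seed)
    monitors = rng.permutation(np.unique(groups))
    for held_out in np.array_split(monitors, n_folds):
        test_mask = np.isin(groups, held_out)
        yield np.where(~test_mask)[0], np.where(test_mask)[0]

def leave_monitors_out_cv(X, y, groups, n_folds=5):
    """Return out-of-fold R^2 and MAE for a least-squares stand-in model."""
    preds = np.empty_like(y, dtype=float)
    for train, test in grouped_folds(groups, n_folds):
        # Fit on the training monitors only (intercept plus linear terms).
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        # Predict for every day at the held-out monitors.
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        preds[test] = A_test @ coef
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, np.mean(np.abs(y - preds))

# Synthetic example (assumed values, not data from the study):
# 20 monitors, 50 daily observations each, 3 predictors.
rng = np.random.default_rng(42)
n_monitors, n_days = 20, 50
groups = np.repeat(np.arange(n_monitors), n_days)
X = rng.normal(size=(n_monitors * n_days, 3))
y = 40 + X @ np.array([5.0, -3.0, 2.0]) + rng.normal(scale=2.0, size=len(groups))

r2, mae = leave_monitors_out_cv(X, y, groups)
print(f"CV R^2 = {r2:.2f}, CV MAE = {mae:.2f}")
```

Splitting by monitor rather than by individual observation avoids the optimistic bias that arises when days from the same site appear in both training and validation sets.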