Prediction of daily mean and one-hour maximum PM2.5 concentrations and
applications in Central Mexico using satellite-based machine-learning
models
Abstract
Machine-learning algorithms are becoming popular techniques to predict
ambient air PM2.5 concentrations at high spatial resolution (1 × 1 km)
using satellite-based aerosol optical depth (AOD). Most machine-learning
models have aimed to predict 24-h averaged PM2.5 concentrations (mean
PM2.5). Over Mexico, none has been developed to predict subdaily peak
levels, such as the maximum daily one-hour concentration (max PM2.5). We
present a new modeling approach based on extreme gradient boosting
(XGBoost) and inverse-distance weighting that uses AOD data,
meteorology, and land-use variables to predict mean and max PM2.5 in
Central Mexico (including the Mexico City Metropolitan Area) from 2004
through 2019. Our models for mean and max PM2.5 exhibited good
performance, with overall cross-validated mean absolute errors (MAE) of
3.68 and 9.21 μg/m3, respectively, compared to mean absolute deviations
from the median (MAD) of 8.55 and 15.64 μg/m3. We also investigated
applications of our mean PM2.5 predictions that can aid local
authorities in air-quality management and public-health surveillance,
such as the co-occurrence of high PM2.5 and heat, compliance with local
air-quality standards, and the relationship of PM2.5 exposure with
social marginalization.
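
The comparison of MAE against MAD reported above can be illustrated with a minimal sketch. MAD is the error that a constant prediction equal to the median of the observations would give, so a MAE well below the MAD indicates that a model explains a substantial part of the variability. The function names and the sample values below are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
from statistics import median


def mean_absolute_error(observed, predicted):
    """Mean of |observed - predicted| over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_absolute_deviation_from_median(observed):
    """Mean of |observed - median(observed)|: the MAE of a constant median predictor."""
    if not observed:
        raise ValueError("input must be non-empty")
    med = median(observed)
    return sum(abs(o - med) for o in observed) / len(observed)


# Hypothetical PM2.5 values in μg/m3 (illustrative only, not study data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

mae = mean_absolute_error(observed, predicted)
mad = mean_absolute_deviation_from_median(observed)

# A ratio below 1 means the model beats the constant-median baseline.
ratio = mae / mad
print(f"MAE = {mae:.2f}, MAD = {mad:.2f}, MAE/MAD = {ratio:.2f}")
```

Applying the same ratio to the figures reported above gives roughly 0.43 for mean PM2.5 (3.68 / 8.55) and 0.59 for max PM2.5 (9.21 / 15.64).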