Development and application of a 1-km hourly air-temperature model for
the Northeastern and Mid-Atlantic United States using remotely sensed
and ground-based measurements
Abstract
Background: Accurate and precise estimates of ambient air temperatures
that can capture fine-scale within-day variability are necessary for
studies of air temperature and health. Method: We developed statistical
models for predicting temperature at each hour in each cell of a 927-m
square grid across the Northeast and Mid-Atlantic United States from
2003 to 2019, using observations from ~4,000 meteorological stations in
the Integrated Mesonet and inputs such as elevation, an inverse
distance-weighted interpolation of temperature, and satellite-based
vegetation and land surface temperature. We used a rigorous spatial
cross-validation scheme and spatially weighted the errors to estimate
how well model predictions would generalize to new cell-days. We assessed
the within-county association of temperature and social vulnerability
during a heat wave as an example application. Results: We found that a model
based on the XGBoost machine-learning algorithm was fast and accurate,
obtaining weighted root mean square errors (RMSEs) around 1.6 K,
compared to standard deviations around 11.0 K. We found similar accuracy
when validating our model on an external dataset from Weather
Underground. Assessing predictions from the North American Land Data
Assimilation System-2 (NLDAS-2), another hourly model, in the same way,
we found it was much less accurate, with RMSEs around 2.5 K. Finally, we
demonstrated the health relevance of our model by showing that our
temperature estimates were associated with social vulnerability across
the region during a heat wave, whereas NLDAS-2 showed a much weaker
association. Conclusion: Our high spatiotemporal resolution air
temperature model provides a strong contribution for future health
studies in this region.
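
As an illustration of two quantities named in the abstract, the following minimal sketch shows one way an inverse distance-weighted temperature estimate and a weighted RMSE could be computed. It is not the authors' implementation. The function names, the planar coordinates, the distance exponent (power=2.0), and the normalization of the weights are all assumptions made for this example; the abstract does not specify how the spatial weights were derived or how the interpolation was parameterized.

```python
import math


def idw_temperature(target, stations, power=2.0):
    """Inverse distance-weighted temperature estimate (hypothetical helper).

    target   -- (x, y) location in arbitrary planar units (assumed)
    stations -- iterable of (x, y, temp_k) observations
    power    -- distance exponent; 2.0 is a common default, assumed here
    """
    numerator = 0.0
    denominator = 0.0
    for x, y, temp_k in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: return its reading.
            return temp_k
        weight = 1.0 / distance ** power
        numerator += weight * temp_k
        denominator += weight
    return numerator / denominator


def weighted_rmse(observed, predicted, weights):
    """Root mean square error with per-observation weights (hypothetical helper).

    The weights are normalized by their sum, so only their relative
    sizes matter. How to choose them spatially is left to the caller.
    """
    total_weight = sum(weights)
    weighted_sq_error = sum(
        w * (o - p) ** 2 for o, p, w in zip(observed, predicted, weights)
    )
    return math.sqrt(weighted_sq_error / total_weight)


# Made-up values, in kelvin, purely for demonstration.
example_stations = [(1.0, 0.0, 290.0), (-1.0, 0.0, 300.0)]
example_estimate = idw_temperature((0.0, 0.0), example_stations)
example_error = weighted_rmse([290.0, 300.0], [291.0, 298.0], [3.0, 1.0])
```

With equal weights, weighted_rmse reduces to the ordinary RMSE. Unequal weights let observations from sparsely monitored areas count for more than those from dense station clusters, which is the general idea behind spatially weighting validation errors.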