A Multiscale Spatio-Temporal Big Data Fusion Algorithm from Point to
Satellite Footprint Scales
Abstract
The past six decades have seen explosive growth in remote sensing data
across air, land, and water, dramatically improving the predictive
capabilities of physical models and machine-learning (ML) algorithms.
Physical models, however, suffer from rigid parameterization and can
lead to incorrect inferences when little is known about the underlying
physical process. ML models, conversely, sacrifice interpretability for
enhanced predictions. Geostatistical methods are an attractive alternative
because they avoid the strong assumptions of physical models yet still
enable physical interpretation and uncertainty quantification. In this work, we
propose a novel multiscale, multi-platform geostatistical algorithm that
can combine big environmental datasets observed at different
spatio-temporal resolutions and over vast study domains. As a case
study, we apply the proposed algorithm to combine satellite soil
moisture data from Soil Moisture Active Passive (SMAP) and Soil Moisture
and Ocean Salinity (SMOS) with point data from the U.S. Climate Reference
Network (USCRN) and the Soil Climate Analysis Network (SCAN) across the
contiguous US for a fifteen-day period in July 2017. Using an underlying
covariate-driven spatio-temporal process, we quantify the effect of
dynamic and static physical controls (vegetation, rainfall, soil texture,
and topography) on soil moisture. We successfully validate
the fused soil moisture across multiple spatial scales (point, 3 km,
25 km, and 36 km) and compute five-day soil moisture forecasts across the
contiguous US. The proposed algorithm is general and can be applied to
fuse many other environmental variables.