Knowledge-based Artificial Intelligence for Agroecosystem Carbon Budget and Crop Yield Estimation
Abstract
Improving the estimation of CO2 exchange between the
atmosphere and terrestrial ecosystems is critical to reducing the large
uncertainty in the global carbon budget. Large amounts of the
atmospheric CO2 assimilated by plants return to the
atmosphere by ecosystem respiration (Reco), including plant autotrophic
respiration (Ra) and soil microbial heterotrophic respiration (Rh).
However, Ra and Rh are challenging to estimate at large regional
scales because of the limited understanding of the complex interactions
among physical, chemical, and biological processes and the resulting
high spatio-temporal dynamics. Traditional approaches for estimating
Reco, including process-based (PB) models, are limited by current human
knowledge, which restricts their accuracy and efficiency. The
accumulation of in situ observations of net ecosystem exchange (NEE),
weather, and soil, together with satellite data on gross primary
production (GPP), leaf area index (LAI), and soil moisture, makes it
possible to apply data-driven machine learning (ML) approaches. However,
purely data-driven ML models omit domain knowledge and lack
interpretability. Here we propose a novel knowledge-guided
machine learning (KGML) method for predicting daily Ra and Rh in US
crop fields. Using the Gated Recurrent Unit (GRU) as the basis, we
develop KGML models that arrange the ML components in a hierarchical
structure with a mass balance constraint. The KGML models were pre-trained using
synthetic data generated by an advanced agroecosystem model, ecosys, and
re-trained with real-world FLUXNET observation data. We extrapolate the
best KGML model to crop fields across the US using satellite data,
reanalysis climate forcings, and a soil database to reveal the
spatio-temporal variations of Ra and Rh and their key controlling
factors. We believe this study advances the concept of interpretable
machine learning for carbon cycle estimation and will shed light on
many other process-based biogeochemistry studies.
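
To make the idea of a mass balance constraint concrete, the following is a minimal sketch of how such a constraint might enter a training loss. It is not the loss used in this study. It assumes the common sign convention NEE = Ra + Rh - GPP (positive NEE means net release to the atmosphere), and the function names `mass_balance_residual` and `kgml_style_loss`, the penalty `weight`, and the squared-error form are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def mass_balance_residual(
    ra: Sequence[float],
    rh: Sequence[float],
    gpp: Sequence[float],
    nee: Sequence[float],
) -> List[float]:
    """Daily residual of the assumed balance NEE = Ra + Rh - GPP.

    A residual of zero means the predicted fluxes close the carbon balance.
    """
    return [a + h - g - n for a, h, g, n in zip(ra, rh, gpp, nee)]


def kgml_style_loss(
    ra_pred: Sequence[float],
    rh_pred: Sequence[float],
    gpp: Sequence[float],
    nee_obs: Sequence[float],
    reco_obs: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Mean squared Reco error plus a weighted mass balance penalty.

    Hypothetical illustration only: the data term compares predicted
    Reco (Ra + Rh) with observed Reco, and the penalty term discourages
    predictions that violate the assumed carbon balance.
    """
    n = len(nee_obs)
    reco_pred = [a + h for a, h in zip(ra_pred, rh_pred)]
    data_term = sum((p - o) ** 2 for p, o in zip(reco_pred, reco_obs)) / n
    residual = mass_balance_residual(ra_pred, rh_pred, gpp, nee_obs)
    penalty = sum(r * r for r in residual) / n
    return data_term + weight * penalty


if __name__ == "__main__":
    # Two illustrative days of fluxes (arbitrary units, made-up values).
    loss = kgml_style_loss(
        ra_pred=[1.0, 2.0],
        rh_pred=[0.5, 1.0],
        gpp=[3.0, 5.0],
        nee_obs=[-1.5, -2.0],
        reco_obs=[1.5, 3.0],
    )
    print(loss)
```

In a GRU-based model the same penalty would be computed on the network outputs at each daily time step, so that Ra and Rh are pushed toward values that are jointly consistent with GPP and NEE rather than being fitted independently.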