Improved quantification of ocean carbon uptake by using machine learning
to merge global models and pCO2 data
Abstract
The ocean plays a critical role in modulating climate change by
sequestering CO2 from the atmosphere. Quantifying the CO2 flux across
the air-sea interface requires time-dependent maps of surface ocean
partial pressure of CO2 (pCO2), which can be estimated using global
ocean biogeochemical models (GOBMs) and observation-based data
products. GOBMs are internally consistent, mechanistic representations
of the ocean circulation and carbon cycle, and have long been the
standard for making spatio-temporally resolved estimates of air-sea CO2
fluxes. However, there are concerns about the fidelity of GOBM flux
estimates. Observation-based products have the strength of being
data-based, but the underlying data are sparse and require significant
extrapolation to create global full-coverage flux estimates. The
Lamont-Doherty Earth Observatory-Hybrid Physics Data (LDEO-HPD) pCO2 product is
a new approach to estimating the temporal evolution of surface ocean
pCO2 and air-sea CO2 exchange. LDEO-HPD uses machine learning to merge
high-quality observations with state-of-the-art GOBMs. We train an
eXtreme Gradient Boosting (XGB) algorithm to learn a non-linear
relationship between model-data misfit and observed predictors. GOBM
fields are then corrected with the predicted model-data misfit to
estimate real-world pCO2 for 1982-2018. A benefit of this approach is
that model-data misfit has reduced temporal skewness compared to the
observed pCO2 that is the target variable for other
machine-learning-based reconstructions. This supports a robust reconstruction by LDEO-HPD
that is in better agreement with independent observations than other
estimates. LDEO-HPD global ocean uptake of CO2 is in agreement with
other products and the Global Carbon Budget 2020.
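
The misfit-correction idea described above can be illustrated with a minimal sketch. It runs on synthetic data and uses a small hand-rolled gradient-boosted stump regressor as a stand-in for XGBoost, so it is not the LDEO-HPD implementation. The predictors (a sea surface temperature series and calendar month), the synthetic model bias, and all function names are illustrative assumptions. The regressor is trained on the model-data misfit at sparse "observed" points, and the predicted misfit is then added back to the model field everywhere.

```python
import math

def fit_stump(X, r):
    """Least-squares best single split over all features for residuals r."""
    n, total, best = len(X), sum(r), None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            lm, rm = left_sum / k, (total - left_sum) / (n - k)
            # Maximising this gain is equivalent to minimising squared error.
            gain = k * lm * lm + (n - k) * rm * rm
            if best is None or gain > best[0]:
                best = (gain, j, 0.5 * (lo + hi), lm, rm)
    return best[1:]

def stump_predict(stump, row):
    j, thr, lm, rm = stump
    return lm if row[j] <= thr else rm

def fit_boosted(X, y, n_rounds=200, lr=0.1):
    """Gradient boosting with squared loss: each stump fits current residuals."""
    base = sum(y) / len(y)
    pred, stumps = [base] * len(y), []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s = fit_stump(X, resid)
        stumps.append(s)
        pred = [p + lr * stump_predict(s, row) for p, row in zip(pred, X)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

def make_synthetic(n=240):
    """Hypothetical monthly series: (predictors, model pCO2, 'true' pCO2)."""
    rows = []
    for t in range(n):
        sst = 15 + 8 * math.sin(2 * math.pi * t / 12) + 3 * math.sin(0.37 * t)
        truth = 340 + 0.15 * t + 1.2 * (sst - 15)
        gobm = truth - (6 + 0.8 * (sst - 15))  # assumed SST-dependent model bias
        rows.append(([sst, t % 12], gobm, truth))
    return rows

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

data = make_synthetic()
train = data[::3]  # sparse "observed" months
held_out = [d for i, d in enumerate(data) if i % 3 != 0]

# Target is the misfit (observation minus model), not pCO2 itself.
X_train = [x for x, _, _ in train]
misfit_train = [obs - gobm for _, gobm, obs in train]
model = fit_boosted(X_train, misfit_train)

# Correct the model field with the predicted misfit at unobserved months.
raw_rmse = rmse([obs - gobm for _, gobm, obs in held_out])
corrected_rmse = rmse(
    [obs - (gobm + predict(model, x)) for x, gobm, obs in held_out]
)
print(f"raw model RMSE: {raw_rmse:.2f}, corrected RMSE: {corrected_rmse:.2f}")
```

Because the regression target is the misfit rather than pCO2 itself, the long-term trend carried by the model field does not have to be learned from sparse data. This is one way to read the reduced temporal skewness noted in the abstract.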