Mapping dissolved oxygen concentrations by combining shipboard and Argo
observations using machine learning algorithms
Abstract
The ocean oxygen (O2) inventory has declined in recent decades, but
estimates of the O2 trend remain uncertain because of sparse and
irregular sampling. A refined estimate of the deoxygenation rate is
developed using machine learning techniques and the biogeochemical Argo
array. The source
data includes historical shipboard (bottle and CTD-O2) profiles from
1965 to 2020 and biogeochemical Argo profiles after 2005. Neural network
and random forest algorithms were trained using approximately 80% of
this data, with the remaining 20% held out for validation. The training
data was further divided into five decadal groups for cross-validation
and hyperparameter tuning. Through different combinations of algorithm
types and predictor variable sets, an ensemble of gridded monthly O2
datasets was generated with similar skill (root-mean-square error
~13-18 µmol/kg and R² ~0.9). The largest errors are found in the
oxycline and frontal regions with strong
lateral and vertical gradients. The mapping was repeated with shipboard
data only and with both shipboard and Argo data. Including Argo data
has a major impact on the estimated global deoxygenation trend,
increasing it by 56% while reducing the uncertainty, as measured by the
ensemble spread, by 40%. This study demonstrates the importance
of new biogeochemical Argo arrays in relatively data-poor regions such
as the Southern Ocean.
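
The evaluation steps named in the abstract (an approximately 80/20 training and validation split, decadal cross-validation groups, root-mean-square error, R², and ensemble spread) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the rule assigning years to decadal groups (ten-year bins from 1965, with the final group absorbing 2005 to 2020), and the use of the sample standard deviation as the spread measure are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random
import statistics


def split_train_validation(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and validation lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * n_samples))
    return indices[:n_train], indices[n_train:]


def decadal_fold(year, first_year=1965, n_folds=5):
    """Map a profile year to a cross-validation group.

    Assumption: ten-year bins starting at first_year, with the last
    group absorbing all later years. The paper may group differently.
    """
    return min((year - first_year) // 10, n_folds - 1)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def ensemble_spread(member_trends):
    """Spread across ensemble members.

    Assumption: spread is taken as the sample standard deviation.
    """
    return statistics.stdev(member_trends)
```

In a cross-validation loop, each group returned by `decadal_fold` would be held out in turn while a model is fit on the remaining groups, and `rmse` and `r_squared` would be evaluated on the held-out group. Grouping by decade rather than sampling at random keeps profiles from the same period together, so the held-out score reflects skill on years the model has not seen.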