Development and application of the Branched and Isoprenoid GDGT Machine
learning Classification algorithm (BIGMaC) for paleoenvironmental
reconstruction
Abstract
Glycerol dialkyl glycerol tetraethers (GDGTs), including both the
archaeal isoprenoid GDGTs (isoGDGTs) and the bacterial branched GDGTs
(brGDGTs), have been used in paleoclimate studies to reconstruct
temperature in marine and terrestrial archives. However, GDGTs are
present in many different types of environments, with relative
abundances that strongly depend on the depositional setting. This
suggests that GDGT distributions can be used more broadly to infer
paleoenvironments in the geological past. In this study, we analyzed
1153 samples from a variety of modern sedimentary settings for both
isoGDGTs and brGDGTs. We used machine learning on the GDGT relative
abundances from this dataset to relate the lipid distributions to the
physical and chemical characteristics of the depositional settings. We
observe a robust relationship between the depositional environment and
the lipid distribution profiles of our samples. This dataset was used to
train and test the Branched and Isoprenoid GDGT Machine learning
Classification algorithm (BIGMaC), which identifies with high accuracy
the environment a sample comes from based on its GDGT distribution.
We tested the model on the sedimentary record from the Giraffe
kimberlite pipe, an Eocene maar in subarctic Canada, and found that
the BIGMaC reconstruction agrees with independent stratigraphic
information, provides new information about the paleoenvironment of this
site, and helps improve paleotemperature reconstruction. In cases where
paleoenvironments are unknown or are changing, BIGMaC can be applied in
concert with other proxies to generate more refined paleoclimatic
records.
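
As a rough illustration of the general idea of classifying a sample from
its GDGT relative abundances, the sketch below normalizes raw peak areas
to fractional abundances and assigns an unknown sample to the nearest
class centroid. This is not the BIGMaC algorithm: the nearest-centroid
rule is a deliberately simple stand-in, the two pooled features
(isoGDGT, brGDGT) and the two class labels are hypothetical
simplifications, and all peak areas are invented for illustration.

```python
from math import sqrt


def relative_abundances(peak_areas):
    """Normalize raw peak areas so the compound fractions sum to 1."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return {name: area / total for name, area in peak_areas.items()}


def centroid(samples):
    """Mean fractional abundance of each compound across samples."""
    names = samples[0].keys()
    return {n: sum(s[n] for s in samples) / len(samples) for n in names}


def classify(sample, centroids):
    """Return the label of the nearest centroid (Euclidean distance)."""
    def dist(a, b):
        return sqrt(sum((a[n] - b[n]) ** 2 for n in a))
    return min(centroids, key=lambda label: dist(sample, centroids[label]))


# Hypothetical training data: invented peak areas and invented class labels.
training = {
    "marine": [
        {"isoGDGT": 80.0, "brGDGT": 20.0},
        {"isoGDGT": 90.0, "brGDGT": 10.0},
    ],
    "lake": [
        {"isoGDGT": 20.0, "brGDGT": 80.0},
        {"isoGDGT": 30.0, "brGDGT": 70.0},
    ],
}

# One centroid per environment, computed on fractional abundances.
centroids = {
    label: centroid([relative_abundances(s) for s in samples])
    for label, samples in training.items()
}

# Classify an unknown sample (also invented) after the same normalization.
unknown = relative_abundances({"isoGDGT": 15.0, "brGDGT": 45.0})
predicted = classify(unknown, centroids)
print(predicted)
```

Normalizing before classification means the result depends only on the
shape of the lipid distribution, not on the absolute amount of lipid
extracted from each sample.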