Machine learning for red tide prediction in the Gulf of Mexico along the
West Florida Shelf
Abstract
The objective of this study is to understand the relationships between
multiple physical and environmental factors and red tide, a common name
for harmful algal blooms occurring along coastal regions worldwide.
Large concentrations of Karenia brevis, a toxic mixotrophic
dinoflagellate, make up the red tide along the West Florida Shelf (WFS)
in the Gulf of Mexico. Besides being toxic, red tide causes unpleasant
odors and unsightly scenery, resulting in multiple environmental and
socioeconomic impacts and public health issues. Understanding the
physical and biogeochemical processes that control the occurrence of red
tide is important for studying the impact of climate change on red tide
frequency, and accordingly assessing the future environmental and
socioeconomic impacts of red tide under different mitigation techniques
and climate scenarios. We use observational and reanalysis data for the
WFS to train machine learning (ML) models that predict red tide as a
binary classification problem: large bloom or no bloom. We develop the ML
model using seasonal input data of Peace River and Caloosahatchee River
outflow, alongshore and offshore wind speed, and Loop Current position.
The Loop Current, which is a warm ocean current that enters and loops
through the Gulf of Mexico before exiting to join the Gulf Stream, can
be detected from sea surface height. Beyond the observational and
reanalysis data, these variables can be simulated by the Earth system
models (ESMs) of the Coupled Model Intercomparison Project Phase 6
(CMIP6), especially by the high-resolution models of the High Resolution
Model Intercomparison Project (HighResMIP) of CMIP6. Such simulations are
needed to understand the frequency and future trends of red tide under
different Shared Socioeconomic Pathways (SSPs) of CMIP6. In this preliminary
study, we investigate the impact of different choices regarding ML model
selection and training dataset on the accuracy of red tide prediction,
and the physical interpretation of the results. We also discuss the
validation of ESM data for predictive modeling, and ensemble methods
for improving predictive performance. The study provides several
insights that can be useful for predicting the future trends of red tide
under SSPs using CMIP6 data.
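
The classification setup described above (seasonal river outflow, wind, and Loop Current inputs mapped to a large-bloom or no-bloom label) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the labeling rule is arbitrary and carries no physical meaning, the Loop Current position is encoded as a hypothetical binary indicator, and plain logistic regression stands in for whichever ML models the study actually evaluates.

```python
import math
import random

# Hypothetical feature names mirroring the inputs listed in the abstract.
FEATURES = [
    "peace_river_outflow",
    "caloosahatchee_outflow",
    "alongshore_wind",
    "offshore_wind",
    "loop_current_north",  # assumed encoding: 1.0 = northern position, 0.0 = otherwise
]


def make_synthetic_seasons(n, seed=0):
    """Generate placeholder seasonal records; values and labels are invented."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(4)] + [float(rng.random() < 0.5)]
        # Arbitrary rule for illustration only, NOT a physical finding.
        score = 1.2 * row[0] + 0.8 * row[1] - 0.5 * row[3] + 1.0 * row[4] - 0.5
        X.append(row)
        y.append(1 if score + rng.gauss(0, 0.3) > 0 else 0)  # 1 = large bloom
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with full-batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row):
    """Return 1 (large bloom) or 0 (no bloom) for one seasonal record."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    """Fraction of records whose predicted class matches the label."""
    return sum(predict(w, b, r) == t for r, t in zip(X, y)) / len(y)


X, y = make_synthetic_seasons(200)
X_train, y_train = X[:150], y[:150]   # fit on the first 150 records
X_test, y_test = X[150:], y[150:]     # hold out the rest for evaluation

w, b = train_logistic(X_train, y_train)
test_accuracy = accuracy(w, b, X_test, y_test)
print("held-out accuracy:", round(test_accuracy, 3))
for name, weight in zip(FEATURES, w):
    print(f"{name:>24s}: {weight:+.2f}")
```

Printing the fitted weights next to the feature names hints at how a trained model might be inspected for physical interpretation. With real observational, reanalysis, or ESM inputs, the split would more plausibly be made by year to respect the time ordering of the data.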