FogNet-v2.0: Explainable Physics-Informed Vision Transformer for Coastal
Fog Forecasting
Abstract
Fog, a weather phenomenon occurring near the Earth’s surface, severely
reduces visibility and critically impacts all modes of transportation.
Accurate fog prediction models are therefore crucial for ensuring safety
and reducing delays. Forecasting such meteorological phenomena involves
complex five-dimensional data structures spanning geographical
coordinates (latitude and longitude), altitude levels, atmospheric
variables, and time. The challenge is to learn effectively from these
data to improve visibility predictions.
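For illustration, a single input sample in this setting can be organized
as a five-dimensional tensor; the sketch below uses hypothetical
dimension sizes and variable counts, not the actual dataset used in this
work.

import numpy as np

# Hypothetical sizes: (time steps, atmospheric variables, altitude levels, lat, lon)
n_time, n_vars, n_levels, n_lat, n_lon = 4, 30, 10, 32, 32
sample = np.random.rand(n_time, n_vars, n_levels, n_lat, n_lon).astype(np.float32)

# A 24-hour-ahead fog/visibility target over the surface grid (illustrative only).
target = np.zeros((n_lat, n_lon), dtype=np.float32)
print(sample.shape, target.shape)  # (4, 30, 10, 32, 32) (32, 32)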
Recent advancements in vision transformers have revolutionized deep
learning, particularly in image analysis, opening new avenues for
interpreting complex spatio-temporal data in atmospheric science. This
paper focuses on coastal fog forecasting with a 24-hour prediction
window. We introduce and compare novel tokenization strategies for
vision transformer models, aiming to enhance both the accuracy and the
interpretability of fog predictions.
The study evaluates various sampling methods, comparing traditional 2D
approaches (the Vanilla Vision Transformer and the Unified Variable
Transformer) with more sophisticated 3D and 4D techniques (the
Spatio-Temporal Transformer, the Spatio-Variable Transformer, and the
Physics-Informed Transformer, PIT). FogNet-v2.0, the PIT model, emerges
as the front-runner, outperforming the other models and benchmarks,
including the 3D-CNN-based FogNet. FogNet-v2.0 improves prediction
accuracy across most metrics, with the Critical Success Index (CSI) as
the exception.
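To make the distinction concrete, the sketch below contrasts a 2D,
per-variable patch tokenization with a 3D tokenization that groups
several variables into each token. Array shapes, the patch size, and the
variable grouping are illustrative assumptions and do not reproduce the
exact tokenizers developed in this paper.

import numpy as np

x = np.random.rand(30, 32, 32)  # (atmospheric variables, lat, lon) at one time step

def tokenize_2d(field, patch=8):
    # Cut each 2D variable slice into patch x patch tokens (vanilla-ViT style);
    # every token sees a single variable only.
    v, h, w = field.shape
    return (field.reshape(v, h // patch, patch, w // patch, patch)
                 .transpose(0, 1, 3, 2, 4)
                 .reshape(v * (h // patch) * (w // patch), patch * patch))

def tokenize_3d(field, patch=8, var_group=5):
    # Group several variables into each token so attention can mix variables
    # within a spatial patch.
    v, h, w = field.shape
    g = v // var_group
    return (field.reshape(g, var_group, h // patch, patch, w // patch, patch)
                 .transpose(0, 2, 4, 1, 3, 5)
                 .reshape(g * (h // patch) * (w // patch), var_group * patch * patch))

print(tokenize_2d(x).shape)  # (480, 64)
print(tokenize_3d(x).shape)  # (96, 320)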
Key innovations of this research include more correctly forecast fog
events, improved skill scores, fewer missed fog cases, and the
development of an explainable, physics-informed vision transformer. This
paper highlights the integration of physical principles with machine
learning for precise, interpretable weather prediction models,
showcasing the efficacy of advanced tokenization and physics-informed
methodologies in addressing the complexities of atmospheric phenomena.
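For reference, the categorical scores referred to above (hit and miss
cases, skill scores, and the Critical Success Index) follow standard
contingency-table definitions; the sketch below uses made-up counts, not
results from this study.

def fog_scores(hits, misses, false_alarms, correct_negatives):
    # Standard categorical verification scores for a binary fog forecast.
    pod = hits / (hits + misses)                      # probability of detection
    far = false_alarms / (hits + false_alarms)        # false alarm ratio
    csi = hits / (hits + misses + false_alarms)       # Critical Success Index
    num = 2 * (hits * correct_negatives - false_alarms * misses)
    den = ((hits + misses) * (misses + correct_negatives)
           + (hits + false_alarms) * (false_alarms + correct_negatives))
    hss = num / den                                    # Heidke Skill Score
    return {"POD": pod, "FAR": far, "CSI": csi, "HSS": hss}

print(fog_scores(hits=40, misses=10, false_alarms=15, correct_negatives=935))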