A benchmark to test generalization capabilities of deep learning methods
to classify severe convective storms in a changing climate
Abstract
This is a test-case study assessing the ability of deep learning methods
to generalize to a future climate (end of 21st century) when trained to
classify thunderstorms in model output representative of the present-day
climate. A convolutional neural network (CNN) was trained to classify
strongly rotating thunderstorms from a current climate created using the
Weather Research and Forecasting (WRF) model at high resolution, then
evaluated against thunderstorms from a future climate, and found to
perform skillfully and comparably in both climates. Despite training
with labels derived from a threshold value of a severe thunderstorm
diagnostic (updraft helicity), which was not used as an input attribute,
the CNN learned physical characteristics of organized convection and
environments that are not captured by the diagnostic heuristic. Physical
features were not prescribed but rather learned from the data; one
example is the importance of dry air at mid-levels for intense
thunderstorm development when low-level moisture is present (i.e.,
convective available potential energy). Explanation techniques also revealed that
thunderstorms classified as strongly rotating are associated with
learned rotation signatures. Results show that the creation of synthetic
data with ground truth is a viable alternative to human-labeled data and
that a CNN is able to generalize a target using learned features that
would be difficult to encode due to spatial complexity. Most
importantly, results from this study show that deep learning is capable
of generalizing to future climate extremes and can exhibit out-of-sample
robustness with hyperparameter tuning in certain applications.
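
The labeling approach described above (labels taken from a threshold on updraft helicity, with that diagnostic withheld from the model inputs) can be illustrated with a minimal sketch. This is not the study's code: the function names, the variable key "uh", the use of the patch maximum, and the threshold value in the usage example are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

# A storm "patch" is a small 2-D grid of values centered on one thunderstorm.
Patch = Sequence[Sequence[float]]


def label_patches(uh_patches: Sequence[Patch], threshold: float) -> List[int]:
    """Label each patch 1 if its peak updraft helicity exceeds the threshold, else 0.

    Using the patch maximum is an assumption made for this sketch.
    """
    labels = []
    for patch in uh_patches:
        peak = max(max(row) for row in patch)
        labels.append(1 if peak > threshold else 0)
    return labels


def drop_label_source(fields: Dict[str, object], label_source: str = "uh") -> Dict[str, object]:
    """Return the input variables without the diagnostic that produced the labels."""
    return {name: data for name, data in fields.items() if name != label_source}


# Usage with made-up values; the threshold of 75.0 is a placeholder.
uh = [
    [[0.0, 10.0], [5.0, 20.0]],   # weak rotation
    [[0.0, 80.0], [5.0, 20.0]],   # strong rotation
]
labels = label_patches(uh, threshold=75.0)
inputs = drop_label_source({"uh": uh, "temperature": [], "moisture": []})
```

Keeping the label source out of the inputs means a classifier trained on such data has to infer strong rotation from the remaining variables instead of reading the threshold back from the diagnostic.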