Figure 5: Permutation importances for the most important
component of each variable in predicting global mean temperature (TAS)
and precipitation (PR). Each emulator input variable is shuffled in turn
to determine its relative contribution to prediction skill. Note that
these average estimates do not account for potential regional
contributions, which may be particularly relevant for aerosols.
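The shuffling procedure described in the caption can be sketched in a few lines. The function below is a generic illustration, not the study's implementation; the repeat count and the use of RMSE as the skill metric are our assumptions.

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=5, seed=None):
    """Estimate each input variable's contribution to prediction skill by
    shuffling it in turn and measuring the resulting increase in RMSE.
    `model` is any object with a .predict(X) method; X has shape
    (samples, variables). All names here are illustrative."""
    rng = np.random.default_rng(seed)
    base_error = np.sqrt(np.mean((model.predict(X) - y) ** 2))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        errors = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])  # destroy variable j's information
            err = np.sqrt(np.mean((model.predict(X_perm) - y) ** 2))
            errors.append(err)
        # Importance = average error increase caused by the shuffle.
        importances[j] = np.mean(errors) - base_error
    return importances
```

A variable whose shuffling barely changes the error (importance near zero) contributes little to the emulator's skill.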
Neural Networks
Artificial Neural Networks (ANNs), algorithms inspired by the biological
neural networks of human brains, have shown great success in areas like
Computer Vision and Natural Language Processing. Two major architectures
are Convolutional Neural Networks (CNNs) (LeCun et al., 1990), which model
spatial dependencies, and Recurrent Neural Networks (RNNs), which process
sequential data. Beyond these traditional areas, ANNs have recently been
employed to tackle a variety of problems in Earth system science
(Camps-Valls et al., 2021). Long short-term memory (LSTM) networks
(Hochreiter and Schmidhuber, 1997), an advanced type of RNN, are used for
modelling time series, for example for El Niño-Southern Oscillation
prediction (Broni-Bedaiko et al., 2019). In cases where both input and
target have a spatial structure, such as modelling of precipitation or
changes in satellite imagery, a very commonly used CNN type is the U-Net
(Ronneberger et al., 2015), which has been applied frequently in climate
science and weather forecasting (Trebing et al., 2020; Harder et al.,
2020).
We explored both a pure LSTM approach and a pure CNN approach, using a
U-Net. A combination of both network types gave the best results; we
therefore use an LSTM combined with a CNN for our example architecture.
The CNN is used to extract spatial features before feeding our input
into the LSTM. The CNN consists of one convolutional layer with a kernel
size of 6, followed by a ReLU activation function and average pooling.
The LSTM uses 25 units and also a ReLU activation function, and is
followed by a dense layer and a reshaping to the output dimension. To
train the emulator we use the ssp126, ssp370 and ssp585 scenarios and
the historical data with a moving-time-window size of 10 years (in
one-year increments, leading to 570 training points). The emulator is
trained for 20 epochs, using a batch size of 25 for T and DTR and 5 for
PR and PR90. For this baseline approach we chose not to do any
hyperparameter optimization.
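A minimal sketch of this setup in Keras, under stated assumptions: the kernel size of 6, the 25 LSTM units, the ReLU activations, the dense layer with reshaping, and the 10-year moving window are taken from the text, while the grid dimensions, the number of convolutional filters, the pooling details and the windowing helper are illustrative assumptions, not the study's code.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative shapes (assumed): a 10-year window of annual forcing maps
# on a coarse latitude-longitude grid with 4 input variables.
TIME, LAT, LON, N_IN = 10, 32, 64, 4

model = models.Sequential([
    layers.Input(shape=(TIME, LAT, LON, N_IN)),
    # One convolutional layer (kernel size 6) with ReLU, applied to each
    # year's map independently, followed by average pooling.
    layers.TimeDistributed(layers.Conv2D(20, kernel_size=6, activation="relu")),
    layers.TimeDistributed(layers.AveragePooling2D(pool_size=2)),
    layers.TimeDistributed(layers.GlobalAveragePooling2D()),
    # LSTM over the 10 yearly feature vectors.
    layers.LSTM(25, activation="relu"),
    # Dense layer, then reshape to the spatial output dimension.
    layers.Dense(LAT * LON),
    layers.Reshape((LAT, LON)),
])

def moving_windows(x, y, window=10):
    """Slice a (years, lat, lon, vars) forcing array into overlapping
    windows in one-year increments. We assume the target is the response
    map in the final year of each window."""
    xs = np.stack([x[i:i + window] for i in range(len(x) - window + 1)])
    ys = y[window - 1:]
    return xs, ys
```

Training for T would then follow the described setup, e.g. `model.fit(xs, ys, epochs=20, batch_size=25)`.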
RMSE scores obtained with the CNN-LSTM architecture are comparable to
those obtained with the other methods. The CNN-LSTM architecture
performs particularly well for temperature predictions, with average
RMSE scores of 0.38 K over the second half of the 21st century. This
might be because temperature has greater autocorrelation, i.e. less
variability from one year to the next, than the other variables. Such
autocorrelation would be well captured by a time-aware
model like an RNN. Spatial patterns of temperature changes, such as the
Arctic amplification, are reasonably well predicted, even though the
coldest temperatures (e.g. in the North-Atlantic cold patch) are not as
well captured (as shown in Figure 4). The CNN-LSTM performs slightly
worse than the other emulators for diurnal temperature range and
precipitation predictions. For precipitation, global patterns (e.g. the
ITCZ shift) are well predicted by the emulator, but the relative changes
are overestimated (too wet or too dry) in most places. As for all
other emulators showcased in this study, extreme precipitation proves
the hardest variable to predict accurately.
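For reference, spatially averaged RMSE scores like those above are commonly computed with latitude weighting so that high-latitude grid cells are not over-counted. The sketch below assumes cosine-of-latitude weighting; the text does not specify the exact metric used.

```python
import numpy as np

def weighted_rmse(pred, true, lats):
    """RMSE over (years, lat, lon) fields, weighting grid cells by
    cos(latitude) — an assumed convention, normalised to mean weight 1."""
    w = np.cos(np.deg2rad(lats))[None, :, None]
    w = w / w.mean()
    return float(np.sqrt(np.mean(w * (pred - true) ** 2)))
```

Averaging this score over the years 2050-2100 would yield a single number per variable, comparable to the 0.38 K reported for temperature.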