Abstract
The use of machine learning (ML) for the online correction of
coarse-resolution atmospheric models has proven effective in reducing
biases in near-surface temperature and precipitation rate. However, such
corrections often introduce biases in the upper atmosphere, and the
improvements are not always consistent across corrective ML models
trained with different random seeds. Furthermore, ML corrections can
feed back on the baseline
physics of the atmospheric model and produce profiles that are outside
the distribution of samples used in training, leading to low confidence
in the predicted corrections. This study introduces the use of a novelty
detector to mask the predicted corrections when the atmospheric state is
deemed out-of-sample. The novelty detector is trained on profiles of
temperature and specific humidity in a semi-supervised fashion using
samples from the coarsened reference fine-resolution simulation.
Offline, the novelty detector flags more columns as out-of-sample in
simulations that simple metrics such as mean bias show to drift
further from the reference simulation. Without
novelty detection, corrective ML leads to the development of undesirably
large climate biases for some ML random seeds but not others. Novelty
detection deems about 21% of columns to be novelties in year-long
simulations. The spread in the root mean square error (RMSE) of
time-mean spatial patterns of surface temperature and precipitation rate
across a random seed ensemble is sharply reduced when using novelty
detection. In particular, the RMSE of the worst-performing random seed
is reduced by up to 60% (depending on the variable), while the best seed
maintains its low RMSE.
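
The masking step described above can be illustrated with a minimal sketch. The abstract does not specify the detector type or what a masked correction becomes, so this sketch makes two assumptions: a simple per-feature min-max detector stands in for the novelty detector, and a masked correction is set to zero. All function names and the toy values are hypothetical.

```python
import numpy as np

def fit_minmax_detector(train_profiles):
    """Record the per-feature range seen in training.

    train_profiles has shape (n_samples, n_features), where each row is a
    column's temperature and specific humidity profile, concatenated.
    """
    return train_profiles.min(axis=0), train_profiles.max(axis=0)

def is_novelty(profiles, lower, upper):
    """Flag a column as out-of-sample if any feature leaves the training range."""
    return np.any((profiles < lower) | (profiles > upper), axis=1)

def mask_corrections(corrections, novelty):
    """Assumed masking rule: zero the ML correction wherever a column is a novelty."""
    return np.where(novelty[:, None], 0.0, corrections)

# Toy training set: two temperature levels (K) and two humidity levels (kg/kg).
train = np.array([
    [280.0, 250.0, 0.010, 0.001],
    [290.0, 255.0, 0.012, 0.002],
    [300.0, 260.0, 0.015, 0.003],
])
lower, upper = fit_minmax_detector(train)

# Two columns encountered online: the second is warmer than any training sample.
state = np.array([
    [285.0, 252.0, 0.011, 0.002],
    [310.0, 252.0, 0.011, 0.002],
])
corrections = np.ones_like(state)  # placeholder ML-predicted corrections

novelty = is_novelty(state, lower, upper)
masked = mask_corrections(corrections, novelty)
novelty_fraction = novelty.mean()
```

Here `novelty_fraction` plays the role of the share of columns deemed novelties. A min-max rule is only one possible detector, and other novelty detection methods would slot into `is_novelty` without changing the masking step.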