The use of machine learning (ML) for the online correction of coarse-resolution atmospheric models has proven effective in reducing biases in near-surface temperature and precipitation rate. However, this often introduces biases in the upper atmosphere, and improvements are not always reliable across ML-corrective models trained with different random seeds. Furthermore, ML corrections can feed back on the baseline physics of the atmospheric model and produce profiles that are outside the distribution of samples used in training, leading to low confidence in the predicted corrections. This study introduces the use of a novelty detector to mask the predicted corrections when the atmospheric state is deemed out-of-sample. The novelty detector is trained on profiles of temperature and specific humidity in a semi-supervised fashion using samples from the coarsened reference fine-resolution simulation. Offline, the novelty detector flags more columns as out-of-sample in simulations that simple metrics such as mean bias show to drift further from the reference simulation. Without novelty detection, corrective ML leads to the development of undesirably large climate biases for some ML random seeds but not others. Novelty detection deems about 21% of columns to be novelties in year-long simulations. The spread in the root mean square error (RMSE) of time-mean spatial patterns of surface temperature and precipitation rate across a random seed ensemble is sharply reduced when using novelty detection. In particular, the random seed with the worst RMSE is improved by up to 60% (depending on the variable) while the best seed maintains its low RMSE.
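The masking step described above can be illustrated with a minimal sketch. It assumes a simple min-max range detector, which is a stand-in chosen for brevity and not necessarily the detector used in the study; the function names, the `tolerance` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_minmax_detector(train_profiles):
    """Record per-feature min and max over training columns.

    train_profiles has shape (n_samples, n_features), e.g. stacked
    temperature and specific humidity profiles (an assumption here).
    """
    return train_profiles.min(axis=0), train_profiles.max(axis=0)

def is_novelty(profiles, bounds, tolerance=0.0):
    """Flag a column as a novelty if any feature leaves the training range."""
    lo, hi = bounds
    margin = tolerance * (hi - lo)
    outside = (profiles < lo - margin) | (profiles > hi + margin)
    return outside.any(axis=1)

def mask_corrections(corrections, novelty):
    """Zero the ML correction in every column flagged as a novelty."""
    return np.where(novelty[:, None], 0.0, corrections)

# Synthetic stand-in for training profiles (hypothetical data).
rng = np.random.default_rng(0)
train = rng.normal(size=(1000, 8))
bounds = fit_minmax_detector(train)

# Three in-sample columns plus one shifted far outside the training range.
columns = np.vstack([train[:3], train[:1] + 100.0])
corrections = np.ones((4, 8))

novelty = is_novelty(columns, bounds)
masked = mask_corrections(corrections, novelty)
```

In this sketch the first three columns keep their corrections and the shifted column falls back to the uncorrected baseline model, which is the behavior the abstract describes for out-of-sample states.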

W. Andre Perkins

and 3 more

We present a machine-learning-based emulator of a microphysics scheme for condensation and precipitation processes (Zhao-Carr) used operationally in a global atmospheric forecast model (FV3GFS). Our tailored emulator architecture achieves high skill (≥94%) in predicting condensate and precipitation amounts and maintains low global-average bias (≤4%) for 1 year of continuous simulation when replacing the Fortran scheme. The stability and success of this emulator stem from key design decisions. By separating the emulation of condensation and precipitation processes, we can better enforce physical priors such as mass conservation and locality of condensation, and the vertical dependence of precipitation falling downward, using specific network architectures. An activity classifier for condensation imitates the discrete-continuous nature of the Fortran microphysics outputs (i.e., tendencies are identically zero where the scheme is inactive, and condensate is zero where clouds are fully evaporated). A temperature-scaled conditional loss function ensures accurate condensate adjustments for a high dynamic range of cloud types (e.g., cold, low-condensate cirrus clouds or warm, condensate-rich clouds). Despite excellent overall performance, the emulator exhibits some deficiencies in the uppermost model levels, leading to biases in the stratosphere. The emulator also has short episodic skill dropouts in isolated grid columns and is computationally slower than the original Fortran scheme. Nonetheless, our challenges and strategies should be applicable to the emulation of other microphysical schemes. More broadly, our work demonstrates that with suitable physically motivated architectural choices, ML techniques can accurately emulate complex human-designed parameterizations of fast physical processes central to weather and climate models.
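The discrete-continuous gating and the mass-conservation prior described above can be sketched as follows. This is an illustrative assumption about how such a gate might be wired, not the emulator's actual implementation: the three class labels, the function names, and the sample values are hypothetical, and `dqc` is treated as a per-time-step condensate increment.

```python
import numpy as np

# Hypothetical class labels for an activity classifier.
INACTIVE, FULLY_EVAPORATED, ACTIVE = 0, 1, 2

def gated_condensate_increment(class_id, regressed_dqc, qc):
    """Combine a classifier decision with a regression output.

    INACTIVE         -> increment is exactly zero
    FULLY_EVAPORATED -> increment removes all existing condensate
    ACTIVE           -> increment comes from the regression network
    """
    dqc = np.where(class_id == ACTIVE, regressed_dqc, 0.0)
    dqc = np.where(class_id == FULLY_EVAPORATED, -qc, dqc)
    return dqc

def conserving_update(qv, qc, dqc):
    """Move water between vapor and condensate so that qv + qc is unchanged."""
    return qv - dqc, qc + dqc

# Three grid cells, one per class (illustrative values in kg/kg).
class_id = np.array([INACTIVE, FULLY_EVAPORATED, ACTIVE])
qv = np.array([5.0e-3, 4.0e-3, 6.0e-3])
qc = np.array([0.0, 2.0e-4, 1.0e-4])
regressed_dqc = np.array([3.0e-7, -1.9e-4, 5.0e-5])

dqc = gated_condensate_increment(class_id, regressed_dqc, qc)
qv_new, qc_new = conserving_update(qv, qc, dqc)
```

The gate discards the small nonzero regression output in the inactive cell and forces the condensate to exactly zero in the fully evaporated cell, while the paired update keeps total water fixed in every cell.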