Arthur Grundner

and 3 more

A promising method for improving the representation of clouds in climate models, and hence climate projections, is to develop machine learning-based parameterizations using output from global storm-resolving models. While neural networks can achieve state-of-the-art performance, they are typically climate model-specific, require post-hoc tools for interpretation, and struggle to predict outside of their training distribution. To avoid these limitations, we combine symbolic regression, sequential feature selection, and physical constraints in a hierarchical modeling framework. This framework allows us to discover new equations diagnosing cloud cover from coarse-grained variables of global storm-resolving model simulations. These analytical equations are interpretable by construction and easily transferable to other grids or climate models. Our best equation balances performance and complexity, achieving performance comparable to that of neural networks ($R^2=0.94$) while remaining simple (with only 13 trainable parameters). It reproduces cloud cover distributions more accurately than the Xu-Randall scheme across all cloud regimes (Hellinger distances $<0.09$), and matches neural networks in condensate-rich regimes. When applied and fine-tuned to the ERA5 reanalysis, the equation exhibits superior transferability to new data compared to all other optimal cloud cover schemes. Our findings demonstrate the effectiveness of symbolic regression in discovering interpretable, physically consistent, and nonlinear equations to parameterize cloud cover.
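The Hellinger distance quoted above compares two probability distributions and ranges from 0 (identical) to 1 (no overlap). The following is a minimal sketch of the standard discrete definition, applied to histograms of cloud cover. The function name and the choice to normalize raw histogram counts are assumptions made for illustration, not the authors' evaluation code.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    `p` and `q` are sequences of non-negative histogram counts (or
    probabilities) over the same bins. Both are normalized to sum to 1
    before the distance is computed (an illustrative choice).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    p_total, q_total = sum(p), sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each histogram must have positive total mass")
    # H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2)
    squared_sum = sum(
        (math.sqrt(a / p_total) - math.sqrt(b / q_total)) ** 2
        for a, b in zip(p, q)
    )
    return math.sqrt(0.5 * squared_sum)
```

For example, `hellinger_distance(predicted_hist, reference_hist)` with two cloud cover histograms on identical bins yields a value that can be compared against the 0.09 threshold reported in the abstract.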

Tom Beucler

and 1 more

There is no consensus on the physical mechanisms controlling the scale at which convective activity organizes near the Equator, where the Coriolis parameter is small. High-resolution cloud-permitting simulations of non-rotating convection show the emergence of a dominant length scale, which has been referred to as convective self-aggregation. Furthermore, simulations in an elongated domain of size 12228 km x 192 km with a 3 km horizontal resolution equilibrate to a wave-like pattern in the elongated direction, where the cluster size becomes independent of the domain size. These recent findings suggest that the size of convective aggregation may be regulated by physical mechanisms, rather than artifacts of the model configuration, and is thus within the reach of physical understanding. We introduce a diagnostic framework relating the evolution of the length scale of convective aggregation to the net radiative heating, the surface enthalpy flux, and horizontal energy transport. We evaluate these length scale tendencies of convective aggregation in twenty high-resolution cloud-permitting simulations of radiative-convective equilibrium. While both radiative fluxes contribute to convective aggregation, the net longwave radiative flux operates at large scales (1000-5000 km) and stretches the size of moist and dry regions, while the net shortwave flux operates at smaller scales (500-2000 km) and shrinks it. The surface flux length scale tendency is dominated by convective gustiness, which acts to aggregate convective activity at smaller scales (500-3000 km). We further investigate the scale-by-scale radiative tendencies in a suite of nine mechanism denial experiments, in which different aspects of cloud radiation are homogenized or removed across the horizontal domain, and find that liquid and ice cloud radiation can individually aggregate convection.
However, only ice cloud radiation can drive the convective cluster to scales exceeding 5000 km, because of the high optical thickness of ice and the increase in coherence between water vapor and deep convection with horizontal scale. The framework presented here focuses on the length scale tendencies rather than a static aggregated state, which is a step towards diagnosing clustering feedbacks in the real world. Overall, our work underscores the need to observe and simulate surface, radiative, and advective fluxes across the 1 km-1000 km range of scales to better understand the characteristics of turbulent moist convection.
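One way to make the notion of an aggregation length scale concrete is to compute a power-weighted mean wavenumber from the spatial spectrum of a 1-D field along the elongated direction and convert it to a wavelength. The sketch below does this with a plain discrete Fourier transform. The abstract does not state how the length scale is defined, so this spectral definition, the function names, and the use of a generic 1-D field are all assumptions made for illustration.

```python
import cmath
import math


def power_spectrum(field):
    """Return (mode, power) pairs for modes 1..N/2 of a 1-D periodic field.

    The domain mean is removed first so that only spatial anomalies
    contribute. A direct O(N^2) DFT keeps the sketch dependency-free.
    """
    n = len(field)
    mean = sum(field) / n
    anomalies = [x - mean for x in field]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            a * cmath.exp(-2j * math.pi * k * i / n)
            for i, a in enumerate(anomalies)
        )
        spectrum.append((k, abs(coeff) ** 2))
    return spectrum


def spectral_length_scale(field, domain_length_km):
    """Length scale (km) from the power-weighted mean mode number.

    Assumed definition: L = domain_length / (sum_k k P_k / sum_k P_k).
    """
    spectrum = power_spectrum(field)
    total_power = sum(p for _, p in spectrum)
    if total_power == 0:
        raise ValueError("field has no spatial variance")
    mean_mode = sum(k * p for k, p in spectrum) / total_power
    return domain_length_km / mean_mode
```

With this definition, a field made of a single sinusoid with four wavelengths across a 1200 km domain has a length scale of 300 km. Tracking how such a scale changes in time is the kind of tendency the diagnostic framework attributes to radiative, surface, and advective fluxes.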

Arthur Grundner

and 5 more

A promising approach to improve cloud parameterizations within climate models and thus climate projections is to use deep learning in combination with training data from storm-resolving model (SRM) simulations. The Icosahedral Non-Hydrostatic (ICON) modeling framework permits simulations ranging from numerical weather prediction to climate projections, making it an ideal target to develop neural network (NN)-based parameterizations for sub-grid scale processes. Within the ICON framework, we train NN-based cloud cover parameterizations with coarse-grained data based on realistic regional and global ICON SRM simulations. We set up three different types of NNs that differ in the degree of vertical locality they assume for diagnosing cloud cover from coarse-grained atmospheric state variables. The NNs accurately estimate sub-grid scale cloud cover from coarse-grained data that have geographical characteristics similar to those of their training data. Additionally, globally trained NNs can reproduce sub-grid scale cloud cover of the regional SRM simulation. Using the game-theory-based interpretability library SHapley Additive exPlanations, we identify an overemphasis on specific humidity and cloud ice as the reason why our column-based NN cannot perfectly generalize from the global to the regional coarse-grained SRM data. The interpretability tool also helps visualize similarities and differences in feature importance between regionally and globally trained column-based NNs, and reveals a local relationship between their cloud cover predictions and the thermodynamic environment. Our results show the potential of deep learning to derive accurate yet interpretable cloud cover parameterizations from global SRMs, and suggest that neighborhood-based models may be a good compromise between accuracy and generalizability.
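To illustrate what coarse-graining to sub-grid scale cloud cover can look like, the sketch below averages a high-resolution binary cloud mask over square blocks, so that each coarse cell holds the fraction of cloudy fine cells inside it. The abstract does not describe the coarse-graining operator, and ICON uses an icosahedral grid, so the regular grid, the block average, and the function name are simplifying assumptions rather than the authors' procedure.

```python
def coarse_grain_cloud_cover(cloud_mask, block):
    """Block-average a 2-D binary cloud mask into fractional cloud cover.

    `cloud_mask` is a list of equally long rows of 0/1 values on a
    regular high-resolution grid (assumption). `block` is the number of
    fine cells per coarse cell along each horizontal direction.
    """
    rows, cols = len(cloud_mask), len(cloud_mask[0])
    if rows % block or cols % block:
        raise ValueError("grid dimensions must be divisible by block")
    coarse = []
    for r0 in range(0, rows, block):
        coarse_row = []
        for c0 in range(0, cols, block):
            # Count cloudy fine cells inside this coarse cell.
            cloudy = sum(
                cloud_mask[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            )
            coarse_row.append(cloudy / (block * block))
        coarse.append(coarse_row)
    return coarse
```

The resulting fractions, paired with coarse-grained state variables such as humidity or cloud ice, are the kind of input-target pairs a cloud cover parameterization could be trained on.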