Exploring Advectable Latent Representations for Droplet Size
Distributions with Physics-Informed Autoencoders
Abstract
Investigating the role of clouds and precipitation in the Earth system
requires microphysical schemes that accurately describe the evolution of
the hydrometeor particle size distribution (PSD) at a computational cost
low enough for use in atmospheric models.
Machine learning (ML) offers a promising way to replace detailed but
computationally expensive bin schemes with efficient emulators.
However, many existing emulations still rely on moments as prognostic
variables, inheriting structural limitations from traditional bulk
schemes. In contrast, latent variables discovered directly by ML have the
potential to represent PSDs more accurately, but their inherent
nonlinearity breaks the conservation property under advection and
diffusion, limiting their applicability in online simulations. To
address this dilemma, we propose Weighted Integral Parameters (WIPs),
formulated as weighted integrals of the PSD with learnable weight
functions, which provide the most general mathematical form for advectable
microphysical prognostic variables. Using autoencoders that are
physics-informed by the WIP formulation to learn optimal PSD
representations, we performed unsupervised learning on a liquid droplet
PSD dataset generated from ensemble large-eddy simulations with Spectral
Bin Microphysics and compared WIPs with the traditional moment
approaches of bulk schemes in their ability to represent “actual”
PSDs. Results show that WIPs automatically capture critical information
about medium-sized droplets that traditional moment approaches have not
captured, and that they outperform partial and full integral moments in
terms of PSD reconstruction error, indicating superior PSD information
compression efficiency. With these properties, WIPs have the potential to
replace moments as fully prognostic variables for building more accurate
ML-based bin-emulating schemes.
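
The advectability argument can be illustrated with a minimal numerical sketch. It assumes that mixing two air parcels averages their binned PSDs linearly, and checks whether a representation of the mixture equals the same average of the two parcels' representations. Everything named here is hypothetical and chosen for illustration only: the 33-bin radius grid, the lognormal test PSDs, the hand-picked Gaussian weight functions (standing in for learned ones), and the tanh encoder used as an example of a nonlinear latent variable.

```python
import numpy as np

# Hypothetical logarithmic radius grid (1 um to 1 mm, 33 bins); illustrative only.
ln_r = np.linspace(np.log(1e-6), np.log(1e-3), 33)
dlnr = ln_r[1] - ln_r[0]


def lognormal_psd(mode_radius, sigma):
    """Lognormal PSD dN/dln(r) with unit total number, sampled on the grid."""
    z = (ln_r - np.log(mode_radius)) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))


# Hypothetical fixed weight functions (Gaussian bumps in ln r).
# A trained model would learn these; here they are hand-picked stand-ins.
centres = np.log(np.array([5e-6, 2e-5, 1e-4]))
weights = np.exp(-0.5 * (ln_r[None, :] - centres[:, None]) ** 2)


def wip(psd):
    """Weighted integrals of the PSD, one value per weight function (linear)."""
    return (weights @ psd) * dlnr


def nonlinear_latent(psd):
    """Example of a nonlinear latent variable: tanh applied to the same integrals."""
    return np.tanh(wip(psd))


def mixing_error(encode, psd_a, psd_b, alpha):
    """Largest gap between encode(mixture) and the mixture of the encodings."""
    mixed_psd = alpha * psd_a + (1.0 - alpha) * psd_b
    mixed_codes = alpha * encode(psd_a) + (1.0 - alpha) * encode(psd_b)
    return float(np.max(np.abs(encode(mixed_psd) - mixed_codes)))


psd_cloud = lognormal_psd(1e-5, 0.4)    # mode near 10 um
psd_drizzle = lognormal_psd(1e-4, 0.4)  # mode near 100 um

wip_error = mixing_error(wip, psd_cloud, psd_drizzle, alpha=0.3)
latent_error = mixing_error(nonlinear_latent, psd_cloud, psd_drizzle, alpha=0.3)

print(f"WIP mixing error:              {wip_error:.2e}")
print(f"nonlinear latent mixing error: {latent_error:.2e}")
```

Under these assumptions the WIP mixing error stays at floating-point round-off, while the nonlinear latent variable shows a clearly nonzero gap, which is the sense in which a nonlinear encoding would not be conserved when transported by a linear advection or diffusion operator.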