Statistical Compression for Climate Model Output
- Dorit Hammerling
- Joseph Guinness
Abstract
Numerical climate model simulations run at high spatial and temporal
resolutions generate massive quantities of data. As our computing
capabilities continue to increase, storing all of the data is not
sustainable, and thus it is important to develop methods for
representing the full datasets by smaller compressed versions. We
propose a statistical compression and decompression algorithm based on
storing a set of summary statistics as well as a statistical model
describing the conditional distribution of the full dataset given the
summary statistics. We decompress the data by computing conditional
expectations and conditional simulations from the model given the
summary statistics. Conditional expectations represent our best estimate
of the original data but are subject to oversmoothing in space and time.
Conditional simulations introduce realistic small-scale noise so that
the decompressed fields are neither too smooth nor too rough compared
with the original data. Considerable attention is paid to accurately
modeling the original dataset--one year of daily mean temperature
data--particularly with regard to the inherent spatial nonstationarity
in global fields, and to determining the statistics to be stored, so
that the variation in the original data can be closely captured, while
allowing for fast decompression and conditional emulation on modest
computers.
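
To make the two decompression modes concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a one-dimensional Gaussian field with a known stationary exponential covariance and uses block averages as the stored summary statistics; the abstract itself emphasizes nonstationary models for global fields, so the covariance here is a deliberate simplification. All function names, the covariance choice, and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def exponential_covariance(n, range_param=3.0, variance=1.0):
    """Stationary exponential covariance on a 1-D grid (illustrative assumption)."""
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    return variance * np.exp(-dist / range_param)


def block_average_matrix(n, block):
    """Linear map A that turns a length-n field into n // block block averages."""
    k = n // block
    A = np.zeros((k, n))
    for j in range(k):
        A[j, j * block:(j + 1) * block] = 1.0 / block
    return A


def compress(y, A):
    """Summary statistics s = A y; these are what would be stored."""
    return A @ y


def conditional_expectation(s, A, mu, Sigma):
    """E[y | A y = s] for y ~ N(mu, Sigma)."""
    K = Sigma @ A.T
    weights = np.linalg.solve(A @ K, s - A @ mu)
    return mu + K @ weights


def conditional_simulation(s, A, mu, Sigma, rng):
    """One draw from y | A y = s: simulate unconditionally, then correct
    the draw so that its summary statistics match s exactly."""
    z = rng.multivariate_normal(mu, Sigma)
    K = Sigma @ A.T
    weights = np.linalg.solve(A @ K, s - A @ z)
    return z + K @ weights


rng = np.random.default_rng(0)
n, block = 12, 4
mu = np.zeros(n)
Sigma = exponential_covariance(n)
A = block_average_matrix(n, block)

y = rng.multivariate_normal(mu, Sigma)   # stand-in for the original data
s = compress(y, A)                       # stored summary statistics
y_mean = conditional_expectation(s, A, mu, Sigma)
y_sim = conditional_simulation(s, A, mu, Sigma, rng)
```

Under these assumptions both `y_mean` and `y_sim` reproduce the stored block averages exactly. The conditional expectation is a single smooth reconstruction, while each conditional simulation adds small-scale variability drawn from the assumed model, which is the distinction the abstract draws between the two decompression outputs.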