Abstract
Deterministic Numerical Weather Prediction (NWP) models are the
state-of-the-science tools for producing reliable weather forecasts, which
supply indispensable, actionable data to society. However, each NWP
model run carries uncertainty caused by errors in the initial conditions
and by assumptions in the model. Probabilistic forecasts can therefore be
used to quantify and even correct the model bias. The Analog Ensemble
(AnEn) technique uses a historical dataset of past deterministic
forecasts and their associated observations to generate an ensemble of
future outcomes. One of the main advantages of the AnEn technique, which
it shares with other related statistical ensemble techniques, is that it
does not require multiple NWP runs with varied initial conditions or
model settings. However, all of these techniques require access to the
entire historical dataset to generate analogs. Moreover, they must read
that dataset for every forecast, which is
computationally expensive. In this work, the whole historical dataset is
replaced by a model that has the capability to learn the Probability
Density Function (PDF) of that dataset. Specifically, we use a
Conditional Variational Autoencoder (CVAE), a deep generative machine
learning model, to correct the wind speed forecasts of the North
American Mesoscale (NAM) forecasting system. We feed the
values forecasted by the NWP model as a condition to our CVAE and
generate an ensemble that is used to correct the forecasted value in
constant time and with small memory usage. Initial results show that CVAE
probabilistic performance is comparable to that of AnEn, while the CVAE
requires up to 25 times less memory and up to 2 times less runtime for
5 years of historical data.
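
To make the analog search concrete, the following is a minimal sketch of the AnEn idea described above, not the implementation evaluated in this work. It assumes a single predictor (wind speed) and a plain absolute-difference similarity measure; the function name `analog_ensemble`, the parameter `n_members`, and the toy values are hypothetical. Note that every call scans the full history, which is the cost the CVAE approach aims to avoid.

```python
from statistics import mean


def analog_ensemble(current_forecast, past_forecasts, past_observations, n_members=3):
    """Return the observations paired with the past forecasts most similar
    to current_forecast (assumption: similarity = absolute difference)."""
    if len(past_forecasts) != len(past_observations):
        raise ValueError("each past forecast needs a matching observation")
    # Rank every historical index by how close its forecast is to the current one.
    # This full pass over the history is repeated for each new forecast.
    ranked = sorted(
        range(len(past_forecasts)),
        key=lambda i: abs(past_forecasts[i] - current_forecast),
    )
    # The observations that verified the closest forecasts form the ensemble.
    return [past_observations[i] for i in ranked[:n_members]]


# Toy history of wind speed forecasts and observations (made-up values, m/s).
past_forecasts = [4.0, 6.5, 7.0, 7.2, 9.0, 12.0]
past_observations = [3.5, 6.0, 6.4, 6.9, 8.1, 10.5]

ensemble = analog_ensemble(7.1, past_forecasts, past_observations, n_members=3)
corrected = mean(ensemble)  # ensemble mean used as a corrected point forecast
print(sorted(ensemble), round(corrected, 2))
```

The spread of `ensemble` gives a rough measure of forecast uncertainty, while its mean serves as a bias-corrected value. A generative model conditioned on the forecast would replace the `sorted` pass with sampling, so the historical arrays would no longer be needed at prediction time.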