Statistical and Machine Learning Methods Applied to the Prediction of
Tropical Rainfall
Abstract
We explore the use of three advanced statistical and machine learning
methods (a generalized linear model, random forest, and neural network)
to predict the occurrence and rain rate distribution of three tropical
rain types (deep convective, stratiform, and shallow convective)
observed by the radar on board the GPM satellite over the West Pacific.
Three-hourly temperature and moisture fields from MERRA-2 were used as
predictors. While all three methods perform reasonably well at
predicting the occurrence of each rain type, the neural network is the
only method able to produce rain rate distributions similar to
observations, especially for the top 5-10% of observed values. However,
the neural network took the most effort to train and has a relatively
high root-mean-square error, suggesting that it sometimes assigns high
rain rates to situations that in reality produce much weaker rain.
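
The contrast between a realistic rain rate distribution and a high root-mean-square error can be illustrated with a minimal sketch. It uses synthetic, gamma-distributed values as a stand-in for rain rates (an assumption for illustration only, not GPM or MERRA-2 data) and two hypothetical predictors: one that reproduces the observed values but assigns them to the wrong samples, and one that always predicts the mean. Neither is one of the three methods evaluated here.

```python
import numpy as np


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


rng = np.random.default_rng(0)

# Synthetic, strongly skewed "rain rates" (hypothetical values, not real data).
obs = rng.gamma(shape=0.5, scale=4.0, size=10_000)

# Hypothetical predictor A: exactly the observed values, but shuffled, so the
# distribution is reproduced while individual samples are mismatched.
pred_shuffled = rng.permutation(obs)

# Hypothetical predictor B: always predicts the observed mean, so the
# distribution collapses to a single value.
pred_mean = np.full_like(obs, obs.mean())

# Compare the upper tail (95th percentile) and the per-sample error.
p95_obs = np.percentile(obs, 95)
p95_shuffled = np.percentile(pred_shuffled, 95)
p95_mean = np.percentile(pred_mean, 95)

rmse_shuffled = rmse(pred_shuffled, obs)
rmse_mean = rmse(pred_mean, obs)

print(f"95th percentile: obs={p95_obs:.2f}, "
      f"shuffled={p95_shuffled:.2f}, mean-only={p95_mean:.2f}")
print(f"RMSE: shuffled={rmse_shuffled:.2f}, mean-only={rmse_mean:.2f}")
```

In this toy setting the shuffled predictor matches the observed upper tail exactly yet has the larger error, while the mean-only predictor has the smaller error but no tail at all. This is one way a method can score well on distribution shape and poorly on root-mean-square error at the same time.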