Evaluating Variations in Tropical Cyclone Precipitation (TCP) in Eastern
Mexico using Machine Learning Techniques
Abstract
Tropical Cyclone Precipitation (TCP) is one of the major triggers of
flash flooding and landslides in eastern Mexico. We apply several
statistical and machine learning techniques to study a 99-year TCP
climatology at high resolution. Strong correlations exist between
location variables and annual mean TCP, as well as between dynamic
variables and event TCP. Topographic variables show mixed signals,
with elevation variance positively correlated with TCP. The Random
Forest (RF) model is a powerful tool with excellent fitting and
predictive skill for TCP variations. It has a very small out-of-sample
cross-validation error and captures the spatial variations of
historical TCP events well. Only three location variables are needed to
construct the best model for the annual mean TCP, whereas the best
event TCP model needs 18 variables to explain the complex variations in
event TCP.
The distance to the track is the most important variable for the event
TCP model, and many other factors contribute to TCP collectively and
nonlinearly, which the preceding correlation analysis cannot fully
capture. These factors include the translation characteristics of the
storms, the locations of the precipitation grid, and topography. Event TCP is
generally larger in storms with slower translation speeds and more
variable tracks. While the lower coastal area generally has a higher
probability of TCP, elevation variance in the higher inland areas
enhances less frequent but extreme TCP events. The RF algorithm is an
efficient machine learning approach that shows potential for future
Quantitative Precipitation Forecasting (QPF).
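
The workflow summarized above (fit an RF model, estimate out-of-sample
skill by cross-validation, rank predictors by importance) can be
illustrated with a minimal sketch. It is not the study's implementation.
It assumes scikit-learn, uses synthetic data in place of the TCP
climatology, and the three predictor names, the toy rainfall
relationship, and the choice of impurity-based importance are all
hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data; NOT the study's TCP climatology.
rng = np.random.default_rng(0)
n = 400
# Hypothetical predictor names, chosen only for illustration.
feature_names = ["dist_to_track_km", "translation_speed", "elevation_variance"]
X = np.column_stack([
    rng.uniform(0, 500, n),  # distance from grid cell to storm track
    rng.uniform(1, 10, n),   # storm translation speed
    rng.uniform(0, 1, n),    # local elevation variance
])
# Assumed toy relationship: rainfall decays with distance to the track
# and increases for slower storms and rougher terrain, plus noise.
y = (100 * np.exp(-X[:, 0] / 100)
     + 20 / X[:, 1]
     + 10 * X[:, 2]
     + rng.normal(0, 1, n))

model = RandomForestRegressor(n_estimators=100, random_state=0)

# Out-of-sample skill: R^2 on each held-out fold of a 5-fold split.
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2")

# Fit on all samples, then rank predictors by impurity-based importance.
model.fit(X, y)
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print("mean cross-validated R^2:", cv_r2.mean())
for name, importance in ranking:
    print(name, importance)
```

In this toy setup the distance predictor dominates by construction, so
the ranking only demonstrates the mechanics and says nothing about the
real data. Impurity-based importance can be biased toward some
predictors, and permutation importance is a common alternative.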