A machine learning tool for determining the required sample size for GEV
fitting in climate applications
Abstract
Extreme climate events (ECEs) like heavy rainfall and heatwaves
significantly impact society, and climate change is altering their
frequency. Generalised Extreme Value (GEV) distributions help quantify
these ECEs and guide the design of human systems (e.g., the return value
of extreme wind gusts sets construction codes at specific locations). We train a
machine learning (ML) model using GEV distributions to determine the
sample size required to estimate return values with specific
uncertainties. For ECEs like heatwaves (with negative GEV shape
parameters), fewer samples are needed to estimate the return value with a
given uncertainty than for rainfall extremes (positive shape parameters).
For the heatwave example, a sample size of more than 20 times the annual
recurrence interval is typically required to estimate the return value
to within ±10%. Estimating the return value of a 1-in-20-year heatwave
therefore requires roughly 400 samples, equivalent to 20 different
20-year simulations. Generating samples of this size will require
extensive climate downscaling efforts, potentially aided by ML-based
downscaling methods.
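
To make the link between sample size and return-value uncertainty concrete, the following is a minimal, dependency-free sketch rather than the ML model described above. It assumes the usual GEV parameterisation with location mu, scale sigma and shape xi (negative xi for bounded tails such as heatwaves, positive xi for heavy tails such as rainfall). For simplicity it estimates the return value with an empirical quantile instead of a fitted GEV, so its numbers are not expected to reproduce the ±10% figure. The function names and the parameter values (mu = 30, sigma = 2, xi = -0.2) are hypothetical and chosen only for illustration.

```python
import math
import random


def gev_quantile(p, mu, sigma, xi):
    """Quantile of a GEV(mu, sigma, xi) at non-exceedance probability p."""
    y = -math.log(p)
    if abs(xi) < 1e-12:  # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


def gev_return_level(T, mu, sigma, xi):
    """Level exceeded on average once every T blocks (e.g. years)."""
    return gev_quantile(1.0 - 1.0 / T, mu, sigma, xi)


def gev_sample(n, mu, sigma, xi, rng):
    """Draw n GEV variates by inverse-transform sampling."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u == 0.0:  # avoid log(0)
            continue
        draws.append(gev_quantile(u, mu, sigma, xi))
    return draws


def relative_spread(n, T, mu, sigma, xi, reps=200, seed=0):
    """Half-width of the central 90% range of the estimated T-block
    return level, relative to the true value, for samples of size n.
    The estimator is an empirical quantile (a stand-in, not a GEV fit)."""
    rng = random.Random(seed)
    truth = gev_return_level(T, mu, sigma, xi)
    k = min(n - 1, math.ceil((1.0 - 1.0 / T) * n) - 1)  # order-statistic index
    estimates = []
    for _ in range(reps):
        sample = sorted(gev_sample(n, mu, sigma, xi, rng))
        estimates.append(sample[k])
    estimates.sort()
    lo = estimates[int(0.05 * reps)]
    hi = estimates[int(0.95 * reps) - 1]
    return 0.5 * (hi - lo) / abs(truth)


T = 20                # 1-in-20-year event
n_required = 20 * T   # rule of thumb from the abstract: 20 x recurrence interval
for n in (T, n_required):
    # hypothetical heatwave-like parameters (negative shape)
    print(n, relative_spread(n, T, mu=30.0, sigma=2.0, xi=-0.2))
```

Swapping the empirical quantile for a maximum-likelihood GEV fit, and repeating the experiment over a grid of shape parameters, would bring the sketch closer to the kind of training data the abstract describes.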