Biases in estimating long-term recurrence intervals of extreme events
due to regionalised sampling
Abstract
Preparing for environmental risks requires estimating the frequencies of
extreme events, often from data records that are too short to confirm
them directly, so a statistical distribution must be fitted to the
data. To improve precision, investigators often pool data from
neighboring sites into single samples, referred to as “superstations,”
before fitting. We demonstrate that this technique can introduce
unexpected biases in typical situations, using wind and rainfall
extremes as case studies. When the combined locations have even small
differences in the underlying statistics, the regionalization approach
gives a fit that may tend toward the highest levels suggested by any of
the individual sites. For realistic record lengths, this bias may be
large or small compared to the sampling error, depending on the
distribution of the quantity analysed. Our results indicate that
previous analyses may have overestimated the likelihood of extreme
events arising from natural weather variability.
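
The pooling effect described above can be illustrated with a minimal
numerical sketch. It is not the procedure used in this study: it assumes
that annual maxima at each site follow a Gumbel distribution, that sites
contribute equally to the superstation, and that the pooled sample is
large enough to be described by the exact mixture of the site
distributions (no fitting step and no sampling error). The function names
and the site parameters are hypothetical and chosen only for
illustration.

```python
import math

def gumbel_exceedance(x, loc, scale):
    """P(X > x) for a Gumbel (maximum) distribution."""
    return -math.expm1(-math.exp(-(x - loc) / scale))

def gumbel_return_level(T, loc, scale):
    """Level exceeded on average once every T years at a single site."""
    return loc - scale * math.log(-math.log(1.0 - 1.0 / T))

def pooled_return_level(T, sites, iters=200):
    """T-year level of the equal-weight mixture of the sites (bisection).

    The mixture level always lies between the lowest and highest
    single-site levels, so those two values bracket the root.
    """
    levels = [gumbel_return_level(T, loc, scale) for loc, scale in sites]
    lo, hi = min(levels), max(levels)
    target = 1.0 / T
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = sum(gumbel_exceedance(mid, loc, scale) for loc, scale in sites)
        p /= len(sites)
        if p > target:
            lo = mid  # exceedance still too frequent: move the level up
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical sites (loc, scale) with small differences in their statistics.
example_sites = [(20.0, 2.0), (21.0, 2.4)]
for T in (10, 100, 1000):
    site_levels = [gumbel_return_level(T, l, s) for l, s in example_sites]
    print(T, site_levels, pooled_return_level(T, example_sites))
```

Under these assumptions the pooled return level lies above the average
of the single-site levels, because the exceedance probability of the
mixture at high levels is dominated by the site with the heaviest upper
tail. Any additional bias from fitting a single distribution to the
pooled sample is not represented in this sketch.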