Regional flood frequency analysis using extreme gradient boosting based
on Bayesian optimization.
Abstract
Estimating design floods is a crucial task in water resources
engineering. Regional Flood Frequency Analysis (RFFA) is one of the most
widely used approaches for estimating design floods in ungauged basins.
In the present research, we develop an eXtreme Gradient Boosting (XGB)
based machine learning (ML) model for RFFA.
The proposed approach relies on developing a regression model between
flood quantiles and the commonly available catchment descriptors. In
this study, the CAMELS dataset for 671 catchments in the USA was used to
evaluate the efficiency of the approach. Further, the results were compared
with traditional methods such as Multiple Linear Regression (MLR) and
Artificial Neural Networks (ANN). XGB is a decision-tree-based ensemble
machine learning algorithm that uses gradient boosting as a framework.
The results revealed that the XGB-based approach produced the most
accurate estimates when using all the available catchment descriptors
(i.e., mean annual rainfall (MAR), drainage area, fraction forest, mean
annual potential evapotranspiration (MAPET), mean annual temperature,
rainfall intensity, slope, fraction snow, soil porosity, and soil
conductivity) both during training and validation. Four distinct models
consisting of three to ten descriptors were examined for 2-, 5-, 10-,
25-, 50-, and 100-year return periods. All of the models exhibited low
mean absolute error and root mean square error values, with
percentage bias ranging from -10% to +10%. The model with only three
predictor variables performed comparably to the other models. Drainage area,
rainfall intensity, MAR, and fraction snow are the most effective
predictor variables, while MAPET, slope, temperature, fraction forest,
soil porosity, and soil conductivity have low significance in predicting
design floods for ungauged catchments. The proposed XGB modeling
approach can be applied to other regions of the world.
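
The three error measures named in the abstract (mean absolute error, root mean square error, and percentage bias) could be computed along the following lines. This is a minimal sketch rather than the authors' code: the function names, the sample quantile values, and the sign convention for percentage bias (positive meaning overestimation) are assumptions made for illustration.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average magnitude of the errors, in the units of the flood quantile."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared error; penalizes large errors more."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def percent_bias(observed, predicted):
    """Total error as a percentage of the total observed value.

    Assumed convention: positive means the model overestimates on average.
    Some references define it with the opposite sign.
    """
    total_error = sum(p - o for o, p in zip(observed, predicted))
    return 100.0 * total_error / sum(observed)


# Hypothetical flood quantiles for four catchments (illustrative values only).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 380.0]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_square_error(observed, predicted)
pbias = percent_bias(observed, predicted)
```

Under this convention, a percentage bias between -10 and +10 means that the summed estimates across catchments fall within 10% of the summed observed quantiles.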