A HydroLSTM-based Machine-Learning Approach to Discovering Regionalized
Representations of Catchment Dynamics
Abstract
Finding similarities between model parameters across different
catchments has proved to be challenging, especially for ungauged
catchments. Existing approaches struggle due to catchment heterogeneity
and non-linear dynamics. In particular, attempts to correlate catchment
attributes with hydrological responses have failed due to
interdependencies among variables and consequent equifinality.
Machine Learning (ML), particularly the Long Short-Term Memory (LSTM)
approach, has demonstrated strong predictive and spatial regionalization
performance. However, understanding the nature of the regionalization
relationships remains difficult. This study proposes a novel approach to
partially decouple the representation learning of (a) catchment dynamics
by using the HydroLSTM architecture and (b) spatial regionalization
relationships by using a Random Forest (RF) clustering approach to learn
the relationships between catchment attributes and the dynamics.
This coupled approach, called Regional HydroLSTM, generates a
representation of “potential streamflow” using a single cell-state,
while the output gate corrects it given the temporal context of the
hydrologic regime. RF clusters mediate the relationship between
catchment attributes and dynamics, allowing the identification of
spatially consistent hydrological regions, thereby providing insight
into the factors driving spatial and temporal hydrological variability.
Results suggest that combining the two complementary architectures can
enhance the interpretability of regional machine learning models in
hydrology, offering a new perspective on the “catchment classification”
problem and potentially advancing streamflow prediction in ungauged
basins. We conclude that an improved understanding of the underlying
nature of hydrologic systems can be achieved by careful design of ML
architectures to target the specific aspects we seek to learn from
the data.
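
To make the roles of the single cell-state and the output gate concrete, the
following is a minimal sketch, assuming a standard LSTM cell with one hidden
unit. It is not the HydroLSTM formulation itself, which may differ (for
example, in how inputs enter the gates). The function name
single_cell_lstm_step, the params dictionary, the random weights, and the
synthetic forcing are all hypothetical and serve only as illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function used for the LSTM gates."""
    return 1.0 / (1.0 + np.exp(-z))


def single_cell_lstm_step(x, h_prev, c_prev, params):
    """One time step of a standard LSTM with a single (scalar) cell state.

    Assumption: a textbook LSTM cell, not the HydroLSTM equations.
    x      : 1-D array of forcings at the current step
    h_prev : previous output (scalar)
    c_prev : previous cell state (scalar)
    """
    z = np.concatenate([x, [h_prev]])                # inputs plus recurrent output
    i = sigmoid(params["w_i"] @ z + params["b_i"])   # input gate
    f = sigmoid(params["w_f"] @ z + params["b_f"])   # forget gate
    o = sigmoid(params["w_o"] @ z + params["b_o"])   # output gate
    g = np.tanh(params["w_g"] @ z + params["b_g"])   # candidate update
    c = f * c_prev + i * g                           # single cell state
    h = o * np.tanh(c)                               # gate-scaled output
    return h, c, o


# Hypothetical random weights and synthetic forcing, for illustration only.
rng = np.random.default_rng(0)
n_inputs = 2
params = {f"w_{k}": rng.normal(scale=0.5, size=n_inputs + 1) for k in "ifog"}
params.update({f"b_{k}": 0.0 for k in "ifog"})

forcing = rng.random((5, n_inputs))
h, c = 0.0, 0.0
outputs = []
for x in forcing:
    h, c, o = single_cell_lstm_step(x, h, c, params)
    outputs.append((h, c, o))
```

In this sketch the squashed cell state tanh(c) plays the role of the
“potential” signal, and the output gate o, which lies between 0 and 1,
scales it at each step according to the current inputs and the previous
output.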