Towards a Multi-Representational Approach to Prediction, Understanding,
and Discovery in Hydrology
Abstract
A key step in model development is the selection of an appropriate
representational system, including both the representation of what is
observed (the data), and the formal mathematical structure used to
construct the input-state-output mapping. These choices are critical,
because they completely determine the questions we can ask, the nature
of the analyses and inferences we can perform, and the answers that we
can obtain. Accordingly, a representation that is suitable for one kind
of investigation might be limited in its ability to support some other
kind.
Arguably, how different representational approaches affect what we can
learn from data is poorly understood. This paper explores three
complementary representational strategies as vehicles for understanding
how catchment-scale hydrological processes vary across
hydro-geo-climatologically diverse Chile. Specifically, we test a lumped
water-balance model (GR4J), a data-based dynamical systems model (LSTM),
and a data-based regression-tree model (Random Forest, RF). Insights were
obtained regarding system memory encoded in data, spatial
transferability by use of surrogate attributes, and informational
deficiencies of the dataset that limit our ability to learn an adequate
input-output relationship. As expected, each approach exhibits specific
strengths, with LSTM providing the best characterization of dynamics,
GR4J being the most robust under informationally deficient conditions,
and RF being most supportive of interpretation.
Overall, the complementary nature of the three approaches suggests the
value of adopting a multi-representational framework in order to more
fully extract information from the data. Our results show that a
multi-representational approach better supports the goals of prediction,
understanding, and scientific discovery in hydrology.