Improving stream solute predictions with a modified LSTM model
incorporating solute interdependences and hysteresis patterns
Abstract
Surface runoff and infiltrated water interact with dynamic landscape
properties en route to the stream, ranging from vegetation and microbial
activities to soil and geological attributes. Stream solute
concentrations are highly variable and interconnected due to these
interactions, flow paths, and residence times, and often exhibit
hysteresis with flow. Significant unknowns remain about how point
measurements of stream solute chemistry reflect interdependent
hydrobiogeochemical and physical processes, and how signatures are
encapsulated as nonlinear dynamical relationships between variables. We
take a machine learning approach to understand and capture these
dynamical relationships and improve predictions of solutes at short and
long time scales. We introduce a physical process-based "flow-gate" into
an LSTM (long short-term memory) model, which enables the model to learn
hysteresis behaviors if they exist. Further, we use
information-theoretic metrics to detect how solutes are interdependent,
and iteratively select source solutes that best predict a given target
solute concentration. The "flow-gate LSTM" model improves
predictions (RMSE values decrease by 1% to 32%) relative to the
standard LSTM model for all nine solutes included in the study. The
predictive improvements from the flow-gate LSTM model highlight the
importance of lagged concentration and discharge relationships for
certain solutes. They also indicate a potential limitation of the
traditional LSTM approach: flow rates are always provided as
inputs, but this information is not fully utilized. This work
provides a starting point for a predictive understanding of geochemical
interdependencies using machine-learning approaches and highlights
potential improvements in model architecture.
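
To make the source-selection idea concrete, the following is a minimal sketch of ranking candidate source solutes by their mutual information with a target solute. It is an illustration under stated assumptions, not the study's procedure: the histogram (plug-in) estimator, the bin count, the single-pass ranking (rather than an iterative selection), the function names `mutual_information` and `rank_sources`, and the synthetic series `solute_a`, `solute_b`, and `target` are all hypothetical.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of mutual information, in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                # joint probability table
    px = pxy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nz = pxy > 0                             # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def rank_sources(target, candidates, bins=8):
    """Sort candidate source series by mutual information with the target."""
    scores = {
        name: mutual_information(series, target, bins)
        for name, series in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic data only (assumption): solute_a drives the target, solute_b does not.
rng = np.random.default_rng(0)
n = 2000
solute_a = rng.normal(size=n)
solute_b = rng.normal(size=n)
target = solute_a + 0.3 * rng.normal(size=n)

ranking = rank_sources(target, {"solute_a": solute_a, "solute_b": solute_b})
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy setup the dependent series should rank first. A plug-in estimator of this kind is biased for short records, so the scores are better read as a relative ordering than as absolute information content.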