Tarun Agrawal

and 2 more

En route to the stream, surface runoff and infiltrated water interact with dynamic landscape properties that range from vegetation and microbial activity to soil and geological attributes. Because of these interactions, flow paths, and residence times, stream solute concentrations are highly variable and interconnected, and they often exhibit hysteresis with flow. Significant unknowns remain about how point measurements of stream solute chemistry reflect interdependent hydrobiogeochemical and physical processes, and how their signatures are encapsulated as nonlinear dynamical relationships between variables. We take a machine learning approach to understand and capture these dynamical relationships and to improve predictions of solutes at short and long time scales. We introduce a physical process-based "flow-gate" into an LSTM (long short-term memory) model, which enables the model to learn hysteresis behaviors if they exist. Further, we use information-theoretic metrics to detect how solutes are interdependent, and iteratively select the source solutes that best predict a given target solute concentration. The "flow-gate LSTM" model improves predictions relative to the standard LSTM model for all nine solutes included in the study, with RMSE values decreasing by 1% to 32%. These improvements highlight the importance of lagged concentration and discharge relationships for certain solutes. They also indicate a potential limitation of the standard LSTM approach: although flow rates are always provided as inputs, this information is not fully utilized. This work provides a starting point for a predictive understanding of geochemical interdependencies using machine-learning approaches and highlights potential improvements in model architecture.
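As a rough illustration of how an information-theoretic metric can rank candidate source variables for a target solute, the sketch below estimates mutual information with a simple histogram (plug-in) estimator and sorts the candidates by score. It is not the authors' procedure: the abstract does not name the metric or the selection rule, this version does a single ranking pass rather than an iterative selection, and the function names, the bin count, and the synthetic series (`calcium`, `discharge`, `nitrate`, `target`) are all assumptions made for the example.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of the mutual information I(X;Y) in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    mask = pxy > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])))

def rank_sources(target, candidates, bins=8):
    """Sort candidate series by their mutual information with the target."""
    scores = {name: mutual_information(series, target, bins)
              for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Synthetic, purely illustrative data (not from the study).
rng = np.random.default_rng(0)
n = 2000
calcium = rng.normal(size=n)     # hypothetical strongly related solute
discharge = rng.normal(size=n)   # hypothetical moderately related flow signal
nitrate = rng.normal(size=n)     # hypothetical unrelated solute
target = calcium + 0.5 * discharge + 0.1 * rng.normal(size=n)

ranking = rank_sources(
    target,
    {"calcium": calcium, "discharge": discharge, "nitrate": nitrate},
)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

The plug-in estimator is biased upward for small samples, so even an unrelated series gets a small positive score; a practical workflow would likely compare scores against a shuffled-surrogate baseline before accepting a source.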

Rohit Nandan

and 3 more

Maryam Ghadiri

and 4 more

Enhanced water management systems depend on accurate estimation of the hydraulic properties of subsurface formations. However, the hydraulic conductivity of geologic formations can vary significantly, so information from widely spaced boreholes alone is insufficient to characterize subsurface aquifer properties. Other sources of information are therefore needed to complement the hydrogeophysical understanding of a region of interest. This study presents a numerical framework in which information from different measurement sources is combined in a multi-fidelity estimation model to characterize the three-dimensional random field representing the hydraulic conductivity of a watershed. Coupled with this model, a Bayesian experimental design is presented for selecting the best future sampling locations. This work draws upon the unique capabilities of electrical resistivity tests as well as statistical inversion. It presents a Multi-Fidelity Gaussian Process (Kriging) model to estimate the geological properties of the Upper Sangamon Watershed in east central Illinois, using multi-source observation data obtained from electrical resistivity and pumping tests. We demonstrate that the accuracy of Co-Kriging depends on the locations and distribution of both the high- and low-fidelity data, and we compare it with Single-High-Fidelity Kriging results. The uncertainties and confidence in the measurements and parameter estimates are then quantified and are in turn used to design future cycles of data collection to further improve the confidence intervals.
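To make the multi-fidelity idea concrete, the sketch below builds a one-dimensional co-kriging predictor with a simple two-step autoregressive scheme: a Gaussian process is fit to dense low-fidelity data, a scale factor `rho` links it to the sparse high-fidelity data, and a second Gaussian process models the remaining discrepancy. This is one common construction and not necessarily the one used in the study. The kernel, its fixed hyperparameters, the synthetic `truth` function, and the rule of sampling next where the predictive variance is largest are all assumptions; the abstract does not specify the design criterion.

```python
import numpy as np

def rbf(a, b, length=0.2, var=1.0):
    """Squared-exponential covariance between 1-D location arrays a and b."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def krige(x_train, y_train, x_new, length=0.2, var=1.0, jitter=1e-6):
    """Zero-mean GP (simple kriging) posterior mean and variance at x_new."""
    K = rbf(x_train, x_train, length, var) + jitter * np.eye(len(x_train))
    Ks = rbf(x_new, x_train, length, var)
    mean = Ks @ np.linalg.solve(K, y_train)
    variance = var - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.maximum(variance, 0.0)   # clip tiny negative round-off

def cokrige(x_lo, y_lo, x_hi, y_hi, x_new):
    """Two-step autoregressive co-kriging: high = rho * low + discrepancy."""
    lo_at_hi, _ = krige(x_lo, y_lo, x_hi)
    # Least-squares scale factor between low-fidelity prediction and high-fidelity data.
    rho = float(lo_at_hi @ y_hi / (lo_at_hi @ lo_at_hi))
    lo_mean, lo_var = krige(x_lo, y_lo, x_new)
    d_mean, d_var = krige(x_hi, y_hi - rho * lo_at_hi, x_new)
    return rho * lo_mean + d_mean, rho ** 2 * lo_var + d_var

def truth(x):
    """Hypothetical stand-in for a (scaled) log-conductivity profile."""
    return np.sin(2 * np.pi * x)

# Dense, biased low-fidelity data (think resistivity-derived estimates) and
# sparse, accurate high-fidelity data (think pumping tests). Synthetic values.
x_lo = np.linspace(0.0, 1.0, 11)
y_lo = 0.7 * truth(x_lo) + 0.2
x_hi = np.array([0.0, 0.3, 0.7, 1.0])
y_hi = truth(x_hi)
grid = np.linspace(0.0, 1.0, 101)

co_mean, co_var = cokrige(x_lo, y_lo, x_hi, y_hi, grid)
hi_mean, hi_var = krige(x_hi, y_hi, grid)

rmse_co = np.sqrt(np.mean((co_mean - truth(grid)) ** 2))
rmse_hi = np.sqrt(np.mean((hi_mean - truth(grid)) ** 2))
print("co-kriging RMSE:", rmse_co, "high-fidelity-only RMSE:", rmse_hi)

# Simplest possible design rule: sample next where predictive variance peaks.
x_next = grid[np.argmax(co_var)]
print("next suggested sampling location:", x_next)
```

Which predictor comes out ahead in a comparison like this depends on where the high- and low-fidelity points sit and on how well the low-fidelity signal tracks the truth, which is the sensitivity the abstract describes. A fuller treatment would estimate the kernel hyperparameters from data and use a formal expected-information-gain criterion for the design step.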