Tadd Bindas

and 7 more

Recently, rainfall-runoff simulations in small headwater basins have been improved by methodological advances such as deep neural networks (NNs) and hybrid physics-NN models, particularly a genre called differentiable modeling that intermingles NNs with physics to learn relationships between variables. However, hydrologic routing, which is necessary for simulating floods in stem rivers downstream of large heterogeneous basins, had not yet benefited from these advances, and it was unclear whether the routing process could be improved via coupled NNs. We present a novel differentiable routing model that mimics the classical Muskingum-Cunge routing model over a river network but embeds an NN to infer parameterizations for Manning’s roughness (n) and channel geometries from raw reach-scale attributes like catchment areas and sinuosity. The NN was trained solely on downstream hydrographs. Synthetic experiments show that while the channel geometry parameter was unidentifiable, n can be identified with moderate precision. With real-world data, the trained differentiable routing model produced more accurate long-term routing results for both the training gage and untrained inner gages for larger subbasins (>2,000 km²) than either a machine learning model assuming homogeneity or simply using the sum of runoff from subbasins. The n parameterization trained on short periods gave high performance in other periods, despite significant errors in runoff inputs. The learned n pattern was consistent with literature expectations, demonstrating the framework’s potential for knowledge discovery, but the absolute values can vary depending on training periods. The trained n parameterization can be coupled with traditional models to improve national-scale flood simulations.

Tadd Bindas

and 7 more

Recently, runoff simulations in small, headwater basins have been improved by methodological advances such as deep learning (DL). Hydrologic routing modules are typically needed to simulate flows in stem rivers downstream of large, heterogeneous basins, but obtaining suitable parameterizations for them has previously been difficult. It has been unclear whether downstream daily discharge contains enough information to constrain a spatially distributed parameterization. Building on recent advances in differentiable modeling principles, here we propose a differentiable, learnable physics-based routing model. It mimics the classical Muskingum-Cunge routing model but embeds a neural network (NN) to provide parameterizations for Manning’s roughness coefficient (n) and channel geometries. The embedded NN, which uses (imperfect) DL-simulated runoff as the forcing data and reach-scale attributes as inputs, was trained solely on downstream hydrographs. Our synthetic experiments show that while channel geometries cannot be identified, we can learn a parameterization scheme for n that captures the overall spatial pattern. Training on a short period of real-world data showed that we could obtain highly accurate routing results for both the training gage and inner, untrained gages. For larger basins, our results are better than those of a DL model assuming homogeneity or the sum of runoff from subbasins. The parameterization learned from a short training period gave high performance in other periods, despite significant bias in runoff. This is the first time an interpretable, physics-based model has been learned on the river network to infer spatially distributed parameters. The trained n parameterization can be coupled to traditional runoff models and ported to traditional programming environments.
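
As a rough illustration of the routing scheme that both versions of this abstract refer to, the sketch below routes an inflow hydrograph through a single reach with the classical Muskingum update, with the storage constant K derived from Manning's n. It is not the authors' model: the function names, the wide-channel celerity factor of 5/3, the fixed hydraulic radius, and the fixed weighting factor X are all assumptions made for the example, and the neural network that would supply n is replaced here by a plain number.

```python
def manning_velocity(n, hydraulic_radius, slope):
    """Manning's equation (SI units): v = (1/n) * R^(2/3) * S^(1/2)."""
    return (1.0 / n) * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5


def travel_time(n, length, hydraulic_radius, slope):
    """Storage constant K in seconds, taken as reach length over wave celerity.

    Assumption: celerity = 5/3 of the Manning velocity (wide rectangular channel).
    """
    celerity = (5.0 / 3.0) * manning_velocity(n, hydraulic_radius, slope)
    return length / celerity


def muskingum_coefficients(k, x, dt):
    """Classical Muskingum coefficients; they always sum to 1."""
    denom = 2.0 * k * (1.0 - x) + dt
    c1 = (dt - 2.0 * k * x) / denom
    c2 = (dt + 2.0 * k * x) / denom
    c3 = (2.0 * k * (1.0 - x) - dt) / denom
    return c1, c2, c3


def route_reach(inflow, k, x, dt):
    """Route an inflow series through one reach.

    O[t] = c1 * I[t] + c2 * I[t-1] + c3 * O[t-1], with O[0] set to I[0].
    """
    c1, c2, c3 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]
    for t in range(1, len(inflow)):
        outflow.append(c1 * inflow[t] + c2 * inflow[t - 1] + c3 * outflow[-1])
    return outflow


# Hypothetical reach: 20 km long, hydraulic radius 2 m, slope 0.001, hourly steps.
k_seconds = travel_time(n=0.035, length=20_000.0, hydraulic_radius=2.0, slope=0.001)
hydrograph = [10.0, 10.0, 50.0, 80.0, 40.0, 20.0, 10.0, 10.0, 10.0, 10.0]
routed = route_reach(hydrograph, k_seconds, x=0.2, dt=3600.0)
```

Because K grows in proportion to n in this sketch, a rougher channel delays and flattens the routed peak, which is the kind of sensitivity that lets downstream hydrographs carry information about n.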

Kai Ma

and 7 more

There is a drastic geographic imbalance in available global streamflow gauge and catchment property data, with additional large variations in data characteristics, so that models calibrated in one region cannot normally be migrated to another. Currently, in data-scarce regions, non-transferable machine learning models are habitually trained over small local datasets. Here we show that transfer learning (TL), in the sense of weight initialization and weight freezing, allows long short-term memory (LSTM) streamflow models that were trained over the Conterminous United States (CONUS, the source dataset) to be transferred to catchments on other continents (the target regions), without the need for extensive catchment attributes. We demonstrate this possibility for regions where data are dense (664 basins in the UK), moderately dense (49 basins in central Chile), and scarce, with only globally available attributes to draw on (5 basins in China). In both China and Chile, the TL models significantly elevated model performance compared to locally trained models. The benefits of TL increased with the amount of available data in the source dataset, but even 50-100 basins from the CONUS dataset provided significant value for TL. The benefits of TL were greater than those of pre-training the LSTM using the outputs from an uncalibrated hydrologic model. These results suggest that hydrologic data around the world have commonalities that could be leveraged by deep learning, and that significant synergies can be had with a simple modification of the currently predominant workflows, greatly expanding the reach of existing big data. Finally, this work diversified existing global streamflow benchmarks.

Kuai Fang

and 3 more

Recently, recurrent deep networks have shown promise in harnessing newly available satellite-sensed data for long-term soil moisture projections. However, to be useful in forecasting, deep networks must also provide uncertainty estimates. Here we evaluated Monte Carlo dropout with an input-dependent data noise term (MCD+N), an efficient uncertainty estimation framework originally developed in computer vision, for hydrologic time series predictions. MCD+N simultaneously estimates a heteroscedastic input-dependent data noise term (a trained error model attributable to observational noise) and a network weight uncertainty term (attributable to insufficiently constrained model parameters). Although MCD+N has appealing features, many heuristic approximations were employed during its derivation, and rigorous evaluations and evidence of its asserted capability to detect dissimilarity were lacking. To address this, we provided an in-depth evaluation of the scheme’s potential and limitations. We showed that for reproducing soil moisture dynamics recorded by the Soil Moisture Active Passive (SMAP) mission, MCD+N indeed gave a good estimate of predictive error, provided that we tuned a hyperparameter and used a representative training dataset. The input-dependent term responded strongly to observational noise, while the model term clearly acted as a detector for physiographic dissimilarity from the training data, behaving as intended. However, when the training and test data were characteristically different, the input-dependent term could be misled, undermining its reliability. Additionally, due to the data-driven nature of the model, the two uncertainty terms are correlated. This approach has promise, but care is needed to interpret the results.
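
The two uncertainty terms described above can be combined as in the minimal sketch below. It assumes the decomposition commonly written for Monte Carlo dropout with a learned noise head: each stochastic forward pass returns a predicted mean and a predicted log-variance, the data noise term is the average predicted variance, and the model term is the spread of the predicted means across passes. The function name and the input layout are hypothetical, and the hyperparameter tuning mentioned in the abstract is not shown.

```python
import math
from statistics import fmean, pvariance


def combine_mc_dropout(samples):
    """Combine T stochastic forward passes for one prediction target.

    samples: list of (predicted_mean, predicted_log_variance) pairs,
             one pair per dropout pass (hypothetical layout).
    Returns (predictive_mean, data_noise_variance, model_variance, total_std).
    """
    means = [mean for mean, _ in samples]
    noise_variances = [math.exp(log_var) for _, log_var in samples]

    predictive_mean = fmean(means)
    data_noise_variance = fmean(noise_variances)  # input-dependent noise term
    model_variance = pvariance(means)             # weight-uncertainty term
    total_std = math.sqrt(data_noise_variance + model_variance)
    return predictive_mean, data_noise_variance, model_variance, total_std


# Three hypothetical dropout passes for a single soil moisture prediction.
passes = [(0.20, math.log(0.01)), (0.22, math.log(0.01)), (0.18, math.log(0.01))]
mean, noise_var, model_var, total_std = combine_mc_dropout(passes)
```

Keeping the two variances separate before summing them is what allows each term to be inspected on its own, for example to check whether the model term rises for sites unlike the training data.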

Farshid Rahmani

and 5 more

Stream water temperature (T) is a variable of critical importance and decision-making relevance to aquatic ecosystems, energy production, and human interaction with the river system. Here, we propose a basin-centric stream water temperature model based on the long short-term memory (LSTM) model trained over hundreds of basins across the continental United States, providing a first continental-scale benchmark on this problem. This model was fed by atmospheric forcing data, static catchment attributes, and optionally observed or simulated discharge data. The model achieved high performance, delivering a low median root-mean-squared-error (RMSE) for each of the groups with extensive, intermediate, and scarce temperature measurements. The median Nash-Sutcliffe model efficiency coefficients were above 0.97 for all groups and above 0.91 after air temperature was subtracted, showing that the model captures most of the temporal dynamics. Reservoirs have a substantial impact on the pattern of water temperature and negatively influence model performance. The median RMSE was 0.69 °C and 0.99 °C for sites without major dams and with major dams, respectively, in groups with data availability larger than 90%. Additional experiments showed that observed or simulated streamflow data are useful as an input for basins without major dams but may increase prediction bias otherwise. Our results suggest that a strong mapping exists between basin-averaged forcing variables and attributes and water temperature, but local measurements can strongly improve the model. This work provides the first benchmark and significant insights for future efforts. However, challenges remain for basins with large dams, which can be targeted in the future when more information on withdrawal timing and water ponding time becomes accessible.
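
For readers unfamiliar with the two scores quoted above, the sketch below computes RMSE and the Nash-Sutcliffe efficiency (NSE) using their standard definitions. The function names and the sample series are made up for illustration and are not taken from the study.

```python
import math


def rmse(observed, simulated):
    """Root-mean-squared error between paired observations and simulations."""
    squared_errors = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance.

    1.0 is a perfect match; 0.0 means no better than predicting the observed mean.
    """
    mean_observed = sum(observed) / len(observed)
    error_sum = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread_sum = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - error_sum / spread_sum


# Made-up daily water temperatures in degrees Celsius.
observed_t = [10.0, 12.0, 14.0, 16.0]
simulated_t = [10.5, 11.5, 14.5, 15.5]
score_rmse = rmse(observed_t, simulated_t)
score_nse = nse(observed_t, simulated_t)
```

One plausible reading of the score "after air temperature was subtracted" is that air temperature is removed from both series before calling `nse`; this leaves the errors unchanged but shrinks the observed spread, so it is a stricter test. That reading is an assumption here, not something the abstract spells out.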