Yuan Yang

and 9 more

Accurate global river discharge estimation is crucial for advancing our scientific understanding of the global water cycle and for supporting various downstream applications. In recent years, data-driven machine learning models, particularly the Long Short-Term Memory (LSTM) model, have shown significant promise in estimating discharge. Despite this, the applicability of LSTM models to global river discharge estimation remains largely unexplored. In this study, we depart from conventional basin-lumped LSTM modeling, which has been applied to a limited number of basins. For the first time, we apply an LSTM on a global 0.25° grid and couple it with a river routing model to estimate discharge for every river reach worldwide. We rigorously evaluate performance at 5332 evaluation gauges globally for the period 2000-2020; both the gauges and the period are separate from the training basins and training period. The grid-scale LSTM model effectively captures rainfall-runoff behavior, reproducing global river discharge with a median Kling-Gupta Efficiency (KGE) of 0.563. It outperforms an extensively bias-corrected and calibrated benchmark simulation based on the Variable Infiltration Capacity (VIC) model, which achieved a median KGE of 0.466. Using the global grid-scale LSTM model, we develop an improved global reach-level daily discharge dataset spanning 1980 to 2020, named GRADES-hydroDL. This dataset is anticipated to be useful for many applications, including providing prior information for the Surface Water and Ocean Topography (SWOT) satellite mission. The dataset is openly available via Globus.
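For readers unfamiliar with the evaluation metric, the following is a minimal sketch of the Kling-Gupta Efficiency in its commonly cited original form, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means between simulated and observed discharge. The abstract does not state which KGE variant the authors used, so this form is an assumption, and the function name `kge` and the sample values are purely illustrative.

```python
import math


def kge(sim, obs):
    """Kling-Gupta Efficiency, original formulation (assumed variant).

    sim, obs: equal-length sequences of simulated and observed discharge.
    Pairs in which either value is NaN are skipped.
    """
    pairs = [(s, o) for s, o in zip(sim, obs)
             if not (math.isnan(s) or math.isnan(o))]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid sim/obs pairs")

    mean_s = sum(s for s, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs) / n
    var_o = sum((o - mean_o) ** 2 for _, o in pairs) / n
    if var_s == 0 or var_o == 0 or mean_o == 0:
        raise ValueError("KGE is undefined for constant series or zero-mean observations")

    cov = sum((s - mean_s) * (o - mean_o) for s, o in pairs) / n
    r = cov / math.sqrt(var_s * var_o)   # linear correlation
    alpha = math.sqrt(var_s / var_o)     # variability ratio
    beta = mean_s / mean_o               # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Illustrative values only, not data from the study.
observed = [1.0, 2.0, 4.0, 3.0]
perfect = kge(observed, observed)                    # perfect match gives a KGE of 1
doubled = kge([2 * q for q in observed], observed)   # r = 1, alpha = 2, beta = 2
```

A perfect simulation scores 1, and the score decreases as correlation, variability, or bias departs from the observations. A median KGE across gauges, as reported above, would be the median of this score computed gauge by gauge.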

Shujie Cheng

and 5 more

Understanding the partitioning of runoff into baseflow and quickflow is crucial for informed decision-making in water resources management, guiding the implementation of flood mitigation strategies, and supporting the development of drought resilience measures. Methods that combine the physically-based Budyko framework with machine learning (ML) have shown promise in estimating global runoff. However, such ‘hybrid’ approaches have not been used for baseflow estimation. Here, we develop a Budyko-constrained ML approach for baseflow estimation by incorporating the Budyko-based baseflow coefficient (BFC) curve as a physical constraint. We estimate the parameters of the original Budyko curve and the newly developed BFC curve from 13 climatic and physiographic characteristics using boosted regression trees (BRT). BRT models are trained and tested in 1226 catchments worldwide and subsequently applied to the entire global land surface at a 0.25° grid scale. The catchment-trained models exhibit strong performance during the testing phase, with R² values of 0.96 and 0.88 for runoff and baseflow, respectively. Results reveal that, on average, 30.3% (spatial standard deviation std=26.5%) of continental precipitation is partitioned into runoff, comprising baseflow (20.6% of precipitation, std=22.1%) and quickflow (9.7% of precipitation, std=10.3%). Among the 13 climatic and physiographic characteristics, topography and soil-related characteristics generally emerge as the most important drivers, although significant regional variability is observed. Comparisons with previous datasets suggest that global runoff partitioning is still highly uncertain and warrants further research.