Despite the proliferation of computer-based research on hydrology and water resources, such research is typically poorly reproducible. Published studies have low reproducibility because data and computer code are incompletely available and workflow processes are poorly documented. This leads to a lack of transparency and efficiency because existing code can neither be quality controlled nor re-used. Given the commonalities between existing process-based hydrological models in terms of their required input data and preprocessing steps, open sharing of code can lead to large efficiency gains for the modeling community. Here we present a model configuration workflow that provides full reproducibility of the resulting model instantiations in a way that separates the model-agnostic preprocessing of specific datasets from the model-specific requirements that models impose on their input files. We use this workflow to create large-domain (global, continental) and local configurations of the Structure for Unifying Multiple Modeling Alternatives (SUMMA) hydrologic model connected to the mizuRoute routing model. These examples show how a relatively complex model setup over a large domain can be organized in a reproducible and structured way that has the potential to accelerate advances in hydrologic modeling for the community as a whole. We provide a tentative blueprint of how community modeling initiatives can be built on top of workflows such as this. We term our workflow the “Community Workflows to Advance Reproducibility in Hydrologic Modeling” (CWARHM; pronounced “swarm”).
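The separation between model-agnostic preprocessing and model-specific input requirements can be illustrated with a minimal Python sketch. It is not taken from the CWARHM code base: the `Forcing` class, the function names, and the two output layouts ("model A" and "model B") are hypothetical and do not represent the actual SUMMA or mizuRoute input formats. The sketch only shows the idea that a dataset is preprocessed once into a shared intermediate product, which separate model-specific steps then translate into the files each model expects.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Forcing:
    """Hypothetical model-agnostic intermediate product: one value per catchment."""
    variable: str
    units: str
    values: Dict[str, float]  # catchment id -> value


def preprocess_precipitation(raw_mm_per_day: Dict[str, float]) -> Forcing:
    """Model-agnostic step: convert precipitation from mm/day to kg m-2 s-1.

    1 mm of water over 1 m2 has a mass of 1 kg, so only the time unit changes.
    """
    seconds_per_day = 86400.0
    converted = {cid: v / seconds_per_day for cid, v in raw_mm_per_day.items()}
    return Forcing("precipitation", "kg m-2 s-1", converted)


def to_model_a(forcing: Forcing) -> List[str]:
    """Model-specific step for a hypothetical model A: comma-separated rows."""
    return [f"{cid},{val:.6f}" for cid, val in sorted(forcing.values.items())]


def to_model_b(forcing: Forcing) -> List[str]:
    """Model-specific step for a hypothetical model B: header plus space-separated rows."""
    header = f"# {forcing.variable} [{forcing.units}]"
    rows = [f"{cid} {val:.6e}" for cid, val in sorted(forcing.values.items())]
    return [header] + rows


if __name__ == "__main__":
    # The dataset is preprocessed once; both model-specific writers reuse the result.
    shared = preprocess_precipitation({"c1": 86.4, "c2": 8.64})
    print("\n".join(to_model_a(shared)))
    print("\n".join(to_model_b(shared)))
```

Under this layout, supporting an additional model would only require a new model-specific writer, while the preprocessing of the source dataset stays unchanged and can be shared.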

Jeffery Horsburgh

and 3 more

Critical Zone (CZ) scientists study the system of coupled chemical, biological, physical, and geological processes operating together across all scales to support life at the Earth’s surface (Brantley et al., 2007). In 2020, the U.S. National Science Foundation funded a new network of Thematic Cluster projects that are working collaboratively to answer scientific questions related to effects of urbanization on CZ processes; CZ function in semi-arid landscapes and the role of dust in sustaining these ecosystems; processes in deep bedrock and their relationship to CZ evolution; recovery of the CZ from disturbances such as fire and flooding; and changes in the coastal CZ related to rising sea level. Given the diversity of data being collected by these projects, supporting data collection, access, and archival for the larger network presents significant challenges. Leveraging existing repositories and cyberinfrastructure provides many benefits, but still raises the questions of which repositories to use and how to enable discovery of and access to data that may be deposited across different repositories. This presentation describes new cyberinfrastructure development that leverages existing, domain-specific data repositories to enable managing, curating, disseminating, and preserving data from the new network of CZ Thematic Cluster projects. A distributed architecture is under development that links existing data facilities and services, including HydroShare, EarthChem, SESAR, and eventually other systems as needed, via a CZ Hub that provides tools for simplified data submission, discovery and access, and links to computational resources for data analysis and visualization in support of CZ synthesis efforts. Our goal is to make data, samples, and software collected by the Thematic Cluster projects Findable, Accessible, Interoperable, and Reusable (FAIR), using existing domain-specific repositories.
This collaboration among repositories to deliver integrated data services for an interdisciplinary science program may provide a template for future development of integrated, interdisciplinary data services.

Brantley, S.L., M.B. Goldhaber, and V. Ragnarsdottir (2007). Crossing disciplines and scales to understand the Critical Zone. Elements 3, 307-314, doi:10.2113/gselements.3.5.307.