Abstract
The conduct of reproducible science improves when computations are
portable and verifiable. A container provides an isolated environment
for running computations and is thus useful for porting applications to
new machines. Current container engines, such as Linux Containers (LXC)
and Docker, however, have a steep learning curve, are resource-intensive,
and do not address the entire reproducibility spectrum consisting of
portability, repeatability, and replicability. As part of EarthCube, we
have developed Sciunit (https://sciunit.run), which encapsulates
application dependencies, i.e., system binaries, code, data, and environment,
along with application provenance. The resulting research object can be
easily shared and reused amongst collaborators. Sciunit can be used with
HydroShare’s JupyterHub CUAHSI notebook environment and is available to
the entire community. In this poster, we will present three new
features in Sciunit which have emerged based on community-provided use
cases and discussion. Sciunit is available as a command-line utility. We
will (1) showcase the new Sciunit API, which will allow data facilities
to integrate Sciunit as a reproducible environment on portals; (2) show
how a Sciunit container can transition to a Docker container and vice
versa; and (3) demonstrate the ability to contrast two
containers in terms of content and metadata. We will show these
capabilities with the hydrology use case of pySUMMA, a Python API for
the Structure for Unifying Multiple Modeling Alternatives (SUMMA)
hydrologic model.