Reproducible Hydrological Modeling with CyberGIS-Jupyter for Water (CJW) and HydroShare

$Fangzheng$ $Lyu^{1}$, $Zhiyu$ $Li^{1}$, $Anand$ $Padmanabhan^{1}$, $Shaowen$ $Wang^{1}$, $Youngdon$ $Choi^{2}$, $Jonathan$ $Goodall^{2}$, $Andrew$ $Bennett^{3}$, $Bart$ $Nijssen^{3}$, $David$ $Tarboton^{4}$

$^{1}$ $University$ $of$ $Illinois$ $at$ $Urbana-Champaign$; $^{2}$ $University$ $of$ $Virginia$; $^{3}$ $University$ $of$ $Washington$; $^{4}$ $Utah$ $State$ $University$

CyberGIS-Jupyter for Water (CJW), which builds on the cyberGIS software ecosystem, is integrated with HydroShare. CJW provides a collaborative platform for computationally intensive and reproducible hydrologic research by delivering advanced cyberinfrastructure and cyberGIS capabilities based on high-performance computing (HPC) resources such as Virtual ROGER and XSEDE Comet. The Structure for Unifying Multiple Modeling Alternatives (SUMMA), a hydrological modeling framework, allows formal evaluation of multiple working hypotheses on model representations of physical processes. This CyberGIS-Jupyter notebook illustrates specific support for a SUMMA model on top of the hydrologic modeling capabilities of CJW. With CJW, users can tune the parameters of a SUMMA model and submit computationally intensive High-Throughput Computing (HTC) jobs that execute the model on HPC resources directly from Jupyter notebooks, without needing in-depth technical knowledge of cyberGIS or HydroShare. Computational experiments demonstrate that integrating cyberGIS capabilities with HydroShare yields a high-performance and easy-to-use environment for reproducible SUMMA-based hydrological modeling.

The Structure for Unifying Multiple Modeling Alternatives (SUMMA)

SUMMA, the Structure for Unifying Multiple Modeling Alternatives, is a hydrologic modeling approach built on a common set of conservation equations and a common numerical solver, which together constitute the structural core of the model. Different modeling approaches can then be implemented within the structural core, enabling a controlled and systematic analysis of alternative modeling options and providing insight for future model development. SUMMA has three main attributes:

  1. The formulation of the conservation equations is cleanly separated from their numerical solution;
  2. Different model representations of physical processes (in particular, different flux parameterizations) can be used within a common set of conservation equations; and
  3. The physical processes can be organized in different spatial configurations, including model elements of different shape and connectivity (e.g., nested multi-scale grids and HRUs).
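
The first two attributes can be illustrated with a toy example. The sketch below is not SUMMA code: it is a single-bucket water balance, and every name in it (`linear_outflow`, `power_outflow`, `storage_tendency`, `explicit_euler`) as well as the parameter values are assumptions made for illustration. An explicit Euler step stands in for the numerical solver. The point is only the separation of concerns: the conservation equation is written once, the solver is written once, and the flux parameterization is swapped in as an argument.

```python
from typing import Callable, Dict, List

# A flux parameterization maps storage (mm) to outflow (mm/day).
FluxFn = Callable[[float], float]


def linear_outflow(storage: float, k: float = 0.1) -> float:
    """Hypothetical option 1: outflow proportional to storage."""
    return k * storage


def power_outflow(storage: float, k: float = 0.01, n: float = 1.5) -> float:
    """Hypothetical option 2: outflow grows as a power of storage."""
    return k * max(storage, 0.0) ** n


def storage_tendency(storage: float, precip: float, outflow: FluxFn) -> float:
    """Conservation equation dS/dt = P - Q(S), independent of any solver."""
    return precip - outflow(storage)


def explicit_euler(
    storage0: float, precip_series: List[float], outflow: FluxFn, dt: float = 1.0
) -> List[float]:
    """One solver shared by every flux option (a stand-in, chosen for brevity)."""
    storages = [storage0]
    for precip in precip_series:
        current = storages[-1]
        storages.append(current + dt * storage_tendency(current, precip, outflow))
    return storages


# Same forcing, same equation, same solver; only the flux option changes.
precip = [5.0, 0.0, 0.0, 10.0, 0.0]
options: Dict[str, FluxFn] = {"linear": linear_outflow, "power": power_outflow}
results = {name: explicit_euler(50.0, precip, fn) for name, fn in options.items()}

for name, series in results.items():
    print(name, [round(s, 2) for s in series])
```

Because the two runs differ only in the flux function passed to the solver, any difference between the resulting storage series can be attributed to that modeling choice alone, which is the kind of controlled comparison described above.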

Architecture of the Job Submission System

The architecture of the job submission system is illustrated as follows. The integrated system enables interactions among three key entities: users, the CyberGISX frontend, and HPC resources provided through the cyberGIS platform (e.g., Keeling). In addition, the key entities interact with six supporting components: 1) the CyberGISX website, which acts as a portal for users to log in to the server; 2) the authentication system for the CyberGISX platform; 3) HydroShare for hydrological data retrieval; 4) a shared folder that stores existing resources to avoid excessive data transfer; 5) JupyterHub with appropriate cyberGIS and geospatial Python libraries installed; and 6) Docker Hub for the SUMMA Singularity image. HydroShare is a collaborative research platform for advancing hydrological data and model sharing.
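
The role of the shared folder (component 4) can be sketched as a simple check-before-download step. The code below is a minimal illustration under stated assumptions, not the CJW implementation: `get_resource`, `fake_download`, and the resource identifier are hypothetical, and the download function is a local stub rather than a call to HydroShare.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def get_resource(
    resource_id: str, shared_dir: Path, download: Callable[[str, Path], None]
) -> Path:
    """Return the local copy of a resource, downloading it only if it is absent."""
    target = shared_dir / resource_id
    if not target.exists():
        # Only the first request for a resource triggers a data transfer.
        download(resource_id, target)
    return target


download_log: List[str] = []


def fake_download(resource_id: str, target: Path) -> None:
    """Stub standing in for a real data retrieval (assumption for illustration)."""
    download_log.append(resource_id)
    target.mkdir(parents=True)
    (target / "data.txt").write_text("placeholder")


with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp)
    first = get_resource("example-resource", shared, fake_download)
    second = get_resource("example-resource", shared, fake_download)
    same_path = first == second

# The second request is served from the shared folder, so one download is logged.
print(download_log)
```

Passing the download function as an argument keeps the caching logic independent of how the data is actually retrieved, so the same check could wrap whatever retrieval mechanism the platform uses.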