Accurate load balancing accelerates Lagrangian simulation of water ages
on distributed, multi-GPU platforms
Abstract
Water age is a fundamental descriptor of source, storage, and mixing of
water parcels in a watershed. The Lagrangian particle-tracking
approach is a powerful tool for physically based modeling of water age
distributions, but its application has been hampered by its high
computational cost. In this study, we present a parallel approach
for particle tracking simulations that combines multiple GPUs with MPI
parallelism based on domain decomposition. An inherent challenge of
distributed parallelization of Lagrangian approaches is the disparity in
computational work, or load imbalance (LIB), among processing
elements (PEs). Here, we propose load balancing (LB) schemes to
dynamically balance the distribution of particles across PEs during
runtime. In the subsequent hillslope simulations, LIB was observed in all
LB-disabled runs, e.g., a load ratio of 423.62% with 2 GPUs in
the LW_Shrub case. The LB schemes then accurately balanced the load
distribution and improved the parallel scaling. Additionally, the
parallel approach showed excellent overall speedup: a 60-fold
improvement with 4 GPUs relative to the serial run. A regional-scale
application further demonstrated the LB performance. Activating LB
reduced the parallel time on 8 GPUs by 31.33%. Going from 8 to 16 GPUs,
both with LB, cut the parallel time by a further ~50%, indicating
good parallel scalability. This
work shows how massively parallel computing can be applied to particle
tracking in water age simulations. It also demonstrates the practical
importance of load balancing in this context, which enables
large-scale simulations with more complex flow paths.
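
The idea of measuring and correcting load imbalance can be illustrated with a minimal, serial Python sketch. It is not the authors' MPI/GPU implementation. It assumes a 1-D slab decomposition, assumes that the load ratio is the maximum-to-minimum per-PE particle count expressed in percent (the abstract does not define it), and uses hypothetical helper names (`load_ratio`, `count_per_pe`, `balanced_cuts`) chosen only for this example.

```python
from bisect import bisect_right


def load_ratio(counts):
    """Max-to-min particle count across PEs, in percent.

    ASSUMPTION: this definition is illustrative; the abstract does not
    state how the load ratio is computed.
    """
    if min(counts) == 0:
        return float("inf")
    return 100.0 * max(counts) / min(counts)


def count_per_pe(xs, cuts):
    """Count particles per slab; `cuts` are the sorted interior boundaries."""
    counts = [0] * (len(cuts) + 1)
    for x in xs:
        # A particle sitting exactly on a boundary goes to the upper slab.
        counts[bisect_right(cuts, x)] += 1
    return counts


def balanced_cuts(xs, n_pe):
    """Place interior boundaries at quantiles of the particle positions,
    so that each slab receives roughly the same number of particles."""
    s = sorted(xs)
    n = len(s)
    return [s[(k * n) // n_pe] for k in range(1, n_pe)]


if __name__ == "__main__":
    # Hypothetical particle positions: 80 clustered near the lower end of
    # the domain and 20 near the upper end.
    xs = list(range(80)) + list(range(800, 820))

    static_counts = count_per_pe(xs, [500])      # equal-width slabs, 2 PEs
    cuts = balanced_cuts(xs, 2)                  # quantile-based slabs
    dynamic_counts = count_per_pe(xs, cuts)

    print("static :", static_counts, load_ratio(static_counts))
    print("dynamic:", dynamic_counts, load_ratio(dynamic_counts))
```

In a distributed setting, the per-PE counts would be gathered with a collective operation and the boundaries recomputed periodically during the run; the sketch only shows the bookkeeping behind that step.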