Balancing EC-Earth3: Improving the performance of EC-Earth CMIP6
configurations by minimizing the coupling cost
Abstract
Earth System Models (ESMs) are complex systems used in weather and
climate studies, generally built from independent components, each
responsible for simulating a specific realm (ocean, atmosphere,
biosphere, etc.). To replicate the interactions between these processes,
ESMs typically use coupling libraries that manage the synchronization
and field exchanges between the individual components, which run in
parallel as a Multiple-Program, Multiple-Data (MPMD) application.
As ESMs grow more complex (higher resolution, more components,
more configurations, etc.), achieving the best performance on
HPC platforms has become increasingly challenging and a major concern.
One of the critical bottlenecks is load imbalance, where the fastest
components have to wait for the slower ones. Finding the optimal
number of processing elements (PEs) to assign to each component, so as
to minimize the performance loss due to synchronization and maximize
the overall parallel efficiency, is impossible without the right
performance metrics, methodology and tools.
This paper presents the results of balancing multiple Coupled Model
Intercomparison Project phase 6 (CMIP6) configurations for the EC-Earth3
ESM. We will show that intuitive approaches can lead to suboptimal
resource allocations and propose new setups up to 25% faster while
reducing the computational cost by 72%.
We show that new methods are needed to address load balancing in
ESMs and hope that our study will serve as a guide for optimizing other
coupled systems.