Plain Language Summary
Many different emission pathways exist that are compatible with the
Paris climate agreement, and many more are possible that miss its
target. While some of the most complex Earth System Models have
simulated a small selection of possible futures, it is impractical to
use these expensive models to fully explore the space of possibilities.
Such explorations therefore mostly rely on simple approximations of the
global mean temperature response to a given scenario. Here we present
ClimateBench, a benchmarking framework based on a suite of
state-of-the-art simulations performed by a full-complexity Earth System
Model, and a set of baseline machine learning models that emulate its
response to a variety of forcers. These emulators can predict annual
mean global distributions of temperature, diurnal temperature range and
precipitation (including extreme precipitation) given a wide range of
emissions and concentrations of carbon dioxide, methane and aerosols,
allowing them to efficiently probe previously unexplored scenarios. We
also describe a set of evaluation metrics which we hope will entice
statisticians and machine learning experts to tackle this important and
demanding challenge.