The radiation parameterization is one of the most computationally expensive components of Earth system models (ESMs). To reduce computational cost, radiation is often calculated at coarser spatial and/or temporal resolution than the other physical processes in ESMs, leading to uncertainties in cloud-radiation interactions and thereby in radiative temperature tendencies. One way around this issue is to emulate the radiation parameterization with machine learning, which is usually faster and can achieve good accuracy in a high-dimensional parameter space. This study investigates the development and interpretation of a machine-learning-based radiation emulator using the ICOsahedral Nonhydrostatic (ICON) model with the RTE-RRTMGP radiation code, which calculates radiative fluxes based on the atmospheric state and its optical properties. With a Bidirectional Long Short-Term Memory (Bi-LSTM) architecture, which can account for bidirectional auto-correlation along the vertical column, we accurately emulate shortwave and longwave heating rates with mean absolute errors of $0.049~\mathrm{K/d}$ ($2.50\,\%$) and $0.069~\mathrm{K/d}$ ($5.14\,\%$), respectively. Furthermore, we analyse the trained neural networks using SHapley Additive exPlanations (SHAP) and confirm that the networks have learned physically meaningful relationships between inputs and outputs. Notably, we observe that local temperature is used as a predictor for longwave heating, consistent with physical models of radiation. For shortwave heating, we find that clouds reflect radiation, leading to reduced heating below the cloud.
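To illustrate the architecture described above, the following is a minimal sketch (not the authors' implementation) of a Bi-LSTM emulator that treats the vertical column as the sequence dimension and maps per-level atmospheric inputs to per-level heating rates. All names and dimensions (`n_features`, the hidden size, 60 vertical levels) are illustrative assumptions, not values from the study.

```python
# Minimal Bi-LSTM emulator sketch; dimensions and names are hypothetical.
import torch
import torch.nn as nn

class BiLSTMEmulator(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        # bidirectional=True propagates hidden state both upward and downward
        # through the column, mirroring the two-way vertical dependence of
        # radiative transfer mentioned in the abstract.
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # one heating rate per level

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_levels, n_features) -> (batch, n_levels, 1)
        out, _ = self.lstm(x)
        return self.head(out)

# Toy usage with random tensors standing in for atmospheric columns.
model = BiLSTMEmulator(n_features=8)
columns = torch.randn(32, 60, 8)    # 32 columns, 60 levels, 8 features
heating_rates = model(columns)      # shape: (32, 60, 1)
# L1 loss corresponds to the mean-absolute-error metric quoted above.
mae = nn.functional.l1_loss(heating_rates, torch.zeros_like(heating_rates))
```

The per-level linear head on top of the concatenated forward and backward hidden states is one common way to produce a vertically resolved output; the study's actual layer sizes and input variables may differ.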