Optimising short-term load forecasting performance is challenging because power load is nonlinear and random and the system operation mode varies. Existing methods are seldom combined in a way that exploits their complementary advantages, and they fail to capture enough internal information from the load sequence, which reduces accuracy. To achieve accurate and efficient short-term load forecasting, an integrated implementation framework is proposed based on a convolutional neural network (CNN), a gated recurrent unit (GRU) and a channel attention mechanism. The CNN and GRU are first combined to extract the complicated dynamic features and learn the temporal dependencies of the load sequence. Building on the CNN-GRU network, the channel attention mechanism is introduced to further reduce the loss of historical information and enhance the impact of important features. Then, the overall framework of short-term load forecasting based on the CNN-GRU-Attention network is presented, and the coupling relationship between the designed stages is revealed. Finally, the developed framework is implemented on a realistic load dataset of distribution networks, and the experimental results show that the proposed method outperforms state-of-the-art models on common metrics.
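To illustrate the channel attention step, the following is a minimal sketch in plain Python. It assumes a squeeze-and-excitation style gate (average each channel over time, pass the result through a small bottleneck, and rescale the channels by the resulting weights), which may differ from the exact mechanism used in the proposed framework. The function and variable names (`channel_attention`, `w_reduce`, `w_expand`) and the random, untrained weights are hypothetical and serve only to show the data flow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def channel_attention(features, w_reduce, w_expand):
    """Reweight the channels of `features` (shape: channels x time steps).

    Assumed squeeze-and-excitation form, not necessarily the paper's:
      1. squeeze: average each channel over time
      2. excite:  bottleneck (ReLU), then expansion (sigmoid) -> weights in (0, 1)
      3. scale:   multiply every channel by its weight
    """
    squeezed = [sum(ch) / len(ch) for ch in features]
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]
    weights = [sigmoid(z) for z in matvec(w_expand, hidden)]
    scaled = [[w * x for x in ch] for w, ch in zip(weights, features)]
    return scaled, weights


# Toy input: 4 feature channels over 6 time steps, with random untrained weights.
rng = random.Random(0)
features = [[rng.uniform(0.0, 1.0) for _ in range(6)] for _ in range(4)]
w_reduce = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(2)]  # 4 -> 2
w_expand = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(4)]  # 2 -> 4

scaled, weights = channel_attention(features, w_reduce, w_expand)
print([round(w, 3) for w in weights])  # one weight per channel
```

In a trained network the two weight matrices would be learned jointly with the CNN and GRU layers, so that channels carrying more informative load features receive weights closer to 1.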