This paper proposes a method for jointly optimizing the imaging algorithm and the sampling scheme in sparse spotlight synthetic aperture radar (SAR) imaging, based on deep convolutional neural networks. Traditional sparse SAR imaging based on compressed sensing (CS) has been widely studied, and deep learning and sparse unfolding networks have also been introduced into sparse SAR imaging. However, most current works focus only on the imaging stage and simply adopt a conventional uniform or random down-sampling scheme. Because imaging quality depends on the sampling pattern as well as on the imaging algorithm, this paper introduces a learning-based strategy that jointly optimizes the sampling scheme and the parameters of the reconstruction network. Within a deep learning-based image reconstruction framework, the sampling pattern and the convolutional neural network parameters are optimized jointly and continuously to improve image quality. Simulation results on a real SAR image dataset illustrate the effectiveness and superiority of the proposed framework.
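
One way to read the joint optimization described above is as a single training objective over both the network weights and a relaxed sampling mask. The formulation below is only an illustrative sketch, not the paper's stated objective. All symbols are assumptions introduced here: $\mathcal{A}$ denotes the fully sampled SAR echo acquisition, $\mathbf{m}(\mathbf{w})$ a continuous sampling mask parameterized by learnable logits $\mathbf{w}$ through a sigmoid $\sigma$, $f_{\theta}$ the convolutional reconstruction network, $\rho$ a target sampling ratio, and the squared-error loss a placeholder for whichever image-quality loss is actually used.

```latex
% Illustrative sketch only: the loss, the mask relaxation, and the budget constraint are assumptions.
\begin{equation}
  \min_{\theta,\,\mathbf{w}} \;
  \mathbb{E}_{\mathbf{x}}
  \left[
    \left\|
      f_{\theta}\!\left( \mathbf{m}(\mathbf{w}) \odot \mathcal{A}(\mathbf{x}) \right) - \mathbf{x}
    \right\|_2^2
  \right]
  \quad \text{s.t.} \quad
  \frac{1}{N} \sum_{i=1}^{N} m_i(\mathbf{w}) \le \rho,
  \qquad
  m_i(\mathbf{w}) = \sigma(w_i) \in (0, 1).
\end{equation}
```

Under this reading, relaxing the binary sampling pattern to values in $(0,1)$ is what lets the pattern be updated by gradient descent together with $\theta$. A binary pattern would then be recovered afterwards, for example by thresholding, which is again an assumption rather than a detail given in the text.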