Limited understanding of the underlying physical and chemical processes, together with large uncertainties in emissions, makes ozone prediction with numerical models difficult. Deep learning provides an alternative. However, most deep learning ozone prediction models consider only temporal dependencies and have limited capacity, while existing spatiotemporal deep learning models generally suffer from high model complexity and inadequate learning of spatial dependencies. We therefore propose a novel spatiotemporal model, the Spatiotemporal Attentive Gated Recurrent Unit (STAGRU), which employs a double attention mechanism and a Gated Recurrent Unit (GRU) to capture spatiotemporal information. We compare STAGRU with Seq2Seq and their single-attention versions on data from nine monitoring stations in Nanjing. The results show that STAGRU outperforms the competing models in terms of RMSE, R², and SMAPE. In addition, we discuss the interpretability of STAGRU. The discussion reveals that wind direction plays an important role in ozone transport and that the temporal dependencies are mainly short-term and periodic.
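For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of how RMSE, R², and SMAPE are commonly computed. It is not the evaluation code used for STAGRU. The function names and sample values are hypothetical, and SMAPE has several variants in the literature; the form below, with the mean of the absolute values in the denominator and the result expressed as a percentage, is an assumption.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined when all observations are identical (SS_tot == 0).
    """
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: |t - p| / ((|t| + |p|) / 2), averaged over all samples.
    A term is treated as 0 when both the observation and prediction are 0.
    """
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2
        terms.append(0.0 if denom == 0 else abs(t - p) / denom)
    return 100.0 * sum(terms) / len(terms)


# Hypothetical hourly ozone observations and predictions (arbitrary units).
observed = [2.0, 4.0]
predicted = [4.0, 4.0]
scores = (rmse(observed, predicted), r2(observed, predicted), smape(observed, predicted))
```

Lower RMSE and SMAPE indicate a better fit, while R² approaches 1 for a better fit and can be negative when a model performs worse than predicting the mean of the observations.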