\justifying Atmospheric radiative transfer calculations are among the most time-consuming components of numerical weather prediction (NWP) models, and deep learning (DL) models are increasingly applied to accelerate them. The output variables of these calculations, fluxes and heating rate profiles, are linked by a physical relationship, and integrating such physical laws into DL models is crucial for the consistency and credibility of DL-based parameterizations. We therefore propose a physics-incorporated framework for radiative transfer DL models in which the physical relationship between fluxes and heating rates is encoded as a layer of the network, so that energy conservation is satisfied. The physics-incorporated layer also improves prediction accuracy. In addition, we train and compare several DL model architectures: fully connected (FC) neural networks (NNs), convolutional NNs (CNNs), bidirectional recurrent NNs (RNNs), transformer-based NNs, and neural operator networks. Offline evaluation demonstrates that bidirectional RNNs, transformer-based NNs, and neural operator networks significantly outperform FC NNs and CNNs owing to their capability of global perception. A global view of the entire atmospheric column is essential for radiative transfer modeling because changes in the atmospheric components of one layer or level have both local and global impacts on radiation along the entire vertical column. Furthermore, bidirectional RNNs achieve the best performance, as they extract information in both the upward and downward directions, similar to radiative transfer processes in the atmosphere.
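
For orientation, the flux--heating-rate relationship referred to here is commonly written in pressure coordinates as sketched below. The symbols and the discretization are illustrative assumptions rather than the exact formulation of the proposed layer: $F^{\uparrow}$ and $F^{\downarrow}$ denote upward and downward fluxes, $p$ pressure, $g$ gravitational acceleration, $c_p$ the specific heat of air at constant pressure, and the level index $k$ is assumed to increase downward, so that layer $k$ lies between levels $k$ and $k+1$.
\[
  F^{\mathrm{net}} = F^{\uparrow} - F^{\downarrow},
  \qquad
  \frac{\partial T}{\partial t} = \frac{g}{c_p}\,\frac{\partial F^{\mathrm{net}}}{\partial p},
  \qquad
  \left(\frac{\partial T}{\partial t}\right)_{k} \approx \frac{g}{c_p}\,
  \frac{F^{\mathrm{net}}_{k+1} - F^{\mathrm{net}}_{k}}{p_{k+1} - p_{k}} .
\]
Under these assumptions, a network layer that derives heating rates from predicted fluxes in this way keeps the two outputs mutually consistent by construction.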