A major challenge in predicting solar flares with machine learning methods is the scarcity of large flares. This shortage of positive samples is even more severe for data observed during the relatively weak Solar Cycle 24, such as the Space-Weather HMI Active Region Patches (SHARPs) data product. It partly hampers the successful application of deep learning methods, especially those that handle high-dimensional spatial and/or temporal data. By joining SHARPs with Space-Weather MDI Active Region Patches (SMARPs), a new data product derived from observations in Solar Cycle 23, we obtain a fused dataset with nearly triple the number of positive samples. We evaluate two deep learning methods, a long short-term memory (LSTM) network and a convolutional neural network (CNN), using selected parameter sequences and image snapshots from the fused dataset. Experimental results show that both models, when trained on the fused dataset, achieve test-set performance better than or equivalent to that of models trained on a single solar cycle. In addition, we demonstrate improved performance with a stacking ensemble that combines the LSTM and the CNN. We interpret the CNN using modern visual attribution methods from computer vision. The results show that the CNN is able to identify flare-related signatures in magnetograms.