Camera traps serve as in-situ sensors for collecting data used to estimate animal abundance and occupancy. Deployed over a large landscape, they are well suited to measuring ecosystem health, particularly in unstable habitats where observation by conventional methods can be dangerous or even impossible. However, manual processing of imagery is extremely time- and labor-intensive. Because of the associated expense, many studies have started to employ machine learning tools such as convolutional neural networks (CNNs). One drawback is that most networks need a large number of images (millions) to produce an effective identification or classification model. This study examines specific factors related to camera trap placement in the field that may influence the accuracy metrics of a deep learning model trained on a small set of images. False negatives and false positives may arise from a variety of environmental factors, including local weather patterns and daylight, that make images difficult for even a human observer to classify. We transfer-trained a CNN to detect 16 object classes (14 animal species, humans, and fires) across 9,576 images taken from camera traps placed in the Chernobyl Exclusion Zone. We analyzed wind speed, cloud cover, temperature, and image contrast and found a significant positive association between CNN success and temperature. Furthermore, the model was more successful on images taken during the day and on images taken when there was no precipitation. These results show that external variables at camera trap locations have a noticeable effect on CNN accuracy, and that qualitative site-specific factors can confuse quantitative classification algorithms such as CNNs. Given the unique challenges posed by the analysis of camera trap imagery, this study suggests that further exploration into the causes of error in classification modeling is necessary.