We classify all-sky images from four seasons, convert the classification results into time series that capture how the images evolve, and combine these with information on the onset of geomagnetic substorms. We train a lightweight classifier on this dataset to predict the onset of substorms within the 15-minute interval that follows 30 minutes of auroral information shown to the classifier. The best classifier achieves a balanced accuracy of 59%, with a recall of 39% and a false positive rate of 20%. We show that the classifier is limited by the strong imbalance in the dataset, approximately 50:1 between negative and positive events. All software and results are open source and freely available.
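
To make the relation between the quoted metrics explicit, the following minimal sketch recomputes balanced accuracy from the reported recall and false positive rate, and estimates the precision those rates would imply at a 50:1 class ratio. The function names are illustrative and not taken from the released software, and the precision figure assumes that the 50:1 ratio also holds for the evaluation data, which the text does not state.

```python
def balanced_accuracy(recall: float, false_positive_rate: float) -> float:
    """Mean of the true positive rate (recall) and the true negative rate."""
    specificity = 1.0 - false_positive_rate
    return 0.5 * (recall + specificity)


def implied_precision(recall: float,
                      false_positive_rate: float,
                      negatives_per_positive: float) -> float:
    """Precision implied by the given rates, per one positive event.

    Assumption: the evaluation data has `negatives_per_positive`
    negative events for every positive event.
    """
    true_positives = recall                      # expected hits per positive
    false_positives = false_positive_rate * negatives_per_positive
    return true_positives / (true_positives + false_positives)


# Values quoted in the text (already rounded there).
ba = balanced_accuracy(recall=0.39, false_positive_rate=0.20)
prec = implied_precision(recall=0.39, false_positive_rate=0.20,
                         negatives_per_positive=50.0)

print(f"balanced accuracy: {ba:.3f}")
print(f"implied precision at 50:1: {prec:.3f}")
```

With the rounded inputs, the balanced accuracy comes out at roughly 0.595, in line with the reported 59%. Under the stated assumption, the implied precision is only about 4%, which illustrates how a 20% false positive rate swamps the true detections when negative events outnumber positive ones by 50:1.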