Abstract
Solar flares are among the most intense solar phenomena, and forecasting
them precisely is of significant importance in space weather research.
The ability to predict solar flares is crucial for
protecting the near-Earth space environment and preserving the integrity
of our technological infrastructure from potential detrimental
consequences. Over the last two decades, machine learning and deep
learning models have made a significant impact on the prediction of
solar flares, leveraging their capacity to learn from high-dimensional
data spaces. However, the scarcity of high-quality data in the field
of solar flare prediction poses a daunting challenge for such tasks.
One way to address this scarcity is to
generate synthetic samples (i.e., data augmentation). In this study, we
explore the effect of data augmentation on time series-based flare
prediction models, specifically deep learning-based methods. We utilize the
latest time series-based benchmark dataset extracted from the vector
magnetograms of the Solar Dynamics Observatory’s Helioseismic and Magnetic
Imager (SDO/HMI). Specifically, we use seven time series data
augmentation techniques to enrich our dataset and train three machine
learning models for multivariate time series classification. To our
knowledge, this is the first study to explore the impact of data
augmentation on the solar flare prediction problem.