Machine-Learning Research in the Space Weather Journal:
Prospects, Scope and Limitations
Noé Lugaz, Huixin Liu, Mike Hapgood, Steven Morley
Abstract: Manuscripts based on machine-learning
techniques have significantly increased in Space Weather over the past
few years. We discuss which manuscripts are within the journal’s scope
and emphasize that manuscripts focusing purely on a forecasting
technique (rather than on understanding and forecasting a phenomenon)
must correspond to a substantial improvement over the current
state-of-the-art techniques and present this comparison. All manuscripts
shall include information about data preparation, including splitting of
data between training, validation and testing sets. The software and/or
algorithms used to develop the machine-learning technique should be
included in a repository at the time of submission. Comparison with
published results using other methods must be presented, and
uncertainties of the forecast results must be discussed.
While machine-learning techniques in space weather have been around
since the first years of the field (e.g., see O’Brien and McPherron,
2003; Anderson et al., 2004), there has been a clear growth in published
articles using machine-learning techniques in the past four years. Since
2018, there have been at least 10 articles published in Space
Weather per year using machine-learning techniques, whereas no year
before 2018 had more than three such articles. As a result, about 15%
of articles published in Space Weather in 2021 have the term “machine
learning”, “deep learning” or “neural network” in their abstract
(18 so far). Overall, this growth covers all aspects of space weather,
about equally distributed among work focusing on forecasting the
radiation belts, geomagnetic indices (Dst, Kp), the ionospheric total
electron content (TEC), and solar flares. Journal of Geophysical
Research: Space Physics has witnessed a similar growth, with about as
many articles published using machine-learning (ML) techniques, but
because that journal publishes about six times more articles than Space
Weather, these remain a very small fraction of its total number of
published articles.
In light of this growth, the Space Weather editorial team has
been discussing the place that ML-based forecasts of space weather
phenomena should have within space weather research. Note that
Camporeale (2019), in a Grand Challenges review, discussed the prospects
and technical challenges of applying machine learning to space weather
research. Readers interested in specific applications, or in
recommendations such as moving towards probabilistic forecasts and
assessing uncertainties, are referred to that article. Hereafter, we
focus on the criteria for articles based on or relying on ML techniques
to be published in Space Weather. At the core is the current scope
of the journal (“to understand and forecast space weather”) as well as
the readership composed of a mix of researchers, forecasters, end-users,
and policy makers.
Pure ML manuscripts, i.e., manuscripts presenting a new ML technique,
are not within scope, as there are specialized journals more appropriate
for such articles. Manuscripts presenting an ML technique to better
understand the drivers of space weather are in scope but
must, like any other manuscript, bring significant new insight into specific
space weather phenomena to be published. This assessment is typically
done by reviewers but is sometimes undertaken directly by the editor.
AGU’s Earth and Space Science is a journal where manuscripts
presenting a new technique that confirms existing knowledge are within
scope. Manuscripts presenting a new forecasting scheme
based on an existing ML technique are only in scope if they demonstrate
a substantial improvement over existing state-of-the-art forecasting
models. Comparison of the results of the new model to those from
state-of-the-art models needs to be presented within the manuscript. The
comparison cannot be limited to the simplest models (climatology,
persistence, etc.) unless no more appropriate models exist. AGU's open
data/software policy means that models published in the past few years
are accessible for this comparison.
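One common quantitative way to express such a comparison is a skill
score of the new model against a baseline forecast such as persistence
or climatology. The following minimal sketch (in Python; the arrays and
values are hypothetical and purely illustrative, and the MSE-based skill
score is one standard choice) shows the idea:

```python
import numpy as np

def skill_score(obs, model_pred, reference_pred):
    """MSE-based skill score of a model against a reference forecast
    (e.g., persistence or climatology): 1 is perfect, 0 is no better
    than the reference, and negative values are worse than the reference."""
    mse_model = np.mean((model_pred - obs) ** 2)
    mse_ref = np.mean((reference_pred - obs) ** 2)
    return 1.0 - mse_model / mse_ref

# Hypothetical hourly Dst values; persistence forecasts the last
# observed value for the next hour
obs = np.array([-30.0, -45.0, -80.0, -120.0, -90.0, -60.0])
persistence = obs[:-1]                                    # forecasts for hours 1..5
model = np.array([-40.0, -70.0, -110.0, -100.0, -70.0])  # hypothetical ML output
print(skill_score(obs[1:], model, persistence))
```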
The lack of sufficient comparison with published results using other
methods has been one major obstacle to judging the improvement offered
by a new ML method. To facilitate the proper
use and further development of ML in the field of space weather,
straightforward inter-comparison between different studies must be made
possible. Towards this end, and in line with the AGU software policy, the
model/code used in any new study submitted to Space Weather should
be included in a repository at the time of submission. If the model
relies on standard machine-learning code available in existing
libraries, the set-up, version numbers, and input
parameters/coefficients need to be provided in the supplementary
information. The parameter space for which the model can be used and was
validated should be given in the manuscript. The data used to train,
validate, and test the model should be presented. Issues with splitting
the data, and the cyclical dependence (diurnal, seasonal, solar cycle)
of many space weather phenomena, should be clearly described in the main
text of the manuscript. This also includes the treatment of data gaps,
extreme events, and outliers, all of which can directly affect the ML
model outcome.
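As a concrete illustration of one of these points, the sketch below (in
Python; the hourly Dst-like series and the 60/20/20 split fractions are
hypothetical) shows a chronological train/validation/test split that
keeps the test interval strictly in the future of the training interval,
together with an explicit report of data gaps:

```python
import numpy as np
import pandas as pd

def chronological_split(series, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered series into train/validation/test sets.

    A chronological split (rather than a random shuffle) keeps the test
    set strictly in the future of the training set, so diurnal,
    seasonal, and solar-cycle structure cannot leak between the sets.
    """
    series = series.sort_index()
    n = len(series)
    i_train = int(n * train_frac)
    i_val = int(n * (train_frac + val_frac))
    return series.iloc[:i_train], series.iloc[i_train:i_val], series.iloc[i_val:]

# Hypothetical hourly Dst-like series spanning roughly one solar cycle
index = pd.date_range("2009-01-01", "2020-01-01", freq="h")
rng = np.random.default_rng(0)
dst = pd.Series(rng.normal(-20.0, 15.0, len(index)), index=index)
dst.iloc[rng.choice(len(dst), 500, replace=False)] = np.nan  # synthetic data gaps

# Report gaps explicitly rather than silently interpolating over them
print(f"data gaps: {int(dst.isna().sum())} hourly values missing")
train, val, test = chronological_split(dst)
print(train.index.max(), "<", test.index.min())  # no temporal overlap
```

How the resulting test interval samples the solar cycle (for example, a
test set drawn only from solar minimum) should be stated explicitly,
since it conditions the validity range of the model.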
Furthermore, the uncertainty of the prediction should be discussed.
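One simple, model-agnostic way to quantify such uncertainty, sketched
here under the assumption that an ensemble of predictions is available
(for example, from models trained on bootstrap resamples or with
different random seeds; the array below is purely synthetic), is to
report percentile intervals of the ensemble spread:

```python
import numpy as np

# Hypothetical ensemble: 20 model variants (rows), each predicting the
# same 100 forecast times (columns)
rng = np.random.default_rng(1)
ensemble_preds = rng.normal(-55.0, 8.0, size=(20, 100))

# Report the ensemble median with a 90% percentile interval, rather
# than a single deterministic value
median = np.median(ensemble_preds, axis=0)
lower, upper = np.percentile(ensemble_preds, [5, 95], axis=0)
```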
Finally, the improvement over existing ML methods and new physical
insights should be discussed.
The presentation of metrics to quantify the performance of the technique
should go beyond correlation, root-mean-square error (RMSE), and mean
absolute error (MAE). Specific examples of metrics
and community best practices were provided in the Space Weather
Capabilities Assessment topical issue of Space Weather and should
be followed if possible. This includes the forecast of geomagnetic
indices (Liemohn et al., 2018), thermospheric neutral densities
(Bruinsma et al., 2018), radiation and plasma environment (Zheng et al.,
2019), and arrival time of coronal mass ejections (Verbeke et al.,
2019), among others. When appropriate, specific thresholds should be
defined in order to develop a binary classification. Deterministic
forecasting metrics, including skill scores, should then be reported.
Studies using probabilistic forecasts (Camporeale, 2019) are also highly
encouraged. For both approaches (deterministic and probabilistic)
authors must cite references that describe their metrics and, if using
metrics developed by the ML community, show how those are related to
metrics used by the wider forecasting and research community. We note
that these two communities often use different names for the same
metrics. For example, the metrics of precision and recall used in the ML
community are identical to the metrics of success ratio and probability
of detection used by the forecasting community (see section 6 of Morley et al.,
2020 for more examples). We consider it important to develop a joint
understanding of these two sets of metrics as ML becomes more widely
used for space weather purposes.
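To make this correspondence concrete, the sketch below (in Python; the
arrays and the -100 nT storm threshold are illustrative assumptions)
computes both sets of names from the same 2-by-2 contingency table,
together with the Heidke skill score as one example of a skill score:

```python
import numpy as np

def contingency_metrics(observed, predicted, threshold):
    """Binary-event verification metrics from a 2x2 contingency table.

    The ML community's 'precision' and 'recall' are numerically
    identical to the forecasting community's 'success ratio' and
    'probability of detection' (POD).
    """
    obs_event = observed <= threshold    # e.g., storm if Dst <= threshold
    pred_event = predicted <= threshold
    hits = np.sum(pred_event & obs_event)
    false_alarms = np.sum(pred_event & ~obs_event)
    misses = np.sum(~pred_event & obs_event)
    correct_negatives = np.sum(~pred_event & ~obs_event)
    n = hits + false_alarms + misses + correct_negatives

    success_ratio = hits / (hits + false_alarms)   # = precision
    pod = hits / (hits + misses)                   # = recall
    # Heidke skill score: fraction of correct forecasts beyond chance
    expected = ((hits + misses) * (hits + false_alarms)
                + (correct_negatives + false_alarms)
                * (correct_negatives + misses)) / n
    hss = (hits + correct_negatives - expected) / (n - expected)
    return {"success_ratio (precision)": success_ratio,
            "POD (recall)": pod,
            "Heidke skill score": hss}

# Illustrative use with a hypothetical -100 nT storm threshold
rng = np.random.default_rng(2)
observed = rng.normal(-40.0, 45.0, 1000)
predicted = observed + rng.normal(0.0, 20.0, 1000)  # noisy stand-in forecast
print(contingency_metrics(observed, predicted, threshold=-100.0))
```

Reporting both names side by side in this way, and citing the reference
that defines each metric, helps the two communities read the same
verification results without ambiguity.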