Exploring the potential of neural networks to predict statistics of
solar wind turbulence
Abstract
Time series datasets often have missing or corrupted entries, which need
to be ignored in subsequent data analysis. For example, in the context
of space physics, calibration issues, satellite telemetry issues, and
unexpected events can make parts of a time series unusable. Various
approaches exist to tackle this problem, including mean/median
imputation, linear interpolation, and autoregressive modeling. Here we
study the utility of artificial neural networks (ANNs) to predict
statistics, particularly second-order structure functions, of turbulent
solar wind time series. Using a dataset with artificial
gaps, a neural network is trained to predict second-order structure
functions and then tested on an unseen dataset to quantify its
performance. A small feedforward ANN, with only 20 hidden neurons, can
predict the large-scale fluctuation amplitudes better than mean
imputation or linear interpolation when the percentage of missing data
is high. Although the ANN performs worse than the other methods at
capturing the shape and fluctuation amplitude together,
its performance is better in a statistical sense for large fractions
of missing data. Caveats regarding the utility of ANNs, the optimisation
procedure, and potential future improvements are discussed.
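
To make the baseline comparison concrete, the following is a minimal sketch of how a second-order structure function, taken here in its standard form S2(lag) = <(x(t + lag) - x(t))^2>, can be estimated from a gapped series after mean imputation or linear interpolation. It is an illustration under stated assumptions, not the procedure used in this study: the function names are hypothetical, the random-walk signal is a synthetic stand-in for solar wind data, the gap locations are arbitrary, and the ANN itself is not shown.

```python
import numpy as np


def structure_function_2(x, max_lag):
    """Estimate S2(lag) = mean((x[t+lag] - x[t])**2) for lag = 1..max_lag.

    Pairs in which either point is NaN are skipped via nanmean.
    """
    lags = np.arange(1, max_lag + 1)
    s2 = np.empty(len(lags))
    for i, lag in enumerate(lags):
        increments = x[lag:] - x[:-lag]
        s2[i] = np.nanmean(increments ** 2)
    return lags, s2


def mean_impute(x):
    """Replace NaN entries with the mean of the observed values."""
    filled = x.copy()
    filled[np.isnan(filled)] = np.nanmean(x)
    return filled


def linear_interpolate(x):
    """Fill NaN entries by linear interpolation between observed neighbours."""
    filled = x.copy()
    t = np.arange(len(x))
    missing = np.isnan(x)
    filled[missing] = np.interp(t[missing], t[~missing], x[~missing])
    return filled


# Synthetic stand-in signal (assumption): a random walk, not real solar wind data.
rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(2000))

# Artificial gaps (assumption): two arbitrary contiguous blocks, 30% of the series.
gapped = signal.copy()
gapped[500:900] = np.nan
gapped[1300:1500] = np.nan

max_lag = 100
lags, s2_true = structure_function_2(signal, max_lag)
_, s2_mean = structure_function_2(mean_impute(gapped), max_lag)
_, s2_lint = structure_function_2(linear_interpolate(gapped), max_lag)

# Mean absolute error of each baseline relative to the gap-free estimate.
print("mean imputation error:     ", np.mean(np.abs(s2_mean - s2_true)))
print("linear interpolation error:", np.mean(np.abs(s2_lint - s2_true)))
```

Comparing each filled estimate against the gap-free one, as in the last two lines, is one simple way to quantify how much a given gap-handling method distorts the structure function; other error measures could be substituted.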