Draft 29 Aug 2022
Determining the timing of driver influences on 1.8-3.5 MeV electron flux at geosynchronous orbit using ARMAX methodology
L. E. Simms1,2, M. J. Engebretson1, G. D. Reeves3
1Department of Physics, Augsburg University, Minneapolis, MN, USA
2Climate and Space Sciences and Engineering, University of Michigan, Ann Arbor, MI, USA
3Space Science and Applications Group, Los Alamos National Laboratory, Los Alamos, NM, USA
Corresponding author: L. E. Simms (simmsl@augsburg.edu)
Keypoints
1. ARMAX models show drivers of relativistic electron flux are influential only within a few hours of flux changes
2. Contrary to simple correlation findings, influences are lower in magnitude and act more immediately
3. Stepwise multiple regression shows less cumulative effects of drivers in after-storm periods than simple correlation would suggest
Abstract
Although lagged correlations have suggested influences of solar wind velocity (V) and number density (N), IMF Bz, ULF wave power, and substorms (as measured by AE) on MeV electron flux at geosynchronous orbit over an impressive number of hours and days, a satellite’s diurnal cycle can inflate correlations, associations between drivers may produce spurious effects, and correlations between all previous time steps may create an appearance of additive influence over many hours. Autoregressive-moving average transfer function (ARMAX) multiple regressions incorporating previous hours simultaneously can eliminate cycles and assess the impact of parameters, at each hour, while others are controlled. ARMAX influences are an order of magnitude lower than correlations. Most influence occurs within a few hours, not the many hours suggested by correlation. Over all hours, V and N show an initial negative impact, with longer term positive influences over the 9 (V) or 27 (N) h. Bz is initially a positive influence, longer term (6 h) negative effect. ULF waves impact flux in the first (positive) and second (negative) hour before the flux measurement, with further negative influences in the 12- 24 h before. AE (representing electron injection by substorms) shows only a short term (1 h) positive influence. However, when only recovery and after-recovery storm periods are considered (using stepwise regression), there are positive influences of ULF waves and V, negative influences of N and Bz, while AE shows no influence.
Plain Language Summary
The influence of solar wind, waves, and substorms on high energy electrons at geosynchronous orbit can appear to occur over a number of hours and days. However, these long duration correlations may be due to diurnal cycles in satellite data, associations between the driving parameters, or correlations of each variable with itself over previous time steps. These extraneous correlations can be corrected for using autoregressive-moving average multiple regression models including previous hours simultaneously. Once these are controlled, the correlations between possible driving parameters and high energy electrons are both lower and influential only over a few hours.
Introduction
The response of high energy (MeV) electron flux at geosynchronous orbit to various solar wind, IMF, and magnetospheric parameters has been well studied at a daily cadence (e.g., Balikhin et al., 2011; Mathie & Mann, 2000; Reeves et al., 2011; Potapov, 2017; Sakaguchi et al., 2015; Simms et al., 2016; 2018; Wing et al., 2016), but injection or acceleration of electrons may occur within 24 h (e.g., Reeves et al., 1998). Daily averages may obscure this activity, and the associations of hourly flux with possible drivers is not as well documented.
The timing and direction of correlations between drivers and flux has been used to infer the physical processes that result in flux changes. For example, ULF wave-driven inward radial diffusion, resulting in electron acceleration, is thought to require a number of days of previous high wave activity (O’Brien et al., 2001; Friedel et al., 2002, Osmane et al., 2022), although the diffusion itself may happen fairly rapidly (Jaynes et al., 2018). ULF waves may also result in outward radial diffusion, leading to electron loss, which would be reflected in a negative correlation, possibly at a different time step (Elkington et al., 2003). High solar wind velocity is thought to drive these ULF waves perhaps via the Kelvin Helmholtz effect (Rostoker et al., 1998) or via its contribution to solar wind pressure variations, with the latter thought to be more likely due to the timing of the maximum correlation (Takahashi and Ukhorskiy, 2007). However, both solar wind velocity and density contribute to pressure and its variations, therefore the correlation of number density with electron flux is also important to consider (Balikhin et al., 2011; Borovsky & Denton, 2014; Boynton et al., 2013; Lyatsky & Khazanov, 2008). Pressure, and therefore both velocity and number density, may also result in rapid flux reductions due to magnetopause shadowing (Shprits et al., 2006; Loto’aniu et al., 2010; Staples et al., 2022; Tu et al., 2019), and there is no reason to discount the possibility that both positive and negative effects could be due to these same variables acting in different ways and at different time scales. Substorms have also been proposed to influence relativistic electron flux through the injection of seed electrons (hundreds of keV), which provide a population that can be accelerated to high energies between the substorm-injected energetic-electron flux in the magnetosphere and relativistic electron fluxes (Birn et al., 1998; Hwang et al., 2007), as well as source electrons (tens of keV), producing the VLF waves that contribute to electron acceleration (Jaynes et al., 2015; Friedel et al., 2002; Summers et al., 2002; Boyd et al., 2014). Substorm activity itself, however, appears to be dependent on a southward IMF Bz (Jaynes et al., 2015). Various magnetospheric indices such as Kp and Dst also correlate well with flux (Borovsky & Denton, 2014; Lam, 2004; Sakaguchi et al., 2015, Su et al., 2018), but the proposed physical action of these indices is not as clear and they tend to correlate highly with the parameters listed above, meaning any correlations between flux and Kp or Dst might more likely be the result of their correlation with other drivers rather than an actual physical relationship with flux.
Understanding the timing of the action of these various driving parameters on electron flux would be helpful in determining the physical relationships between them. Previously, cross correlations (simple correlations at each time step) have been used to study this statistically. Integrating over a number of hours may increase the correlations with flux (Romanova & Pilipenko, 2009). Maximum correlations were found, when integrated, for solar wind velocity (98 h), number density (38 h), and pressure (16 h), IMF Bz (116 h), AE (140 h), a ground ULF index (123 h), Dst (106 h), and Kp (138 h), among other possible variables (Borovksy, 2017). However, integrating or averaging over a number of hours can obscure the details of the time dependent interactions between predictors and flux. In addition, any moving average of this sort will lag behind any trend that occurs in the time series. Even a small trend will result in a moving average that is consistently above or below actual values (Hyndman & Athanasopoulos, 2018). Lower energy electrons (<100 keV) have also been shown to correlate with specific time lags of V and N within 24 h (Stepanov et al., 2021).
Simple correlations can confound a number of different processes into one number. Driving factors may be correlated among themselves, action at one time step will be correlated with other time steps, and co-cycling or common trends between electron flux and possible drivers can greatly inflate the apparent correlations. The first problem can be addressed by various means including multiple regression using all variables (Simms et al., 2014; 2016; Potapov, 2017; Sakaguchi et al., 2015) or conditional mutual information (Wing et al., 2016; Osmane et al., 2022). At an hourly cadence, controlling velocity for number density and vice versa, the peak correlation of flux with solar wind velocity has been found at 44-56 h when solar wind density is controlled, while the peak solar wind density correlation with flux, when velocity is controlled, is at 7-11 h (Wing et al., 2022). Similarly, correlations between electron flux and ULF activity peaked at 48 h (Osmane et al., 2022). However, in these studies, the effect due to other time steps was not removed from the analysis of each other time step. The second problem can be mitigated in a similar fashion by including more than one time step in a multiple regression (Simms et al., 2018).
However, common cycles and trends that inflate correlations can be dealt with by describing the time behavior with autoregressive (AR) and moving average (MA) terms. Using these ARMA terms (as well as differencing: subtracting previous observations) methods, we have previously found correlations, while statistically significant, much lower than that seen in uncorrected lagged correlations. Electron flux-solar wind velocity correlations peaked at 0.1 (vs. 0.7 seen in lagged correlations) between 24-48 h previous, while flux-ULF correlations were strongest at 0-1 h (-0.2) and at 24-30 h (<0.1), again much lower than a simple lagged correlation around 0.5 peaking at 50 h (Simms et al., 2022). However, these analyses, over 0-96 h, included only 1 lag at a time. In the present study, we seek to gain a fuller understanding of the timing of each influence, corrected for both the other factors as well as other time lags of itself, using ARMAX analyses to remove the co-cycling that can contribute to spurious correlations in these data.
In addition, we use subsets of this data, choosing periods following storms, to study the timing of influence during more disturbed periods. A continuous (and long) time series is not possible for this storm-subsetted data so we cannot analyze it using AR and MA terms. Instead, using a flux measurement following recovery, we use stepwise regression on predictors from previous time steps to describe the points of highest influence.
Data and Analysis Approach
We use hourly averaged log10 electron fluxes (log (electrons/(cm2/s/sr/keV))) in the 1.8-3.5 MeV (“relativistic”) and 100 keV (“seed”) ranges from the Los Alamos National Laboratory (LANL) Energetic Spectrometer for Particles (ESP) instrument located at geosynchronous orbit (≈ 6.6 RE) on the 1994-084 satellite. We limit the period to 25Oct1995 – 13Jun2002 when there was a minimum of missing data. Short periods of missing values were interpolated from surrounding values. Hourly averages of the log10 solar wind velocity (km/s) (V), log10 number density (#/cc) (N), IMF Bz (GSM: nT) (Bz), log10 pressure (nPa) (P), log10 AE (nT), and Dst (nT) from the OMNIweb database are used. We use the Kozyreva et al. (2007) log10(ULF Pc5 index) which provides an hourly measure of Pc5 (2–7 mHz) ULF power (in nT2/Hz) observed in the local time range from 0500 to 1500 hours by ground-based magnetometers stationed between 60 and 70°N corrected geomagnetic latitude (ULF).
Analyses were performed in MATLAB and SPSS. ARIMAX models were developed using the SPSS TSMODEL procedure. We chose a parsimonious model for the dependent variable (relativistic electron flux), adding AR and MA terms until the partial autocorrelation function (PACF) contained no significant terms. Using the ARMA terms chosen by this procedure, we then added lagged inputs as independent variables (up to 96 h prior to the electron flux measurement). Electron flux was fit with both autoregressive (at 1 h) and moving average terms (at both at 1 and 2 h) and daily autoregressive and moving average terms (both at 1 d) (Balikhin et al., 2011; Boynton et al., 2013; Pankratz, 1991; Makridakis et al., 1998; Hyndman & Athanasopoulos, 2018; Simms et al., 2019).
In the overall analyses, we report which coefficients were statistically significant, both at a standard alpha level of 0.05, and at a corrected threshold using the Holm-Bonferroni method to control the familywise error rate. The p-value is the probability that the null hypothesis is true. In regression, the null hypothesis is that the slope of the relationship between dependent and independent variables (the regression coefficient) is zero. If the p-value is less than the standard level of 0.05, this means there is only a 5 % probability that the null hypothesis of a zero slope is true. We would thus reject this null hypothesis and conclude that there is a relationship between the variables (i.e., a non-zero slope). However, if many comparisons are being made, the 5 % level of rejection means that, randomly, we will unknowingly be accepting a false null hypothesis 5% of the time. While this is an acceptable rate for a single comparison (the comparisonwise error rate), when making >20 comparisons the likelihood of finding a false relationship in the entire set of comparisons (the familywise error rate) is nearly 100%. In order to keep these mistakes low (i.e., to keep the family-wise error rate controlled), various corrections can be made. In this case, we use the Holm-Bonferroni method to decrease the threshold p-value based on the number of comparisons being made (Holm, 1979).
We identify 206 storms in the dataset where Dst dropped below -40 nT and where the end of recovery (a Dst rise above -30 nT) was reached before the start of the next storm. Of these, there were 169 storms with at least 44 h after the end of recovery before the next storm. To reduce the number of lags to test, we average variables over every 2 h. Using relativistic electron flux at this point, we identify the statistically significant lag times for each parameter by performing a stepwise regression. This reduces the lags to those that are most influential by only entering each lag if its coefficient is statistically significant, and only retaining that lag in the model if it does not lose significance when another lag is entered. For entering variables, we set the threshold p-value to 0.05. We set the threshold p-value to 0.10 for removal of variables (Neter et al., 1985).
We compare several analysis approaches:
a. Simple cross correlations (lagged correlations) for each hour and each predictor variable with relativistic electron flux up to 96 h (Figure 1).
b. ARMAX analysis for each variable at each hour independently (Figure 1)
c. ARMAX analysis with all hours, combining V, N, Bz, ULF, and AE in a single analysis (Figure 2).
d. Simple cross correlation (lagged correlations) during storm periods to 44 h after the end of recovery (Figure 4).
e. Stepwise regression of each parameter individually during this pre and post storm recovery period (Figure 4).
f. Stepwise regression combining V, N, Bz, ULF, and AE in a single analysis during this pre and post storm recovery period (Figure 5).
Results
Although simple correlations of possible drivers with electron flux can be fairly high, when common cycles and trends are accounted for using the ARMAX methodology, the associations are much lower (Figure 1). Each variable is analyzed separately, each hour separately, either as a simple lagged correlation (blue line) or with AR and MA terms added to account for common cycling (green line). Solar wind velocity has simple correlations > 0.4 (0.42 50 h previous), ULF shows simple correlations peaking at 0.30 (62 h), and the correlation of N with flux is greatest in magnitude at -0.31 (6 h). Somewhat lower correlations are seen with AE (0.26 at 61 h), Dst (-0.23 at 54 h), Bz (-0.08 at 61 h), P (both -0.19 at 9 h and 0.15 at 96 h). Seed electrons show the highest simple correlation, up to 0.87 1 h earlier. (Note difference in y-axis scale for seed electrons.)
However, the introduction of ARMA terms to describe the cycling and trends greatly reduces the association of each variable with flux. The coefficients of the input variables of the ARMAX models are an order of magnitude lower than the correlation coefficients. Thus, much of the apparent correlation found in simple correlations is merely due to co-cycling and long-term trends. (As we use standardized variables, the single variable correlations can be compared directly to the regression coefficients for these variables in the ARMAX models.) Once an attempt at removing cycling and trends is made, the greatest influence of some variables (Bz, ULF, and AE) occurs within the first few hours, in contrast to the lagged correlations. The removal of cycling also appears to result in strong negative associations of electron flux with V in the first hour, but with strong positive effects 14-96 h previous. N and P show negative associations with flux over 1-20 h previous. Seed electrons still show a spike in positive correlation at 24 h periods, indicating that the removal of cycles was not entirely successful and that even these corrected coefficients are suspect. It is also possible that the high correlation of seed and relativistic electrons is a reflection of both these energies being driven by the same processes rather than one driving the other. Although we include Dst in this single variable analysis, we do not have a well-defined physical explanation for how Dst might drive relativistic electron flux. We therefore suspect that correlations between flux and Dst are due to a mutual correlation with the actual drivers rather than to Dst having any direct or indirect influence on electron. In addition, much of the Dst effect may simply be the result of the magnetosphere rebounding so that the radiation belt is now at the altitude of the satellite rather than any actual increase in electron flux.
However, as all these variables and lags are intercorrelated with each other, these individual correlations, even with cycling and trend influences removed, are still misleading. For example, it is impossible to know if the V-flux correlation is simply a restatement of the N-flux correlation, given that it is known V and N are highly (negatively) correlated with each other. These variables must be analyzed simultaneously to assess the degree to which each correlates with flux when other possible drivers are held constant. We therefore combine the possible drivers (both direct and indirect) into a single ARMAX model (Figure 2). We drop P from the model as it is so highly correlated with N and V as to create multicollinearity problems, leaving P fighting to explain the same variation as N and V. We also drop seed electrons, since they may be a result and not a cause, and Dst, as it may only be correlated with the actual drivers. For this analysis, we report which coefficients are statistically significant, both at a standard alpha level of 0.05 (light blue bars), and at a corrected threshold using the Holm-Bonferroni method to control the family-wise error rate (dark blue bars). We use this more conservative family-wise error rate in our conclusions.
In the combined analysis, V (Figure 2a) and N (2b) show negative influences at 1 h. This is likely the signature of an initial pressure pulse, which both N and V contribute to. This is preceded by positive influence, with the effect of N being longer lasting (up to 24 or 40 h previous). IMF Bz (c) shows a positive influence at 1 h with negative influences at 3-8 h. ULF (d) shows two opposing rapid effects (positive at hour 1, negative at hour 2) with continued negative influence in the preceding hours. AE (e) is lower in magnitude than the other effects with only a single (positive) strongly significant effect at hour 1. Thus, most of the influence of these parameters occurs in the first few hours with only N showing a reliably significant influence out to 24 h.
Note that most of these influences in the full ARMAX model are not only lower than that seen in the simple lagged correlations, but they are strongest over a shorter time frame, and may be in the opposite direction from that seen in the simpler and uncorrected analyses. Conclusions about direction and timing of influence based on lagged correlations are often unsupported.
4.1 Influence during Storms
The influence of these parameters may differ during geomagnetic storms. We perform similar analyses on a subset of identified storms over this time period. First, we create a superposed epoch analysis of 2 h averaged electron flux during the 206 storms in this period in which recovery was allowed to finish without a new storm occurring (i.e., when Dst was allowed to rise to -30 nT following the main phase). The epoch is centered at this end of recovery marker (Figure 3). The maximum flux does not occur during recovery, but in the hours and days after the Dst rise above -30 nT. In this data set, the average flux during storms (black line) rises to a peak at 44-48 h following recovery.
The number of storms available for this analysis falls steadily as the analysis period is increased, due to the possibility of subsequent storms occurring after each storm’s recovery period. While there are 206 storms that reach the end of recovery, there are only 184 and 169 at the first and second peaks of flux. The lower sample size both increases the 95% confidence interval around the mean (dashed lines) and reduces the degrees of freedom available in the error term for regression analysis. For this reason, we limit the correlation and regression analyses to those storms (n=169) where flux has risen to near its average maximum at 44 h after recovery, but the number of storms available for the analysis is not as limited as it is in the later hours.
As we only use one point to measure electron flux, co-cycling correlations should not be as much of a problem. Individual parameters (Figure 4) back to 24 h before the end of recovery are analyzed both by simple lagged correlation where each lag is correlated independent of the others (top plot of each panel) and by a stepwise regression allowing the incorporation of any statistically significant predictor lag (the lagged predictor model in the bottom plot of each panel). We might have used a lagged predictor model without the stepwise procedure, entering all lags at once, however, if all lags are included, the low number of storms (n=169) and high number of predictors (68 lags for each variable) mean that the ability of the regression to pick out the statistically significant lags is low. In addition, certain close lags may tend to fight against each other to explain the same small bit of variation, which can result in apparently significant opposing pairs of coefficients in quick succession. In stepwise regression each lag is entered into the analysis only if its influence is statistically significant and only retained in the model if it does not lose significance when another lag is entered (Neter et al., 1985).
In comparison to the more generalized lagged correlations, the lagged predictor regression analyses give a better understanding of when the action of each predictor occurs. Statistically significant (dark blue: p<0.05) and nonsignificant (white: p>0.05) lags are shown for each parameter in the regressions and the lagged correlations. While the lagged correlations show a more familiar pattern of many influential lags, possibly peaking at some number of hours before the post-storm flux measurement for most parameters, when all lags are considered for incorporation in regression, the timing of the action of each parameter becomes much more specific.
For example, the simple lagged correlations indicate that V (4a: peak influence of r=0.60 2-4 h after recovery) has strong, significant, and long lasting effects. However, the lagged predictor regression analysis shows a strong effect of V only 2 h before the end of recovery, with no long lasting effects. The spread of effect over many hours in the lagged correlation is only an artifact of each hour’s V being correlated with itself at different time steps. N, and to a lesser extent Bz and Dst, also appear to have effects lasting over many hours (in the lagged correlations), but the stepwise regression only identifies 1 or 2 lags at which these parameters are most strongly correlated with flux (N: 4 h before the end of recovery and 12 h after; Bz: 18 h after recovery; Dst: 26 h after the end of recovery). ULF and AE show somewhat more persistent effect, with 3 peaks of influence (20 h before and at 8 and 18 h after the end of recovery for ULF, and at 14 h before and at 8 and 20 h after recovery for AE). Pressure shows no significant influence in the studied time period, whether in lagged correlation or stepwise regression. The strongest correlation of seed with relativistic electrons occurs at 44 h after the end of recovery, at the hour where relativistic flux is being predicted. This suggests that this is also an artifact, probably due to them being measured at the same location or being simultaneously driven by the same influences. Somewhat more trustworthy associations of seed and relativistic electrons are found at 2 h after (positive) and 4 h before (negative) the end of recovery, in that they are not simultaneous.
As above, the intercorrelations of the possible drivers mean that single variable analyses (as in Figure 4) may not accurately assess the influences of each variable. We combine the 5 most likely drivers (V,N, Bz, ULF, and AE) into one stepwise regression. In this analysis, the effect of V appears at 18 h after recovery, N 33 h, Bz at 6 h (as well as 20 h before the end), and ULF at 22 and 32 h after the end of recovery as well as 10 h before the end. AE has no significant influence. This is somewhat different from the individual stepwise analyses, but markedly different from the simple lagged correlations. In the stepwise analyses, there is no long term cumulative effect of any variable, with significant influences occurring at (at most) 3 time steps. While the individual stepwise regressions show 1-3 significant time steps near the point of maximum lagged correlation, when combined into a single analysis, these significant hours no longer correspond to the maximal simple correlations. Addition of other variables changes the response: the most marked difference is the disappearance entirely of the AE effect.
Discussion
In previous work, simple correlations suggested influences of solar wind velocity and number density, IMF Bz, ULF wave power, and substorms (as measured by AE) on MeV electron flux over an impressive number of hours and days. However, the diurnal cycle in flux measurements from geosynchronous satellites can inflate correlations, the associations between potential drivers may produce spurious effects, and correlations between all previous hours of each driver may create the appearance that it acts additively over many hours. Autoregressive-moving average transfer function (ARMAX) multiple regressions incorporating previous hours simultaneously can eliminate these cycles and study the impact of each parameter, at each hour, while the others are controlled. This can accomplish the same goal as integrating over a number of hours (Borovsky 2017) while retaining the details of which hours during the integration period are most influential.
When studying all hours, using ARMAX lagged correlations, we show that the impact of potential drivers is at least an order of magnitude lower than correlations alone would suggest for all tested parameters, with most of this influence occurring within a few hours of the electron measurement. This is in contrast to previous studies which found higher correlations over all hours (e.g., Borovksy, 2017; Osmane et al., 2022; Wing et al., 2022). Much of the correlation found between ULF and flux, for example, appears to be the result of co-cycling parameters (Simms et al., 2022). Removing this diurnal cycle greatly reduces the apparent influence, both in magnitude and in time. The long, potentially cumulative effects seen in simple correlations disappear for most variables, either in the introduction of the ARMAX terms (Figure 1) or when variables are studied in a combined analysis (Figure 2). However, with cycles removed, we are better able to study the actual associations between variables, keeping in mind that our goal is not to find the highest correlation but the one that answers the questions of interest.
Of the five variables we study with both lagged ARMAX correlations and combined variable models, we find the following result from over all hours (using lagged ARMAX correlations) and from periods just following storms (using stepwise regression):
a. Over all hours, with cycles removed, V, which commonly shows the most impressive, positive simple lagged correlations, shows a more nuanced influence, with the initial effect (in the first hour) being negative (perhaps driving magnetopause shadowing), preceded by several hours of positive influence (2-10 h previous to flux measurement, presumably the result of driving processes that result in electron acceleration). This contrasts with previous studies that conclude the peak correlation occurs much further in the past, 45-65 h (simple correlations), 40-100 h (using conditional mutual information) (both from Wing et al., 2022), or 81 h (when optimized for correlation with 4 other variables) (Borovsky 2017). The removal of cycles, which removes much of the spurious correlation, leads to a completely different conclusion about the timing of V influence on flux. Integrating over a number of hours, rather than testing each lag simultaneously, obscures the complexities of the relationship. A 98 h time integration of V may correlate better with electron flux (Borovsky, 2017), but it averages out the opposing effects of V at different time steps and gives the impression that there must be long periods of high V to drive electrons to higher energies, which is not supported by our current analysis.
In contrast, the response of electrons to V only during after storm periods is either a positive response to a single hour at the end of recovery (in the single variable analysis of Figure 4) or a positive response to a single hour 18 h after the end of recovery (in the combined analysis of Figure 5). As we do not explicitly test the flux response (after storms) to the beginning of the storm period, it is likely that we miss the response to an initial storm pressure pulse that V would contribute to or magnetopause shadowing. Thus, the quick negative response to V (or P) is only visible in the analyses using all periods (in the first hour). In the after-storm period analysis, we are left with only the positive response to V. This may not be a direct response to V, but rather an indirect reaction to other processes driven or associated with the velocity.
b. N behaves similarly to V in that the lagged, single variable ARMAX correlations roughly follow the trend of the negative simple correlations, albeit lower in magnitude. However, in the combined ARMAX analysis (over all hours), N is mostly a positive influence (barring the initial negative influence 1 h before the flux measurement), and its effects last beyond 24 h even if the more conservative p-value cut off is used (2-28 h previous to the flux measurement). While this covers the range of previously found peaks (13 h if no correction for V is made, 7-11 h if V is accounted for (Wing et al., 2022), or 11 h for the optimal N lag (Borovsky, 2017)), the influence we find is opposite in sign to previous studies and the single variable analyses. This shows the importance of considering the joint effects of variables rather than studying them singly. The initial negative response, again, is likely the response of electron flux to pressure pulses (driving the field lines below the altitude of the LANL satellite or to more permanent magnetopause shadowing). The longer term, positive response to N is a surprising result of this analysis. It appears to be the result of considering all lags at once, with only a near-term negative effect (at a lag of 1 h). Once this lag is essentially removed, the other lags show a positive effect. Similar to the V positive effect previous to the negative immediate effect, this may be due to N driving processes that increase flux. However, this does not appear to be a feature of the after-storm response to N. After storms, N retains its negative influence on flux.
c. In the overall analyses (all hours), Bz is mostly influential in the few hours leading up to the flux observation: positive in the hour immediately before and negative in the 6 h previous. However, when only the after-storm period is considered, Bz shows only a negative influence with 2 important time lags: 20 h before the end of recovery and 4 h after. (We do not explicitly capture the southward Bz turn at the start of storms.)
d. Over all hours, with cycles removed, the strongest ULF influences are in the first (positive) and second (negative) hour before the flux measurement, with further negative influences in the 12- 24 h before. This is in marked contrast to the sustained, positive simple correlations up to 96 h previous. The maximum correlation does not occur 48-50 h before as previously reported (Osmane et al., 2022), nor are several hours or days of driving necessary to produce a rise in electron flux (O’Brien et al., 2001). In our storm specific analyses, the ULF appears to act positively at several key points during and after recovery, supporting the findings of some storm case studies that the rise in ULF power precedes the electron flux rise by several hours (Jaynes et al., 2018; Rostoker et al., 1998), although not by several days (Baker et al., 1998, Elkington et al., 2003). Increases could be the result of inward radial diffusion, but ULF waves have also been speculated to result in electron loss by outward radial diffusion which may be the weaker negative effect we see over the 12-24 h period preceding in the overall analysis and which does not appear in the after-storm analysis. This is much less than the previously postulated 1-6 days (Elkington et al., 2003) or 40 h (Wing et al., 2022) for outward radial diffusion to occur.
e. AE, which is thought to represent the lower energy electron injection by substorms, is similarly much reduced in apparent effect, perhaps with only a single positive influence in the hour just before flux (over all hours). Following storm periods, the AE at first appears to have an influence similar to that of ULF, but this effect disappears completely in the after-storm analysis of combined variables. It is possible that substorm levels following storms are somewhat constant, in that there is usually the necessary activity to inject electrons. Either storms do not vary much in their substorm activity, or all storms reach the necessary threshold and further activity results in no further injection.
For various reasons, we do not include solar wind pressure, seed electrons, or Dst in the combined analyses, in part because they overlap in measuring some of the same influences we do include. Pressure shows a lower correlation in the overall analyses than the solar wind velocity and number density that contribute to it, and the correlation is very low in the hours around the end of storm recovery. Beyond this, the inclusion of a derived variable (P) with its components (V and N) can make for difficulties in interpreting the variables that are used to produce it.
Seed electrons present a particular problem in this analysis for several reasons. First, both seed and relativistic electrons are measured at the same satellite in our dataset, making them correlated in both time and space. We can already see this might be a problem in the simple correlations that cycle at a roughly 24 h period. The ARMAX correction does not get rid of this problem. While it is possible that obtaining seed electron data from another satellite might get around this problem, the cycling of flux at any geosynchronous location (i.e., the altitude we are studying) makes this unlikely. Second, it might be presumed that whichever parameters drive the acceleration of electrons to relativistic energies are the same parameters that drive the acceleration to seed electron energies. Thus, increased seed electron flux may be only a reflection of the same processes resulting in increases in higher energy electron flux. This in itself would create a high correlation between the two electron energies, even if one had no influence on the other. Nor would it be surprising if a rise in seed electron populations preceded that of the relativistic-energy electrons as it would take less time for these processes to accelerate electrons to the lower seed electron energy. While it is true that seed electrons are highly predictive of MeV flux, this does not prove that they are influential. Appearing first might give the appearance that seed electrons must be present before relativistic electrons and therefore must be causal, but it does not prove it. We have no way of uncoupling this relationship as we cannot produce the same magnetospheric conditions with different levels of seed electrons. For these reasons, we do not include seed electrons in the combined analyses. It is not that we do not think seed electrons have no influence, only that we recognize we have no way of testing that hypothesis.
We leave Dst out of the combined analyses because it is a more generalized index of magnetospheric activity. We have chosen to include AE (another somewhat general index) instead. The AE, as it is a measure of substorm activity, has the advantage of including the possible effect of electron injection. In the after-storm period, in particular, the inclusion of Dst is problematic as it is also being used to identify the periods of study. This could result in false correlations that are merely artifacts of the Dst being used in this manner.
5. Conclusions
A simple correlation of time series data can contain several elements of information within this single number, not all of which are pertinent to the question of whether one variable drives the other. Both co-cycling behavior and associations with variables omitted from the analysis can contribute to the correlation. These effects must be removed before relying on correlational analysis to determine if two variables show an association. In cross correlation (lag) analysis, correlations between previous hours of the same variable can also create the illusion that a driver might have a long and cumulative influence only because the driver is correlated with itself at different hours. To determine the most accurate timing of influence, time series data should have cycling behavior removed and driver lags should be considered simultaneously, not individually, to determine which are most influential. Similarly, predictors should be studied concurrently to resolve the effect of each individually.
Applying ARMAX techniques to all hours, we find that the association of each possible driver with relativistic electron flux is an order of magnitude lower than simple correlation. Driver associations with flux are generally strongest within a few hours, not the many hours that have been suggested previously. Solar wind velocity and number density show an initial negative impact, with longer term positive influences over no more than 27 h. Bz is initially a positive influence, with its longer term negative effect only up to 6 h. ULF waves impact flux in the first (positive) and second (negative) hour before the flux measurement, with further negative influences in the 12- 24 h before. Substorms (AE) show only a short term (1 h) positive influence.
When only after-storm periods are studied (using stepwise regression), influences occur only at a few lags in the recovery or after recovery period. In these after-storm hours there are positive influences of ULF waves and V, negative influences of N and Bz. AE shows no influence in these more disturbed periods.
Acknowledgements
Work at Augsburg University was supported by NSF grants AGS-1651263 and AGS-2013648.
Data Availability Statement
Electron flux data were obtained from Los Alamos National Laboratory (LANL) geosynchronous energetic particle instruments (https://zenodo.org/record/5834856). The ULF index is available at http://ulf.gcras.ru/plot_ulf.html. Solar wind (V) data are available from Goddard Space Flight Center Space Physics Data Facility at the OMNI Web data website (http://omniweb.gsfc.nasa.gov/html/ow_data.html).
Literature Cited
Baker, D.N, T Pulkkinen, X Li, S Kanekal, K Ogilvie, R Lepping, J Blake, L Callis, G Rostoker, H Singer (1998), A strong CME-related magnetic cloud interaction with the Earth’s magnetosphere: ISTP observations of rapid relativistic electron acceleration on May 15, 1997, Geophysical Research Letters, 25, 2975-2978, doi: 10.1029/98GL01134.
Balikhin, M. A., R. J. Boynton, S. N. Walker, J. E. Borovsky, S. A. Billings, and H. L. Wei (2011), Using the
NARMAX approach to model the evolution of energetic electrons fluxes at geostationary orbit, Geophysical Research Letters, 38, L18105, doi:10.1029/2011GL048980.
Birn, J., M. F. Thomsen, J. E. Borovsky, G. D. Reeves, D. J. McComas, and R. D. Belian (1997),
Characteristic plasma properties during dispersionless substorm injections at geosynchronous orbit Journal of Geophysical Research, Vol. 102, No. A2, Pages 2309-2324, doi:10.1029/96JA02870
Borovsky, J. E., and M. H. Denton (2014), Exploring the cross correlations and autocorrelations of the ULF indices and incorporating the ULF indices into the systems science of the solar wind-driven
magnetosphere, Journal of Geophysical Research Space Physics, 119, doi:10.1002/2014JA019876.
Borovsky, J. E. (2017). Time-integral correlations of multiple variables with the relativistic-electron flux at geosynchronous orbit: The strong roles of substorm-injected electrons and the ion plasma sheet. Journal of Geophysical Research: Space Physics, 122, 11,961–11,990. https://doi.org/10.1002/
2017JA024476
Boyd, A. J., H. E. Spence, S. G. Claudepierre, J. F. Fennell, J. B. Blake, D. N. Baker, G. D. Reeves, D. L. Turner, and H. O. Funsten (2014), Quantifying the radiation belt seed population in the 17 March 2013 electron acceleration event, Geophys. Res. Lett., 41, 2275–2281, doi:10.1002/2014GL059626
Boynton, R. J., M. A. Balikhin, S. A. Billings, G. D. Reeves, N. Ganushkina, M. Gedalin, O. A. Amariutei,
J. E. Borovsky, and S. N. Walker (2013), The analysis of electron fluxes at geosynchronous orbit employing a NARMAX approach, Journal of Geophysical Research Space Physics, 118, 1500–1513, doi:10.1002/jgra.50192.
Elkington, S. R., M. K. Hudson, and A. A. Chan (2003), Resonant acceleration and diffusion of outer zone electrons in an asymmetric geomagnetic field, J. Geophys. Res., 108(A3), 1116, doi:10.1029/2001JA009202.
Friedel, R.H.W., G.D Reeves, T. Obara (2002), Relativistic electron dynamics in the inner magnetosphere — a review, Journal of Atmospheric and Solar-Terrestrial Physics, 64(2), 265-282,
https://doi.org/10.1016/S1364-6826(01)00088-8.
Holm, S. (1979), A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics. 6 (2): 65–70, doi: 10.2307/4615733.
Hwang, J. A., D.-Y. Lee, L. R. Lyons, A. J. Smith, S. Zou, K. W. Min, K.-H. Kim, Y.-J. Moon, and Y. D. Park (2007), Statistical significance of association between whistler-mode chorus enhancements and enhanced convection periods during highspeed streams, J. Geophys. Res., 112, A09213, doi:10.1029/2007JA012388.
Hyndman, R.J. and G. Athanasopoulos (2018) Forecasting: Principles and Practice, 2nd ed., OTexts, Heathmont, Victoria, Australia 291 pp.
Jaynes, A. N., Baker, D. N., Singer, H. J., Rodriguez, J. V., Loto’aniu, T. M., Ali, A. F., et al. (2015). Source and seed populations for relativistic electrons: Their roles in radiation belt changes. Journal of Geophysical Research: Space Physics, 120, 7240–7254. https://doi.org/10.1002/2015JA021234
Jaynes, A. N., Ali, A. F., Elkington, S. R., Malaspina, D. M., Baker, D. N., Li, X., Kanekal, S. G., Henderson, M. G., Kletzing, C. A., & Wygant, J. R. (2018). Fast Diffusion of Ultrarelativistic Electrons in the Outer Radiation Belt: 17 March 2015 Storm Event. Geophysical research letters, 45(20), 10874–10882. https://doi.org/10.1029/2018GL079786
Kozyreva, O., V. Pilipenko, M. J. Engebretson, K. Yumoto, J. Watermann, and N. Romanova (2007), In search of a new ULF wave index: Comparison of Pc5 power with dynamics of geostationary relativistic electrons, Planetary Space Sciences, 55, 755, doi: 10.1016/j.pss.2006.03.013.
Lam, H.-L. (2004), On the prediction of relativistic electron fluence based on its
relationship with geomagnetic activity over a solar cycle, Journal of Atmospheric and Solar-Terrestrial Physics 66 (2004) 1703–1714, doi:10.1016/j.jastp.2004.08.002
Loto’aniu, T. M., H. J. Singer, C. L. Waters, V. Angelopoulos, I. R. Mann, S. R. Elkington, and J. W. Bonnell (2010), Relativistic electron loss due to ultralow frequency waves and enhanced outward radial diffusion, Journal of Geophysical Research, 115, A12245, doi:10.1029/2010JA015755.
Lyatsky, W., and G. V. Khazanov (2008), Effect of geomagnetic disturbances and solar wind density on relativistic electrons at geostationary orbit, Journal of Geophysical Research, 113, A08224, doi:10.1029/2008JA013048.
Makridakis, S.G., S. C. Wheelwright, and R. J. Hyndman (1998), Forecasting: Methods and Applications, 3rd ed., John Wiley and Sons, New York, NY, 652 pp.
Mathie, R.A. and I. R. Mann, 2000, A correlation between extended intervals of ULF wave
power and storm-time geosynchronous relativistic electron flux enhancements, Geophysical Research Letters, 27(20), 3261-3264, https://doi.org/10.1029/2000GL003822.
Neter J., W. Wasserman, M. Kutner (1985) Applied Linear Statistical Models 2nded. , Richard D. Irwin, Inc., Homewood, IL, USA, 112 pp.
O’Brien, T.P., R.L. McPheroon, D. Sornette, G.D. Reeves, R. Friedel, H.J. Singer (2001), Which magnetic storms produce relativistic electrons at geosynchronous orbit?, Journal Of Geophysical Research, 106(A8), 15,533-15,54, doi.org/10.1029/2001JA000052
Osmane, Adnane, Mikko Savola, Emilia Kilpua, Hannu Koskinen, Joseph E. Borovsky, and Milla Kalliokoski (2022), Quantifying the non-linear dependence of energetic electron fluxes in the Earth’s radiation belts with radial diffusion drivers, Annales Geophysicae, 40, 37–53, doi.org/10.5194/angeo-40-37-2022
Pankratz, A (1991), Forecasting with Dynamic Regression Models, John Wiley & Sons Inc., New York, NY, 386 pp.
Potapov, A.S. (2017), Relativistic electrons of the outer radiation belt and methods of their forecast (review), Solar-Terrestrial Physics, 2017. Vol. 3. Iss. 1, pp. 57–72, DOI: 10.12737/article_58f9703837c248.84596315
Reeves, G.D., D. N. Baker, R. D. Belian, J. B. Blake, T. E. Cayton, J. F. Fennell, R. H. W. Friedel, M. M. Meier, R. S. Selesnick, H. E. Spence (1998), The global response of relativistic radiation belt electrons to the January 1997 magnetic cloud, Geophysical Research Letters, 25, 17:3265-3268, doi:10.1029/98gl02509.
Reeves, G. D., S. K. Morley, R. H. W. Friedel, M. G. Henderson, T. E. Cayton, G. Cunningham, J. B. Blake, R. A. Christensen, and D. Thomsen (2011), On the relationship between relativistic electron flux and solar wind velocity: Paulikas and Blake revisited, Journal of Geophysical Research, 116, A02213, doi:10.1029/2010JA015735.
Romanova, N, Pilipenko, V (2009), ULF wave indices to characterize the solar wind-magnetosphere interaction and relativistic electron dynamics, Acta Geophysica, 57(1), 158-170, doi:
10.2478/s11600-008-0064-4
Rostoker, G., S. Skone, and D. N. Baker (1998), On the origin of relativistic electrons in the magnetosphere associated with some geomagnetic storms, Geophysical Research Letters, 25(19), 3701– 3704.
Sakaguchi, K., T. Nagatsuma, G. D. Reeves, and H. E. Spence (2015), Prediction of MeV electron fluxes throughout the outer radiation belt using multivariate autoregressive models, Space Weather, 13, 853–867, doi:10.1002/2015SW001254.
Shprits, Y. Y., R. M. Thorne, R. Friedel, G. D. Reeves, J. Fennell, D. N. Baker, and S. G. Kanekal (2006), Outward radial diffusion driven by losses at magnetopause, Journal of Geophysical Research, 111, A11214, doi:10.1029/2006JA011657
Simms, L. E., Engebretson, M. J., Pilipenko, V., Reeves, G. D., & Clilverd, M. (2016). Empirical predictive models of daily relativistic electron flux at geostationary orbit: Multiple regression analysis. Journal of Geophysical Research: Space Physics, 121, 3181–3197. https://doi.org/
10.1002/2016JA022414
Simms, L., Engebretson, M., Clilverd, M., Rodger, C., Lessard, M., Gjerloev, J., & Reeves, G. (2018). A distributed lag autoregressive model of geostationary relativistic electron fluxes: Comparing the influences of waves, seed and source electrons, and solar wind inputs. Journal of Geophysical Research: Space Physics, 123. https://doi.org/10.1029/2017JA025002
Simms, L. E., Engebretson, M. J., Rodger, C. J., Gjerloev, J. W., & Reeves, G. D. (2019). Predicting lower band chorus with autoregressive‐moving average transfer function (ARMAX) models. Journal of Geophysical Research: Space Physics, 124, 5692–5708. https://doi.org/10.1029/2019JA026726
Simms, L. E., Pilipenko, V. A., Engebretson, M. J., Reeves, G. D., Smith, A. J., & Clilverd, M. (2014). Prediction of relativistic electron flux following storms at geostationary orbit: Multiple regression analysis. Journal of Geophysical Research: Space Physics, 119, 7297–7318. https://doi.org/
10.1002/2014JA019955
Simms, Laura E. , Engebretson, Mark J. , Reeves, Geoffrey D. (2022), Removing diurnal signals and longer term trends from electron flux and ULF correlations: a comparison of spectral subtraction, simple differencing, and arimax models, Journal of Geophysical Research. Space Physics, vol. 127, no. 2, doi:10.1029/2021JA030021.
Staples, F. A., Kellerman, A., Murphy, K.R., Rae, I. J., Sandhu, J. K., & Forsyth, C. (2022). Resolving magnetopause shadowing using multimission measurements of phase space density. Journal of Geophysical Research: Space Physics, 127, e2021JA029298. https://doi.org/10.1029/2021JA029298
Stepanov, N. A., Sergeev, V. A., Sormakov, D. A., Andreeva, V. A., Dubyagin, S. V., Ganushkina, N., et al. (2021). Superthermal proton and electron fluxes in the plasma sheet transition region and their dependence on solar wind parameters. Journal of Geophysical Research: Space Physics,
126, e2020JA028580. https://doi.org/10.1029/2020JA028580
Su, Y.-J., J. M. Quinn, W. R. Johnston, J. P. McCollough, and M. J. Starks (2014), Specification of>2MeV electron flux as a function of local time and geomagnetic activity at geosynchronous orbit, Space
Weather, 12, 470–486, doi:10.1002/2014SW001069.
Summers, D., C. Ma, N. P. Meredith, R. B. Horne, R. M. Thorne, D. Heynderickx, and R. R. Anderson (2002), Model of the energization of outer-zone electrons by whistler-mode chorus during the October 9, 1990 geomagnetic storm, Geophysical Research Letters, 29(24), 2174, doi:10.1029/2002GL016039.
Tu, W., Xiang, Z., & Morley, S. K. (2019). Modeling the magnetopause shadowing loss during the June 2015 dropout event. Geophysical Research Letters, 46, 9388–9396, https://doi.org/10.1029/2019GL084419
Takahashi, K., and A. Y. Ukhorskiy (2007), Solar wind control of Pc5 pulsation power at geosynchronous orbit, Journal of Geophysical Research, 112, A11205, doi:10.1029/2007JA012483.
Wing, S., J. R. Johnson, E. Camporeale, and G. D. Reeves (2016), Information theoretical approach to discovering solar wind drivers of the outer radiation belt, Journal of Geophysical Research Space Physics, 121, doi:10.1002/2016JA022711
Wing, S., Johnson, J. R., Turner, D. L., Ukhorskiy, A. Y., & Boyd, A. J. (2022). Untangling the solar wind and magnetospheric drivers of the radiation belt electrons. Journal of Geophysical Research: Space Physics, 127, e2021JA030246. https://doi.org/10.1029/2021JA030246
Figure Captions
Figure 1. Cross correlations (in blue; scale: -0.4 - 0.4 except seed electrons) and ARMAX lag coefficients (in green; scale: -0.02 - 0.02 except seed electrons) of parameters with electron flux. Cross correlations are up to 10 times greater than ARMAX coefficients, reflecting that most simple correlation is due to co-cycling of parameters with flux. ARMAX coefficients (with co-cycling removed) may not peak in the same hour nor in the same direction as the cross correlations. (Note the different scale for seed electron flux correlations and coefficients.)
Figure 2. Coefficients of the lagged predictors (over 48 h) when all lags are included in a simultaneous ARMAX multiple regression ncluding V, N, Bz, ULF, and AE. Nonsignificant coefficients are white bars. Those with p<0.05 are in light blue. Dark blue are those coefficients that are still statistically significant when the Holm correction for multiple comparisons is applied.
Figure 3. Relativistic electron flux (Z scores; black line) superposed at end of recovery and averaged over available storms. Number of storms showing end of recovery marker (> -30 nT following main phase Dst drop) is n=206. 95% confidence interval shown as dashed lines.
Figure 4. Coefficients from individual analyses of lagged parameters used to predict relativistic electron flux 44 h after the end of storm recovery. Top plot of each panel are the cross correlations; bottom plot of each panel are the significant lags chosen by stepwise regression. Significant coefficients (p<0.05) are blue bars.
Figure 5. Coefficients from a combined multiple regression analysis of lagged parameters predicting relativistic electron flux 44 h after the end of storm recovery. Significant lags chosen by stepwise regression.