2.3. Statistical data analysis
All statistical analyses were performed in R v. 4.2.3
(R Core Team, 2023),
and significance was defined at ɑ = 0.05. We compared in-stream nitrate
and DOC concentrations across burned and unburned watersheds to evaluate
the effects of time-since-fire, climate, watershed burn percentage, and
watershed characteristics on post-fire water quality variability. To
compare the mean pseudo yield metric between burned and unburned
watersheds, we used single-factor analysis of variation (ANOVA) tests
(Van Son & Thiel,
2006). To determine whether fire responses varied by time-since-fire,
climate, and burned area, we performed separate linear mixed-effect
models (lmer ;
Bates et al., 2015).
Additionally, we tested for equal variances within the pseudo yield
metric across space and time using Levene’s test
(Fox & Weisberg,
2019). Residuals for both the ANOVA and linear mixed-effect models were
assessed for normality using the Shapiro test
(Shapiro & Wilk,
1965), and homoscedasticity using the Breusch-Pagan test.
To normalize for large concentration differences across all sites, we
analyzed the proportional differences between burned and unburned
in-stream solute concentrations by calculating an effect size
(ln[R]) using the following equation modified from
Emmerton et al.,
(2020) and Paul et al., (2022):
\begin{equation}
\ln\ \left[R\right]\ =\frac{X_{\text{Burned}}}{X_{\text{Unburned}}}\nonumber \\
\end{equation}where XBurned and XUnburned are the mean
concentrations in burned and unburned watersheds, respectively. Mean
effect size was calculated with 95% confidence intervals and standard
error. Effect size was calculated for each burned site associated with a
reference site, per year. A positive effect size indicates a higher
solute concentration in the burned watershed compared to the unburned
site, whereas a negative effect represents a higher solute concentration
in the unburned watershed. Effect size values near 0 represent no
difference between the burned and unburned sites. Post analysis, effect
sizes were converted to percent difference (D), following
(Dove & Hart, 2017):
\(D=100(1-e^{\ln(R)}\))
and used to assess how nitrate and DOC change on an annual basis using
the Sen-Theil linear regression model
(Hurtado, 2020).
To assess the impact of time-since-fire, observations were divided into
seven individual groups including: year of the fire, 1-2 years
post-fire, 2-3 years post-fire; 3-4 years post-fire; 4-5 years
post-fire; 5-6 years post-fire; and >10 years after the
fire. No studies fell within the 7-10 year timeframe. We assessed burn
extent by extracting burn area from MTBS using a custom R script
(Willi & Ross,
2023). Study sites were then grouped based on their burn area in 10%
intervals.
Due to the differences in data reporting or data availability across
publications, the time-since-fire, climate, and burn extent models are
made up of different data. Table 1 shows which publications were
included within each model. This limits our ability to study the
interactive influences of our variables of interest. Therefore, to
determine whether fire responses varied by time-since-fire, climate, and
burned area, we performed separate linear mixed-effect models
(lmer ; Bates et
al., 2015) on the mean effect size for each solute. Time-since-fire,
the eight unique climate classifications, and the burn area percentages
(at 10% intervals) were included as fixed effects for each model,
respectively, while site was incorporated as a random effect to account
for pseudoreplication including the confounding nature of other
variables in all models. All fixed effects were considered as
categorical variables. We focused the examination of climate effects to
studies with data within the first 5 years post-fire, as this has been
shown in previous studies to be the period of time with biggest fire
influences on solute concentrations post-fire
(Rust et al., 2018).
We assessed the variation amongst categorical classifications using
Levene’s test (Fox &
Weisberg, 2019).
We used Random Forests (RF), a regression tree-based machine learning
approach (Breiman,
2001), to quantify the importance of watershed characteristics on
effect sizes of nitrate and DOC. We constructed separate RF models which
included the full nitrate and DOC datasets using the ranger R
package (Wright &
Ziegler, 2017) with default hyperparameters, and variable importance to
each model was quantified by the Gini index. We chose aprioriseven predictors (climate, watershed area, minimum elevation, maximum
watershed elevation, watershed slope, burn percentage, and
time-since-fire) to model effect sizes for nitrate (n=84) and DOC
(n=45), with goodness-of-fit reported as out-of-bag R2values.