2.3. Statistical data analysis
All statistical analyses were performed in R v. 4.2.3 (R Core Team, 2023), and significance was defined at ɑ = 0.05. We compared in-stream nitrate and DOC concentrations across burned and unburned watersheds to evaluate the effects of time-since-fire, climate, watershed burn percentage, and watershed characteristics on post-fire water quality variability. To compare the mean pseudo yield metric between burned and unburned watersheds, we used single-factor analysis of variation (ANOVA) tests (Van Son & Thiel, 2006). To determine whether fire responses varied by time-since-fire, climate, and burned area, we performed separate linear mixed-effect models (lmer ; Bates et al., 2015). Additionally, we tested for equal variances within the pseudo yield metric across space and time using Levene’s test (Fox & Weisberg, 2019). Residuals for both the ANOVA and linear mixed-effect models were assessed for normality using the Shapiro test (Shapiro & Wilk, 1965), and homoscedasticity using the Breusch-Pagan test.
To normalize for large concentration differences across all sites, we analyzed the proportional differences between burned and unburned in-stream solute concentrations by calculating an effect size (ln[R]) using the following equation modified from Emmerton et al., (2020) and Paul et al., (2022):
\begin{equation} \ln\ \left[R\right]\ =\frac{X_{\text{Burned}}}{X_{\text{Unburned}}}\nonumber \\ \end{equation}
where XBurned and XUnburned are the mean concentrations in burned and unburned watersheds, respectively. Mean effect size was calculated with 95% confidence intervals and standard error. Effect size was calculated for each burned site associated with a reference site, per year. A positive effect size indicates a higher solute concentration in the burned watershed compared to the unburned site, whereas a negative effect represents a higher solute concentration in the unburned watershed. Effect size values near 0 represent no difference between the burned and unburned sites. Post analysis, effect sizes were converted to percent difference (D), following (Dove & Hart, 2017):
\(D=100(1-e^{\ln(R)}\))
and used to assess how nitrate and DOC change on an annual basis using the Sen-Theil linear regression model (Hurtado, 2020).
To assess the impact of time-since-fire, observations were divided into seven individual groups including: year of the fire, 1-2 years post-fire, 2-3 years post-fire; 3-4 years post-fire; 4-5 years post-fire; 5-6 years post-fire; and >10 years after the fire. No studies fell within the 7-10 year timeframe. We assessed burn extent by extracting burn area from MTBS using a custom R script (Willi & Ross, 2023). Study sites were then grouped based on their burn area in 10% intervals.
Due to the differences in data reporting or data availability across publications, the time-since-fire, climate, and burn extent models are made up of different data. Table 1 shows which publications were included within each model. This limits our ability to study the interactive influences of our variables of interest. Therefore, to determine whether fire responses varied by time-since-fire, climate, and burned area, we performed separate linear mixed-effect models (lmer ; Bates et al., 2015) on the mean effect size for each solute. Time-since-fire, the eight unique climate classifications, and the burn area percentages (at 10% intervals) were included as fixed effects for each model, respectively, while site was incorporated as a random effect to account for pseudoreplication including the confounding nature of other variables in all models. All fixed effects were considered as categorical variables. We focused the examination of climate effects to studies with data within the first 5 years post-fire, as this has been shown in previous studies to be the period of time with biggest fire influences on solute concentrations post-fire (Rust et al., 2018). We assessed the variation amongst categorical classifications using Levene’s test (Fox & Weisberg, 2019).
We used Random Forests (RF), a regression tree-based machine learning approach (Breiman, 2001), to quantify the importance of watershed characteristics on effect sizes of nitrate and DOC. We constructed separate RF models which included the full nitrate and DOC datasets using the ranger R package (Wright & Ziegler, 2017) with default hyperparameters, and variable importance to each model was quantified by the Gini index. We chose aprioriseven predictors (climate, watershed area, minimum elevation, maximum watershed elevation, watershed slope, burn percentage, and time-since-fire) to model effect sizes for nitrate (n=84) and DOC (n=45), with goodness-of-fit reported as out-of-bag R2values.