2.4 Connectivity Strength Metrics
To quantify hydrologic connectivity, we identified a source site (Inflow) and considered the magnitude of connectivity between this source and multiple target sites. We first assessed connectivity using relative stage dynamics. To do so, we used a graphical analysis, plotting the mean daily Inflow stage against the relative stage at each target site, represented by stage z-scores (described in section 2.2). Strongly co-varying stage levels between source and target may suggest the presence of connectivity, while inflection points in source-target stage relationships can help identify thresholds at which connectivity dynamics shift (Cabezas et al., 2011). To locate these inflection points and the corresponding threshold stages at Inflow (Istage), we fit broken-line linear regression models using the segmented package in R, which estimates a user-defined number of inflection points (Muggeo, 2008). Because hysteresis was observed in the source-target relationship at several sites, we excluded the rising limb from the inflection point identification process. For improved interpretability, we constrained the analysis to either a linear model (zero inflection points) or a one-inflection-point model and chose the model that minimized the Bayesian information criterion (BIC). In all cases, the single inflection point model was chosen over the linear fit. It should be noted that while coherent hydrologic fluctuations between sites can be a useful tool for confirming connectivity, they can also be subject to false positives when other factors act similarly on both sites (Rinderer et al., 2018).
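The model comparison described above can be illustrated with a minimal Python sketch. It is not the procedure implemented in the segmented package, which estimates breakpoints iteratively; here a simple grid search over candidate breakpoints stands in for that step. The function names, the Gaussian form of the BIC, and the parameter counts (3 for the linear model, 5 for the one-breakpoint model) are assumptions made for illustration only.

```python
import math

def _solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def _rss(design, y):
    """Ordinary least squares via the normal equations; returns the residual sum of squares."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = _solve(xtx, xty)
    return sum((yi - sum(bj * v for bj, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))

def bic(rss, n, k):
    """Gaussian BIC up to an additive constant; k is the assumed number of parameters."""
    return n * math.log(rss / n) + k * math.log(n)

def choose_model(x, y):
    """Return ("linear", None) or ("segmented", breakpoint), whichever has the lower BIC."""
    n = len(x)
    bic_linear = bic(_rss([[1.0, xi] for xi in x], y), n, 3)
    best = None  # (rss, breakpoint) of the best one-breakpoint fit found so far
    # Candidate breakpoints: observed x values, skipping the two lowest and two highest.
    for bp in sorted(set(x))[2:-2]:
        design = [[1.0, xi, max(xi - bp, 0.0)] for xi in x]  # continuous hinge at bp
        rss = _rss(design, y)
        if best is None or rss < best[0]:
            best = (rss, bp)
    if best is None or bic_linear <= bic(best[0], n, 5):
        return ("linear", None)
    return ("segmented", best[1])
```

In this sketch `x` would hold Inflow stage and `y` the target-site stage z-score, restricted to the observations retained after excluding the rising limb. A perfectly fitting model (zero residual) would break the logarithm in `bic`, so the sketch assumes noisy field data.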
We developed an approach to quantify the connectivity magnitude between source and target sites using both geochemical and microbial indicators. For both metrics, we quantified the magnitude, defined hereafter as connectivity strength (σ), as a continuous variable ranging from 0 to 1. Connectivity strength denotes the degree of influence of the source on the target. To measure connectivity strength, we assumed that when strong hydrologic connectivity was present, source and target water compositions would be more similar than when connectivity was weak or absent. This is a commonly used assumption embedded in source water mixing approaches, which use aqueous geochemistry to assess hydrologic connectivity (Cabezas et al., 2009; C. N. Jones et al., 2014). For microbial communities, we expected that when hydrologic connectivity was strong, the membership of the source and target water column microbiomes would be more similar because the target community would be strongly influenced by immigration from the source community. Conversely, when hydrologic connectivity was weak or absent, we expected inter-species interactions to be the dominant influence on microbiome membership, so that the source and target would become less similar over time.
To calculate connectivity strength using aqueous geochemistry (σg), we first standardized ion concentrations by their means and standard deviations and conducted a principal component analysis (PCA) on all major ions present, including sodium, chloride, calcium, magnesium, potassium, and sulfate. Analytical results included several outlying values for chloride and potassium, which were removed due to suspected contamination. To maintain a balanced dataset, we replaced the removed outliers by linearly interpolating between the values reported for the previous and subsequent weeks at the same site. We examined PCA eigenvalues and eigenvectors (Figure 1, Table S1) and, based on variable loadings, chose to retain two principal components (PCs) for further analysis, which represented two major water source components. At each sampling date, within the two-dimensional PC space (PC1 and PC2), the log-transformed Euclidean distance was calculated between the geochemical composition at a given target site and that at Inflow (i.e., the source site) (Eq. 1). This value was then rescaled to between 0 and 1 using min-max normalization and reversed to calculate a chemical similarity score (Eq. 2), as follows.
\(\text{ED}_{i}=\log\left(\sqrt{\left({PC1}_{s_{i}}-{PC1}_{t_{i}}\right)^{2}+\left({PC2}_{s_{i}}-{PC2}_{t_{i}}\right)^{2}}\right)\) (Eq. 1)
\(\sigma_{i}=1-\left(\frac{\text{ED}_{i}-\min\left(\text{ED}\right)}{\max\left(\text{ED}\right)-\min\left(\text{ED}\right)}\right)\) (Eq. 2)
Where EDi is the log-transformed Euclidean distance within the PCA space on a given sampling date, the subscripts si and ti refer respectively to PC scores at Inflow (i.e., the source) and at a target site, σi is the connectivity strength on a given sampling date, and ED is the set of EDi values across the complete dataset.
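As a worked illustration of Eqs. 1 and 2, the following Python sketch computes connectivity strength from paired PC scores. The function name and the input layout (one (PC1, PC2) tuple per source-target pair) are hypothetical, and the sketch assumes that no pair has identical source and target scores, since the logarithm of a zero distance is undefined.

```python
import math

def geochemical_connectivity(source_scores, target_scores):
    """Connectivity strength (Eq. 2) for each source-target pair.

    source_scores, target_scores: equal-length lists of (PC1, PC2) tuples,
    one entry per pair in the complete dataset (hypothetical layout).
    """
    # Eq. 1: log-transformed Euclidean distance in the PC1-PC2 plane.
    ed = [math.log(math.hypot(s1 - t1, s2 - t2))
          for (s1, s2), (t1, t2) in zip(source_scores, target_scores)]
    lo, hi = min(ed), max(ed)
    # Eq. 2: min-max rescaling, reversed so that small distances give values near 1.
    return [1.0 - (e - lo) / (hi - lo) for e in ed]
```

Because the minimum and maximum are taken over every value passed in, all pairs that should share a common scale need to be supplied in a single call.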
To calculate connectivity strength using microbiome membership (σm), we calculated a similarity score on each sampling date using the Bray-Curtis similarity index (BC) between microbiome membership at a given target site and at Inflow (i.e., the source), as follows (Eq. 3).
\(\text{BC}_{\text{st}}=\frac{2C_{\text{st}}}{S_{s}+S_{t}}\) (Eq. 3)
Where Cst is the sum, over all OTUs found at both sites, of the lower of the two counts for each OTU, Ss is the total number of sequence reads at Inflow, and St is the total number of sequence reads at the target site. We also conducted a principal coordinate analysis (PCoA) using the BC dissimilarity index (1 - BC) to visualize microbiome membership in lower-dimensional space (Figure 1c).
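Eq. 3 can be illustrated with a short Python sketch. The function name and the use of dictionaries mapping OTU identifiers to read counts are assumptions made for the example.

```python
def bray_curtis_similarity(source_counts, target_counts):
    """Bray-Curtis similarity (Eq. 3) between two OTU count tables.

    source_counts, target_counts: dicts mapping OTU id -> sequence read count
    (hypothetical layout). OTUs missing from a dict are treated as count 0.
    """
    # C_st: sum of the lower of the two counts for each shared OTU.
    shared = sum(min(count, target_counts.get(otu, 0))
                 for otu, count in source_counts.items())
    # S_s + S_t: total reads at the source plus total reads at the target.
    total = sum(source_counts.values()) + sum(target_counts.values())
    return 2.0 * shared / total
```

For example, with source counts {a: 10, b: 5} and target counts {a: 4, b: 5, d: 6}, the shared sum is 4 + 5 = 9, the total is 15 + 15 = 30, and the similarity is 18/30 = 0.6.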
To identify the relationship between Inflow stage and site-level connectivity, we fit natural cubic spline regressions between Inflow stage and connectivity strength at each site, for both geochemical and microbial metrics, using the splines package in R (R Core Team, 2016). As with relative stage (i.e., stage z-scores), because hysteresis was observed at two sites, we used only the peak-through-recession period for model fitting. At Inflow stages outside the range over which connectivity strength was measured in the field (i.e., at very high or very low stages), we assigned a constant connectivity strength equal to the mean of the values measured on the four sampling dates with the highest or lowest Inflow stage, respectively. Using these models, we then generated daily time series of connectivity strength at each site from the Inflow stage record for 2018.
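The out-of-range rule described above can be sketched in Python as follows. The spline fit itself is not reproduced: `in_range_model` is a hypothetical stand-in for any fitted stage-to-strength function, and the function and argument names are assumptions made for illustration.

```python
def predict_connectivity(stage, observations, in_range_model, n_extreme=4):
    """Connectivity strength for one daily Inflow stage value.

    observations: list of (inflow_stage, sigma) pairs from the sampling dates.
    in_range_model: callable mapping stage -> sigma within the sampled range
    (hypothetical stand-in for the fitted natural cubic spline).
    """
    ordered = sorted(observations)  # ascending by Inflow stage
    lowest_stage, highest_stage = ordered[0][0], ordered[-1][0]
    if stage < lowest_stage:
        # Below the sampled range: mean sigma of the n_extreme lowest-stage dates.
        values = [sigma for _, sigma in ordered[:n_extreme]]
        return sum(values) / len(values)
    if stage > highest_stage:
        # Above the sampled range: mean sigma of the n_extreme highest-stage dates.
        values = [sigma for _, sigma in ordered[-n_extreme:]]
        return sum(values) / len(values)
    return in_range_model(stage)
```

Applying this function to each value in a daily Inflow stage record would yield a daily connectivity strength series of the kind described, under the stated assumptions.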