Statistical analyses
At each plot and for each depth, we measured alpha-diversity of Bacteria, Eukaryota, Mycota, Collembola, Insecta and Oligochaeta, through Hill numbers. The joint use of different Hill numbers allows to obtain biodiversity measures in metabarcoding studies that are robust to bioinformatic treatments and other methodological choices (Alberdi & Gilbert, 2019; Calderón‐Sanou et al., 2020; Mächler, Walser, & Altermatt, 2021). We used the parameters q = 0 and q = 1 in thehill_taxa function of the hillR package (Chao, Chiu, & Jost, 2014). Values of q = 0 returns the taxonomic richness and are insensitive to MOTUs frequency, while q = 1 returns the exponential of the Shannon diversity, and limits the weight of rare MOTUs (Alberdi & Gilbert, 2019). We could not use values of q > 1 because they cannot be applied to communities with richness = 0.
We used univariate Bayesian Generalized Linear Mixed Models (GLMMs) to assess the variation in alpha-diversity of Bacteria, Eukaryota, Mycota, Collembola, Insecta and Oligochaeta with time since glacier retreat and depth. We ran mixed models with the alpha-diversity of each sample (log-transformed) as dependent variable, and used a Gaussian error to attain normality of model residuals, considering both the parameters q = 0 and q = 1. As independent variables, we considered time since glacier retreat (log-transformed and scaled: mean = 0, SD = 1) and depth. We included glacier identity and plot identity as random factors. These models also included interactions between depth and time since glacier retreat, to test the hypothesis that depth affects the colonization rate of the studied groups. We used the widely applicable information criterion (WAIC) to compare models with and without interactions (Gelman, Hwang, & Vehtari, 2013). Models using log-normal error distribution and un-transformed alpha diversity values yielded identical results.
Soil nutrient content changes at different stages of soil development in deglaciated areas, with the amount of organic carbon increasing through time (Khedim et al., 2021), potentially influencing alpha diversity of soil communities (Guo et al., 2018). We therefore re-analysed the pattern of alpha diversity using organic carbon as an independent variable, instead of age since glacier retreat. This analysis was run for a subset of samples (N = 276) for which data of total organic carbon content were obtained by elemental analysis (Organic Elemental Analyzer, model Flash 2000, Thermo Fisher; Khedim et al., 2021; Lacchini 2020). Organic carbon was strongly related to age since glacier retreat (GLMM with organic carbon as dependent variable and age as independent variable:R 2 = 0.66) thus it was impossible to include organic carbon and age together in the same model. Organic carbon data were representative of the overall soil core (0-20 cm); thus, the analysis did not allow testing the role of variation in carbon content between surface and deep layers. Two plots (i.e. four samples in total) were excluded from this analysis because no soil data were available.
For each plot, we estimated the beta-diversity between the two soil depths based on incidence data. We used the beta.multi function of the betapart package with the Sorensen family (Baselga & Orme, 2012). This function partitions the total beta diversity (beta.SOR) into its nestedness (beta.SNE) and turnover (beta.SIM) components, reflecting the species gain/loss and replacement, respectively (Baselga, 2010). We excluded plots having zero MOTUs in at least one depth, given that the formula of Baselga’s partitioning retrieves undefined values of nestedness and turnover when one of the compared communities has no taxon (Baselga, 2010). For each taxonomic group, we built GLMMs to test the hypothesis that beta diversity between the two soil depths decreases with time since glacier retreat. We ran mixed models with rescaled indices (Smithson & Verkuilen, 2006) to avoid fixed zeros and ones, using a beta distribution, and included glacier identity as a random factor. Models for beta diversity were limited to plots with at least one detected MOTU in both depths. We then repeated the analyses for the turnover and nestedness components of beta diversity. We ran all generalized mixed models with three MCMC chains, 5,000 iterations and a burn-in of 5,000 in the brms R package (Bürkner, 2017). After processing, c-hat values were always <1.01, indicating convergence.
To visualize the variation of the structure of belowground communities across different stages of soil development, we used distance-based Principal Component Analysis (db-PCA). We calculated distance among samples using the Hellinger distance that allows us to control for the double zero issue (Legendre & Legendre, 2012). Prior to ordination, count data were normalized with a shift-log transformation in order to stabilize extreme values and variance inflation. As for the beta-diversity analysis, we removed plots having zero MOTUs in at least one depth. To test for differences in communities across time, depth and their potential interaction, we performed a permutational multivariate analysis of variance (PERMANOVA) using the adonis function of thevegan package (Oksanen et al., 2019) with glaciers as strata and permutations set to 9999. Time was log-transformed. Results of PERMANOVA can be sensitive to differences in multivariate dispersion (Anderson, 2001), therefore we computed the homogeneity of variance among groups (Anderson, 2006) and tested for its significance by permutations (n = 9999). We used data visualization in ordination plots to support the interpretation of the statistical tests.
Finally, we used the indicator value (IndVal; Dufrêne & Legendre 1997) approach to identify MOTUs that were characteristic for particular stages of soil development and/or soil depth. Prior to the analysis, we grouped samples into three classes based on the years since glacier retreat (i.e., < 40 years; 40-95 years and > 95 years) and depth (surface/deep), for a total of six environmental classes. Metabarcoding approaches can lead to a very large number of MOTUs. Thus, for this analysis we only retained MOTUs with a relative abundance > 0.1% for each taxonomic group. We computed the IndVal index using the indicspecies package (De Cáceres & Legendre, 2009). For a given taxon, the IndVal index is based on its specificity (i.e., the concentration of abundance) and its fidelity (i.e., the relative frequency) within a class. Each MOTU could be an indicator of at maximum two environmental classes (De Cáceres, Legendre, & Moretti, 2010), so that a MOTU could result as indicator of e.g. one or both depths at a given soil stage, or of one or two consecutive stages at a given soil depth. This choice allowed keeping the number and the ecological meaningfulness of the combinations reasonable. The significance of indicator values was tested through random permutations (n = 9999) and p -values were adjusted for multiple comparison tests using the FDR method (Benjamini & Hochberg, 1995). We used the packages ggplot2 (Wickham, 2016), ggpubr (Kassambara, 2020), phyloseq (McMurdie & Holmes, 2013) and vegan(Oksanen et al., 2019) for multivariate statistical analyses and visualization.