Lineage characterisation and meta-phylogeographical patterns
Similarity clusters of 3% and 15% were used: 3% clusters are considered
a proxy for species and are hereafter referred to as “OTUs”, while 15%
clusters are lineages of one or more species and are hereafter referred
to as “15% lineages”. We evaluated the genetic
diversity, distribution, and degree of habitat specificity for each OTU
and 15% lineage. We then tested the relative roles of the habitat and
the geographical distance in the diversification of soil fauna within
the island. The number of haplotypes was recorded as a measure of the
genetic richness of each OTU, and OTUs were classified as “single
haplotype” or “multiple haplotypes”. At the level of 15% lineages and
under the assumption that each arises from a single colonisation of
Tenerife, the number of OTUs within each 15% lineage was used to
classify each lineage as “non-diversified” or “diversified” according to
whether it included one or multiple OTUs within the island. A BLAST
search (blastn -outfmt 5 -evalue 0.001) against a reference
library including all sequences on BOLD (database downloaded on
3-07-2020), together with COI sequences from southern Iberia (Arribas
et al., 2020) and COI Collembola sequences from outside the Canary
Islands (Cicconardi et al., 2017), was used to classify OTUs
as “non-endemic” if similarity with non-Canarian sequences was ≥97%,
and “likely introduced” if the similarity was ≥99%.
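As an illustration only, the classification rules described above can be written as a few small functions. The sketch below is in Python rather than the R and BLAST tooling used in the study. The function names, and the representation of a BLAST result as a single best percent identity per OTU, are assumptions made for the example and are not part of the original workflow.

```python
def classify_otu_richness(n_haplotypes):
    """Label an OTU by its number of haplotypes."""
    if n_haplotypes < 1:
        raise ValueError("an OTU must contain at least one haplotype")
    return "single haplotype" if n_haplotypes == 1 else "multiple haplotypes"


def classify_lineage(n_otus):
    """Label a 15% lineage by the number of OTUs it contains on the island."""
    if n_otus < 1:
        raise ValueError("a lineage must contain at least one OTU")
    return "non-diversified" if n_otus == 1 else "diversified"


def classify_origin(best_identity):
    """Apply the two similarity thresholds to one OTU.

    best_identity is assumed to be the highest percent identity (0-100)
    between the OTU and any non-Canarian reference sequence, or None when
    the search returned no hit.
    Returns a pair of booleans: (non_endemic, likely_introduced).
    """
    if best_identity is None:
        return (False, False)
    return (best_identity >= 97.0, best_identity >= 99.0)
```

Under these rules an OTU with a best non-Canarian match of 98.1% is flagged as non-endemic but not as likely introduced, while any OTU at or above 99% receives both flags.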
To explore OTU and 15% lineage distributions, the number of sampling
sites with a presence (number of occurrences), the maximum geographical
distance of occurrences, and the different habitats with occurrences
were recorded for each OTU and 15% lineage, the latter summarised using
Venn diagrams. Habitat specificity was estimated for each entity using
the proportion of occurrences in a particular habitat, considering those
with 80-100% of occurrences in one habitat as entities with high
habitat specificity. Estimations of habitat specificity were performed
for those entities sampled in n or more sites, with n = 3,
4, 5, and 6. Finally, we explored the structure of genetic diversity for
each OTU and 15% lineage for which the product of its number of sites
and its number of haplotypes was ≥ 15. First, we tested the relationship
between genetic distance (F84 model) and geographic distance (Euclidean
distance between sampling sites). The relationship between the two
distances was assessed by fitting a linear model with the glm function
(link = “identity”), randomising spatial distances 1000 times, and
computing the proportion of times in which the model deviance was
smaller than the randomised model deviance, as in Gómez-Rodríguez
& Baselga (2018). Geographic distances were calculated using the R
package gdistance as before, with calculations performed for each
pair of sites and with the upper and lower limits of permitted movement
set to the higher elevation of the two sites plus 100 metres and the
lower elevation minus 400 metres, respectively. We applied these
restrictions to prevent shortest paths from crossing unfavourable
habitats over the top of the island, while still allowing paths to cross
the valley separating the central region of Tenerife from the Anaga
peninsula and facilitating connectivity over cliffs separating coastal
sites. In addition, we
tested the correlation between genetic distance (F84 model) among
haplotypes and their distribution in the four habitats, using
permutational ANOVAs with 999 permutations and the habitat as a grouping
factor. To graphically summarise patterns of haplotype relatedness and
habitat association, we estimated and plotted haplotype networks for all
15% lineages including four or more haplotypes using the function mjn
of the R package pegas (Paradis, 2010). For 15% lineages with more than
40 haplotypes (four cases), the mjn function could not be applied, and
networks were instead estimated with the haploNet function, which uses
an infinite-sites model and uncorrected distances.
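Two of the calculations described in this section, the habitat specificity estimate and the randomisation test of genetic against geographic distance, can be sketched as follows. This is a minimal Python illustration, not the R code used in the study. It assumes that both distance matrices are square, symmetric, and indexed by the same samples, that randomisation is done by permuting sample labels of the geographic matrix, and that the deviance of a Gaussian model with identity link equals the residual sum of squares. All function names and the habitat labels in the usage note are hypothetical.

```python
import random
from collections import Counter


def habitat_specificity(occurrence_habitats, min_sites=3, threshold=0.8):
    """Proportion of occurrences in the most frequent habitat.

    occurrence_habitats holds one habitat label per site where the entity
    occurs. Returns None for entities found in fewer than min_sites sites,
    otherwise (dominant_habitat, proportion, is_highly_specific).
    """
    if len(occurrence_habitats) < min_sites:
        return None
    habitat, count = Counter(occurrence_habitats).most_common(1)[0]
    proportion = count / len(occurrence_habitats)
    return habitat, proportion, proportion >= threshold


def upper_triangle(matrix):
    """Flatten the pairwise values above the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def gaussian_deviance(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def randomisation_test(genetic, geographic, n_random=1000, seed=0):
    """Compare the observed deviance with deviances from permuted geography.

    Returns (observed_deviance, proportion), where proportion is the share
    of randomisations in which the observed deviance was smaller than the
    randomised one.
    """
    rng = random.Random(seed)
    y = upper_triangle(genetic)
    observed = gaussian_deviance(upper_triangle(geographic), y)
    n = len(geographic)
    smaller = 0
    for _ in range(n_random):
        order = list(range(n))
        rng.shuffle(order)  # permute sample labels of the geographic matrix
        permuted = [[geographic[i][j] for j in order] for i in order]
        if observed < gaussian_deviance(upper_triangle(permuted), y):
            smaller += 1
    return observed, smaller / n_random
```

With hypothetical habitat labels, habitat_specificity(["A", "A", "A", "A", "B"]) reports habitat "A" at a proportion of 0.8, which meets the 80% threshold. Permuting sample labels rather than individual pairwise values keeps the dependence structure of the distance matrix intact, which is the usual motivation for this kind of test.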
Results