Spatial genetic patterns
We investigated spatial genetic patterns and restrictions to gene flow in A. cephalotes using the software ARLEQUIN version 3.5.2.2 (Excoffier & Lischer, 2010) to perform alternative hierarchical Analyses of Molecular Variance (AMOVA). First, the populations were classified into two regions (Pacific and Andean), separated by the western range of the Colombian Andes, to test whether this range acts as a major geographic barrier to gene flow that shapes the distribution of genetic variation in the data (isolation by barrier, IBB). Second, the populations were classified into three regions (Pacific, Andean 1, and Andean 2) defined by the climatic conditions mediated by the Andes mountains (isolation by environment, IBE; Supplementary table ST1). In the AMOVA, we estimated the associated FST for microsatellites and the genetic distance-based Φ-statistic for mtCOI (Excoffier, Smouse, & Quattro, 1992; Meirmans, 2006; Meirmans & Hedrick, 2011), both globally and between all pairs of populations. The significance of the variance components and of the associated FST and Φ indices was assessed using 10,000 permutations.
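The permutation logic behind such significance tests can be illustrated with a minimal sketch. This is not ARLEQUIN's AMOVA estimator: it assumes a simplified single-locus fixation index, (HT - HS) / HT, with sample-size weighting, and the function names (`expected_het`, `fst`, `permutation_p`) and toy data are hypothetical. Allele copies are shuffled among populations to build a null distribution against which the observed value is compared.

```python
import random


def expected_het(alleles):
    """Expected heterozygosity, 1 - sum(p_i^2), for a list of allele copies."""
    n = len(alleles)
    counts = {}
    for a in alleles:
        counts[a] = counts.get(a, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def fst(pops):
    """Simplified fixation index (H_T - H_S) / H_T for one locus.

    pops is a list of populations, each a list of allele copies.
    H_S is the sample-size-weighted mean within-population heterozygosity.
    """
    pooled = [a for pop in pops for a in pop]
    h_t = expected_het(pooled)
    if h_t == 0:
        return 0.0  # monomorphic locus: no variation to partition
    n = len(pooled)
    h_s = sum(len(pop) / n * expected_het(pop) for pop in pops)
    return (h_t - h_s) / h_t


def permutation_p(pops, n_perm=10000, seed=1):
    """Observed index and one-sided permutation p-value.

    Allele copies are shuffled among populations while keeping the
    population sizes fixed (an assumed, simplified permutation scheme).
    """
    rng = random.Random(seed)
    observed = fst(pops)
    pooled = [a for pop in pops for a in pop]
    sizes = [len(pop) for pop in pops]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if fst(shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so p is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_p([["A"] * 10, ["B"] * 10], n_perm=1000)` on two toy populations fixed for different alleles returns an index of 1.0 together with a small p-value, whereas two populations with identical allele frequencies give an index of 0.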
We further investigated the structure of the A. cephalotes nuclear data using two clustering analyses. First, we used a model-based Bayesian clustering method implemented in the software STRUCTURE v. 2.3.4 (Pritchard, Stephens, & Donnelly, 2000), which estimates the number of genetic clusters (K) independently of spatial sampling. Analyses were performed using one individual per nest, with and without admixture, for correlated allele frequencies. A burn-in of 50,000 generations followed by 500,000 sampling generations was used for K ranging from 1 to 12, with 10 iterations for each value of K. Evanno's method (Evanno, Regnaut, & Goudet, 2005), implemented in STRUCTURE HARVESTER (Earl & vonHoldt, 2012), was used to estimate the optimal number of clusters from the STRUCTURE output (Supplementary figure SF1), and the results were visualized using CLUMPAK (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015).
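As an illustration of how the Evanno statistic selects K, the following sketch computes ΔK as the absolute second-order difference of the mean ln P(D) across replicate runs, divided by the standard deviation of ln P(D) at that K. It assumes this mean-based formulation; the function name `delta_k` and the input layout (a dictionary mapping each K to its replicate ln P(D) values) are hypothetical and do not reproduce STRUCTURE HARVESTER's file parsing.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate ln P(D) values.

    lnp_by_k maps each K to a list of ln P(D) values, one per replicate run.
    Delta K is defined only for K values that have both neighbours (K - 1 and
    K + 1) and a non-zero standard deviation across replicates.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # first and last K have no second-order difference
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / sd
    return result


def best_k(lnp_by_k):
    """K with the largest delta K."""
    dk = delta_k(lnp_by_k)
    return max(dk, key=dk.get)
```

With toy values in which ln P(D) rises steeply from K = 1 to K = 2 and then plateaus, `best_k` returns 2. Because ΔK is undefined at the smallest and largest K tested, this criterion cannot select K = 1.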
Second, we performed a discriminant analysis of principal components (DAPC), which applies a principal component analysis (PCA) prior to the discriminant analysis (DA) (Jombart, Devillard, & Balloux, 2010). The DA partitions genetic variation, maximizing differences between clusters while minimizing within-cluster variation. The DAPC was run in the R package ADEGENET (Jombart et al., 2010) using one individual per nest for microsatellite data and a matrix of polymorphic nucleotide positions for mtCOI. The function 'dapc' was used to estimate all available principal components (PCs) and to determine the optimal number of PCs to retain based on cumulative variance.
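Choosing the number of PCs from cumulative variance can be sketched as follows. This is not the ADEGENET implementation: the function name `n_pcs_for_variance` and the threshold argument are hypothetical, and the text above does not state which proportion of variance was used as a cutoff. Given PCA eigenvalues, the sketch returns the smallest number of leading PCs whose cumulative share of the total variance reaches the chosen threshold.

```python
def n_pcs_for_variance(eigenvalues, threshold):
    """Smallest number of leading PCs whose cumulative variance share
    reaches `threshold` (a proportion between 0 and 1).

    eigenvalues are the PCA eigenvalues; they are sorted in decreasing
    order here, so the input order does not matter.
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    cumulative = 0.0
    for i, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return i
    return len(ordered)  # reached only through floating-point rounding
```

For eigenvalues 5, 3, 1, and 1, the first two PCs carry 80% of the variance, so a threshold of 0.75 retains two PCs and a threshold of 0.95 retains all four. Retaining too many PCs risks overfitting the discriminant functions, which is why a cumulative-variance cutoff is usually combined with some form of validation.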