Genetic diversity and population structure estimates
The following analyses were performed on the ‘nuclear mapped’ and
‘nuclear de novo’ datasets including only the first SNP per tag.
Genome-wide average and per-SNP pairwise FST values were
calculated using GENEPOP (Raymond 1995),
either including all individuals or only larvae. Significance (p
< 0.05) of FST values was estimated by
performing 10,000 permutations. Principal Component Analysis (PCA) was
then performed using the adegenet R package
(Jombart and Ahmed 2011) to illustrate
the main axes of genetic variation among individuals. The number and
nature of distinct genetic clusters were investigated using the
model-based clustering method implemented in ADMIXTURE
(Alexander et al. 2009), assuming 2 to
5 ancestral populations (K) and 5,000 bootstrap replicates. For each
value of K, a preliminary ADMIXTURE run was used to record the number of
steps needed to reach the default likelihood convergence criterion of
0.001. This number was then used to set the “-c” parameter (the number
of steps to be completed in each bootstrapped run) so that every
bootstrapped run would converge (from 20 to 100 steps, depending on the
analysis). The
value of K (ranging from 2 to 10) with the lowest associated error was
identified using ADMIXTURE’s cross-validation procedure. The convertf
program from the ADMIXTOOLS software
(Patterson et al. 2012) was used to
convert from PLINK to EIGENSTRAT format, and the qp3Pop program was used
to calculate the F3 statistic and associated Z-scores
(Patterson et al. 2012), testing all
possible admixture scenarios with samples grouped separately by
location and age class (Table S5) in the ‘nuclear mapped’ catalog
dataset.
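
As a rough illustration of what the F3 test measures, the following Python sketch computes a naive f3(C; A, B) as the mean over SNPs of (c - a)(c - b), where a, b and c are allele frequencies in the two candidate source groups and the target group, and derives a Z-score from a block jackknife. It is not the qp3Pop implementation: it omits the finite-sample bias correction described by Patterson et al. (2012), and the function name, the number of jackknife blocks and the simulated frequencies are assumptions made only for this example.

```python
import numpy as np


def f3_with_zscore(c, a, b, n_blocks=10):
    """Naive f3(C; A, B) and its block-jackknife Z-score.

    c, a, b: per-SNP allele frequencies in target C and sources A and B.
    n_blocks: number of contiguous SNP blocks (hypothetical default).
    """
    c, a, b = (np.asarray(x, dtype=float) for x in (c, a, b))
    per_snp = (c - a) * (c - b)          # per-SNP contribution to f3
    n = per_snp.size
    estimate = per_snp.mean()

    # Leave-one-block-out estimates over contiguous blocks of SNPs.
    blocks = np.array_split(np.arange(n), n_blocks)
    total = per_snp.sum()
    leave_one_out = np.array(
        [(total - per_snp[idx].sum()) / (n - idx.size) for idx in blocks]
    )

    # Standard (unweighted) jackknife standard error.
    deviations = leave_one_out - leave_one_out.mean()
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum(deviations ** 2))
    return estimate, estimate / se


# Simulated example (assumed data): C is a 50/50 mixture of A and B.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.05, 0.95, size=500)
freq_b = rng.uniform(0.05, 0.95, size=500)
freq_c = 0.5 * (freq_a + freq_b)

f3, z = f3_with_zscore(freq_c, freq_a, freq_b)
print(f"f3 = {f3:.4f}, Z = {z:.2f}")
```

In this simulated case the target is an exact mixture of the two sources, so every per-SNP term is non-positive and both f3 and its Z-score come out negative, which is the signal of admixture that the test looks for.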