Disentangling sources of population structure
The population structure of a species reflects both historical
divergence processes and contemporary gene flow among subpopulations,
and it requires careful analysis to disentangle these evolutionary
processes (Nielsen, & Slatkin, 2007; Schönswetter & Schneeweiss 2019).
Summary statistics, such as Wright’s F ST, are
useful for describing genetic drift and gene flow under equilibrium
models, but provide little information about deeper evolutionary events
that shape patterns of genetic variation. Clustering based algorithms,
which infer individual ancestry components according to a model, are
also challenging to interpret when population divergence is continuous
in space (François & Durand, 2010; but see Bradburd et al. 2018).
Coalescent approaches leverage the distribution of gene genealogies to
separately estimate the population parameters, including migration rate
and divergence time, and can be used to distinguish models of migration
from divergence (Nielsen & Wakeley, 2001). However, coalescent
approaches are often difficult to implement when evolutionary history is
spatially complex, as the number of free parameters grows rapidly
(Beerli, 2009).
In this study, we leveraged aspect of each approach to disentangle the
evolutionary history of the N. ingens complex. First, we used the
clustering algorithm sNMF to identify population structure, though we
found that the log likelihood increased with an increasing number of
clusters. A large shift in K occurred as different morphotypes were
distinguished genetically (K=3), but models K=4 to K=6 had improved log
likelihood scores and revealed further population subdivision reflecting
major drainage systems in the Sierra Nevada. More subtle patterns of
admixture among drainages continued to emerge as the number of clusters
increases (Figure S3 ), but this appears to be the product of
gene flow and uneven sampling (continuous population structure). Indeed,
by explicitly comparing spatial and non-spatial cluster models using
conStruct, we found that the spatial models are generally more accurate
than the non-spatial models (Figure 5 ), indicating the presence
of isolation by distance among sample sites. Analysis of pairwiseF ST values among populations also strongly
supports gradual genetic divergence with geographical distances
(R2 = 0.35; Figure 7 ).
We further assessed patterns of population divergence using a model
fitting framework based on coalescence theory (Rougemont & Bernatchez,
2018). We compared ANCOVA regressions on the paired populations’ TMRCA
for different levels of population structure. For TMRCA, K=6 and K=5
show the best AIC, BIC, and results from the Vuong’s test. Although K=6
and K=5 are indistinguishable owing to the lack of multiple population
samples from the Merced drainage, both support the model of drainage
associated glacial refugia (Table 1 ).