Application 1: lcWGS data simulated from distinct demographic backgrounds and sequencing coverage
Down-sampling, trimming, and mapping of the simulated data resulted in 100% of the simulated chromosome being covered by ≥10 reads per population in the 0.33x coverage datasets and by ≥40 reads in the 1x coverage datasets. The two demographic scenarios simulated by Lou et al. (2021) produced notably different numbers of segregating variants: 158,746 variants with MAF ≥ 0.001 (of 245,412 total variants) in the high background FST scenario and 789,423 variants with MAF ≥ 0.001 (of 1,209,625 total variants) in the low background FST scenario. PoolParty2 and angsd differed considerably in the dynamics of variant discovery, particularly in the identification of ‘real’ (simulated) variants. Both PoolParty2 and angsd recovered more sites (all of which must have MAF ≥ 0.001) in the datasets with greater coverage. However, the proportion of recovered sites that were ‘real’ was similar across coverages for PoolParty2 (≥ 99.5%), whereas it changed with coverage for angsd (Table 1). In addition, angsd recovered sites with ‘real’ proportions similar to those of PoolParty2 only at the highest significance thresholds, and then in lower numbers.
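The proportion of ‘real’ sites discussed above can be illustrated with a minimal sketch. It assumes that called and simulated variants are each available as (chromosome, position) pairs; the function name, the returned field names, and the example coordinates are hypothetical and are not taken from PoolParty2 or angsd.

```python
def real_site_summary(called_sites, true_sites):
    """Summarize how many called sites match simulated ('real') variants.

    Both arguments are iterables of (chromosome, position) tuples.
    This is an illustrative helper, not part of either pipeline.
    """
    called = set(called_sites)
    truth = set(true_sites)
    n_real = len(called & truth)  # called sites that were actually simulated
    return {
        "n_called": len(called),
        "n_real": n_real,
        # Proportion of called sites that are 'real'
        "prop_real": n_real / len(called) if called else float("nan"),
        # Proportion of simulated variants that were recovered
        "prop_recovered": n_real / len(truth) if truth else float("nan"),
    }


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only
    called = [("chr1", 10), ("chr1", 20), ("chr1", 30), ("chr1", 40)]
    truth = [("chr1", 10), ("chr1", 20), ("chr1", 30), ("chr1", 50), ("chr1", 60)]
    print(real_site_summary(called, truth))
```

Keeping the two proportions separate reflects the distinction drawn in the text between how many sites a method recovers and how many of those sites are ‘real’.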
Across coverage depths, allele frequencies estimated by PPalign were always more accurate than those estimated by angsd, although only modestly (Figure 2, Supplemental Figure 2). Allele frequency estimates generally improved with depth for both PoolParty2 and angsd, despite the large decrease in the number of sites passing increasing depth thresholds; the improvement was more notable for angsd when depth was estimated by angsd rather than by PoolParty2. Indeed, angsd and PoolParty2 increasingly disagreed about depth as the depth threshold increased (the correlation between their depth estimates decreased), suggesting that the read filter applied by angsd is more stringent than that applied by PoolParty2, even with apparently similar parameter values. Nonetheless, correlations between estimated and true allele frequencies ranged from approximately 80% to 90% for the lower coverage datasets and from 90% to 95% for the higher coverage datasets, with PoolParty2 slightly higher than angsd in each case. Some loss of accuracy was expected from sampling variance (which individuals and reads were sampled for each dataset), which limits how well allele frequencies can be estimated regardless of the efficacy of each analysis; each analysis is therefore slightly more accurate than the reported values imply. This is reflected in the observation that allele frequencies estimated by PoolParty2 and angsd were always more strongly correlated with each other than with the ‘real’ allele frequencies (87-98%; data not shown).
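The comparison of estimated and true allele frequencies across depth thresholds can be sketched as follows. This is a hedged illustration only: it assumes a Pearson correlation and a per-site record of (depth, estimated frequency, true frequency), neither of which is specified in the text, and all names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation undefined when one input is constant
    return sxy / math.sqrt(sxx * syy)


def accuracy_by_depth(sites, thresholds):
    """Return {threshold: (n_sites_kept, correlation)} for each depth threshold.

    `sites` is a list of (depth, estimated_af, true_af) tuples (assumed layout).
    """
    out = {}
    for t in thresholds:
        kept = [(est, tru) for depth, est, tru in sites if depth >= t]
        if len(kept) < 2:
            out[t] = (len(kept), None)  # too few sites to correlate
            continue
        est, tru = zip(*kept)
        out[t] = (len(kept), pearson_r(est, tru))
    return out
```

Reporting the number of retained sites alongside each correlation makes the trade-off described above visible: accuracy may rise with the depth threshold while the number of usable sites falls.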
Both analytical suites provided results that allowed visual identification of most, if not all, of the simulated outlier regions, particularly in the sliding window FST, Local Score, and linkage outlier results (Figure 3). Results provided by PoolParty2 and angsd for FST, sliding window FST, and FET (PoolParty2) or frequency test (angsd -doAssoc 1) were roughly equivalent (Supplemental Figures 3-6). The score and hybrid latent-score tests from angsd (-doAssoc 2 and 5) failed to produce any significant results. Both PoolParty2 and angsd had more difficulty providing results that unambiguously identified outlier regions for the high background FST scenario at lower coverage, and even at higher coverage, outliers were less obvious than in either of the low background FST scenario datasets. The analyses designed to provide less ambiguous identification of outlier regions, Local Score and linkage outliers identified above twice the IQR, also varied in efficacy with demography and coverage (Table 2). For Local Score, peaks corresponding to the outlier regions were clearly visible in plots of smoothed FET significance, but it was more difficult to determine significance thresholds that effectively identified the outlier regions with high background FST; this test did not appear to be constrained significantly by coverage in the low background FST scenario. Moreover, in the high background FST scenario, there was no obvious inverse relationship between smoothing value (ξ) and power, as some replicates with higher ξ values identified more outlier regions at p ≤ 0.05, whereas an inverse relationship was apparent in the low background FST scenario. As expected, the identified regions became narrower with increasing ξ values. In contrast, our identification of outliers using linkage was more constrained by coverage, with lower efficacy in lower coverage datasets regardless of background FST.
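The "twice the IQR" criterion used for linkage outliers can be illustrated with a minimal sketch. It assumes the cutoff is the third quartile plus two times the interquartile range, which is one common reading and is not confirmed by the text; the function name, the multiplier argument `k`, and the toy values are hypothetical.

```python
import statistics


def iqr_outliers(values, k=2.0):
    """Flag values above Q3 + k * IQR (assumed form of the threshold).

    Returns (threshold, indices of values exceeding the threshold).
    """
    # Quartiles of the observed window-level statistics
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    threshold = q3 + k * (q3 - q1)
    flagged = [i for i, v in enumerate(values) if v > threshold]
    return threshold, flagged


if __name__ == "__main__":
    # Toy window-level linkage values, invented for illustration only
    window_values = [1, 2, 3, 4, 5, 6, 7, 8, 30]
    print(iqr_outliers(window_values))
```

Because the quartiles are computed from the data themselves, a noisier background (for example, at lower coverage) raises the cutoff, which is consistent with the lower efficacy of this approach reported for lower coverage datasets.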
Notably, the region affected by hitchhiking was narrower at lower coverage in both scenarios, and the outlier regions estimated by the Local Score analyses were similarly narrower across ξ values at lower coverage in the low background FST scenario. Importantly, none of the analyses that considered broad-range divergence or significance (windowed FST, Local Score, windowed linkage) identified any false positive outlier regions.
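The effect of the smoothing value ξ on Local Score peaks can be sketched as follows. This assumes the Local Score is computed as a Lindley process on per-site scores of -log10(p) - ξ, following the commonly used formulation of the method; the text does not state the exact implementation, and the function name and p-values below are hypothetical.

```python
import math


def lindley_local_score(p_values, xi):
    """Cumulative Lindley process over per-site scores (-log10(p) - xi).

    Assumes every p-value is strictly greater than zero.
    """
    running = 0.0
    path = []
    for p in p_values:
        score = -math.log10(p) - xi
        # The process accumulates signal but is reset to zero when it goes negative
        running = max(0.0, running + score)
        path.append(running)
    return path


if __name__ == "__main__":
    # Toy p-values along a chromosome, invented for illustration only
    p_values = [0.5, 0.001, 0.0001, 0.5, 0.5]
    for xi in (1.0, 2.0):
        print(xi, lindley_local_score(p_values, xi))
```

In this toy example, the larger ξ yields a lower peak that returns to zero sooner, which illustrates why higher ξ values are expected to give narrower regions and, in general, reduced power.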