Sequence alignment and variant calling
Genomic sequences were analyzed in parallel using two pipelines. (1) The
first pipeline trimmed the reads using TrimGalore version 0.3.3,
aligned them using Bowtie2 (Langmead & Salzberg), and filtered the
alignments using SAMtools (Li et al. 2009) to retain only reads that
mapped concordantly and only once, and to remove PCR duplicates. Variants
were called for each alignment using HaplotypeCaller, combined into a
common variant file with GenotypeGVCFs and further quality filtered with
VariantFiltration
(QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 ||
ReadPosRankSum < -8.0) in the Genome Analysis Toolkit (GATK)
(McKenna et al. 2010; Van der Auwera et al. 2013). (2) The second
pipeline trimmed the sequences using
Trimmomatic (Bolger et al. 2014), mapped the reads using
BWA-MEM (Li 2013), filtered the alignments using SAMtools, called
variants using BCFtools call, and filtered them with
BCFtools view. For both pipelines, the sequence data were
mapped to the genome of the monokaryotic Serpula lacrymans isolate S7.3,
version 2 (Eastwood et al. 2011). Repetitive regions of the
S. lacrymans v. 2 genome were annotated using the REPET package
v. 2.5 (Flutre et al. 2011), following the procedure outlined in
Sipos et al. (2017). Regions annotated as transposable
element-derived were filtered from the SNP data set using
BEDtools (Quinlan & Hall 2010). Retaining only SNPs called by both
pipelines yielded a combined data set of 419,196 high-quality SNPs for
the 36 isolates included in this analysis.
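
The filtering logic described above can be illustrated with a minimal Python sketch. It is not the code used in the study, where these steps were carried out with GATK VariantFiltration, BEDtools, and BCFtools. It only restates three ideas in plain code: the hard-filter thresholds quoted in the text, the exclusion of SNPs that fall inside annotated repeat intervals, and the retention of SNPs called by both pipelines. All function names, the dictionary and tuple layouts, and the scaffold names are hypothetical. Skipping annotations that are absent from a record is an assumption of this sketch, as is the use of BED-style 0-based, half-open coordinates for the repeat intervals.

```python
from bisect import bisect_right

# Thresholds copied from the filter expression in the text.
# A variant fails if ANY of these conditions is true.
HARD_FILTERS = [
    ("QD", lambda v: v < 2.0),
    ("FS", lambda v: v > 60.0),
    ("MQ", lambda v: v < 40.0),
    ("MQRankSum", lambda v: v < -12.5),
    ("ReadPosRankSum", lambda v: v < -8.0),
]


def fails_hard_filter(info):
    """Return True if any quoted threshold is tripped.

    `info` is a hypothetical dict mapping annotation name -> float.
    Assumption: annotations missing from `info` are skipped.
    """
    return any(name in info and test(info[name]) for name, test in HARD_FILTERS)


def build_index(bed_intervals):
    """Sort and merge (chrom, start, end) intervals per chromosome.

    Returns chrom -> (starts, ends), two parallel lists describing
    non-overlapping intervals in ascending order.
    """
    merged_by_chrom = {}
    for chrom, start, end in sorted(bed_intervals):
        merged = merged_by_chrom.setdefault(chrom, [])
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent interval: extend the previous one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return {
        chrom: ([s for s, _ in ivs], [e for _, e in ivs])
        for chrom, ivs in merged_by_chrom.items()
    }


def in_repeat(index, chrom, pos):
    """True if the 1-based position `pos` lies in a 0-based, half-open interval."""
    starts, ends = index.get(chrom, ([], []))
    zero_based = pos - 1
    i = bisect_right(starts, zero_based) - 1  # last interval starting at or before pos
    return i >= 0 and zero_based < ends[i]


def consensus_snps(calls_a, calls_b, repeat_index):
    """SNP keys (chrom, pos, ref, alt) found in both call sets and outside repeats."""
    shared = set(calls_a) & set(calls_b)
    return sorted(k for k in shared if not in_repeat(repeat_index, k[0], k[1]))


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    repeats = build_index([("scaffold_1", 100, 200), ("scaffold_1", 150, 300)])
    pipeline_1 = {("scaffold_1", 50, "A", "G"), ("scaffold_1", 250, "C", "T"),
                  ("scaffold_2", 10, "G", "A")}
    pipeline_2 = {("scaffold_1", 50, "A", "G"), ("scaffold_1", 250, "C", "T"),
                  ("scaffold_1", 400, "T", "C")}
    print(consensus_snps(pipeline_1, pipeline_2, repeats))
    # prints [('scaffold_1', 50, 'A', 'G')]
```

In the toy example, the SNP at position 250 is called by both pipelines but lies inside the merged repeat interval, so only the SNP at position 50 is kept. Because the intersection and the repeat exclusion are both set operations, the order in which they are applied does not change the result.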