RAD-tag assembly and SNP calling
Five RAD-seq derived catalogs (Figure S1; Table S3) were generated. Three of them included ABFT individuals and were either de novo assembled (‘nuclear de novo’) or mapped to the Pacific bluefin tuna nuclear (PBFT, Thunnus orientalis ) (Suda et al. 2019) or ABFT mitochondrial (accession number NC_014052) genomes (‘nuclear mapped’ and ‘mito’). The two others were mapped to the PBFT genome and included ABFT individuals and 4 Southern bluefin tuna (Thunnus maccoyii ), 4 albacore (Thunnus alalunga ) and 5 PBFT individuals (Díaz-Arce et al. 2016) (‘nuclear mapped + others’) or only ABFT larvae and 4 albacore individuals (‘nuclear mapped + ALB’). Both reference-mapped and de novo assembled catalogs were generated for testing possible bias introduced by the use of the reference genome from a closely related species, which is less fragmented than the one available for ABFT (Accession GCA_003231725). Three nuclear mapped catalogs were generated including different groups of individuals to maximize the number of informative markers included for each type of analysis. In order to avoid inclusion of kins in the resulting datasets, which could bias some population structure results, a genetic relatedness matrix using the GCTA toolbox (Yang et al. 2011) was generated using the genotypes obtained from the ‘nuclear mapped’ catalog (generated as described below), and only one individual (the one with the highest number of assembled RAD tags) of each resulting pair with relatedness higher than 0.1 was included in subsequent analyses. Reference-based assemblies were performed by mapping the quality-filtered reads to the corresponding reference genome using the BWA-MEM algorithm (Li 2013), converting the resulting SAM files to sorted and indexed BAM files using SAMTOOLS (Li et al. 2009) and filtering the mapped reads to include only primary alignments and correctly mate mapped reads. De novo assembly was performed using the ustacks, cstacks, sstacks and tsv2bam modules of Stacks version 2.3e with a minimum coverage depth of 3 reads per allele (this is, each of the two possible versions of one biallelic SNP variant), a maximum of 2 nucleotide mismatches between two alleles at a same locus and a maximum of 6 mismatches between loci (Rodríguez-Ezpeleta et al. 2019). For all mapped and de novo catalogs, SNPs were called using information from paired-end reads with the gstacks module of Stacks version 2.3e. For the ‘mito’ catalog only samples with no missing data for the three diagnostic positions used for detecting introgression were kept, and the heterozygous genotypes, considered to be related to sequencing or assembly errors, were removed. For the rest of the RAD catalogs, only samples with more than 25,000 RAD loci and SNPs contained in RAD-loci present in at least 75% of the ABFT (‘nuclear mapped’ and ‘de novo’) or in 75% of the individuals from each of the species included (‘nuclear mapped + others’ and ‘nuclear mapped + ALB’) were kept and exported into PLINK (Purcell et al. 2007) using thepopulations module of Stacks version 2.3e. Using only SNPs derived from read 1, increasing threshold values for minimum genotyping rate for individuals and SNPs were applied to obtain a final genotype table with a minimum genotyping rate of 0.95 and 0.85 per SNP and individual, respectively (except for the ‘nuclear mapped + ALB’ catalog for which thresholds were 0.95 and 0.90, respectively). SNPs were filtered using different minor allele frequency (MAF) thresholds considering sample sizes of the different datasets to exclude from the analysis rare non-informative variants that are susceptible to being derived from sequencing or assembly errors. For the ‘nuclear mapped’ and ‘nuclear de novo’ catalogs, SNPs with a MAF < 0.05 were removed, for the ‘nuclear mapped + others’ catalog, SNPs with MAF < 0.05 in ABFT and MAF < 0.25 in each of the other species were removed, and for the ‘nuclear mapped + ALB’ catalog, SNPs with a minimum allele count of two in ABFT and those variable within albacore were removed. For all nuclear catalogs, SNPs failing Hardy Weinberg equilibrium test at a p-value threshold of 0.05 in Mediterranean Sea larvae or Gulf of Mexico larvae groups were removed. Resulting genotype tables including all SNPs or only the first SNP per tag were converted to genepop, structure, PLINK, BayeScan, immanc, VCF and treemix formats using populations or PGDSpider version 2.0.8.3 (Lischer and Excoffier 2011) . From the ‘mito’ catalog, genotypes for the three diagnostic positions used for detecting introgression were extracted using PLINK (Purcell et al. 2007).