DISCUSSION
The number of DNA barcodes for oak-feeding Lepidoptera is lower in southern Europe, despite the higher species richness there. As expected, the effect of geographical scale on genetic divergence depended on latitude. In pairwise sequence comparisons, for any given spatial distance the genetic divergence was higher when at least one of the sequences came from one of the southern European peninsulas included in the study (Italy and Iberia). This made the identification of some southern query sequences problematic, as their genetic distance to the reference barcodes in BOLD exceeded the maximum intra-specific threshold allowed. The effect of latitude is due to the presence of southern haplotypes with a restricted geographical distribution. Accordingly, the COI gene tree showed, in several species, monophyletic clades exclusive to southern Europe (mainly Iberian). The GMYC single-threshold model classified some of those clusters as different OTUs, and thus as potentially cryptic species.
The lower availability of DNA barcodes in southern Europe cannot be explained by any factor other than the regional scarcity of DNA barcoding initiatives. Before we started the project, only 26.36% of the DNA barcodes available in BOLD for the study species came from the southern European peninsulas (Iberia, Italy and the Balkans). The Iberian Peninsula was underrepresented relative to both its area and its species richness. In fact, for some species (Archips xylosteana, Bena bicolorana, Dryobotodes eremita, Dryobotodes monochroma, Ennomos quercaria, Eupithecia cocciferata, Malacosoma neustria, Nycteola columbana, Orthosia cruda, Tortricodes alternella and Tortrix viridana) no Iberian sample had ever been sequenced before the present study (January 2018), and others, restricted to the southwestern Mediterranean Basin or Iberia (Dryobota labecula, Phycita torrenti), were sequenced for the first time. The function relating latitude and number of barcodes peaked at 48 degrees north, because the largest European barcoding initiative has been carried out around that latitude, in Germany (Gemeinholzer et al. 2011).
Previous studies have shown that the larger the sampling scale, the greater the intra-specific divergence and the more likely it is to find overlap with closely related taxa (Bergsten et al. 2012). In this study, we did not analyze the effects on the barcoding gap, because the closest relatives of the study species often feed on other host plants (Cates 1981; Thompson & Pellmyr 1991). Rather, we focused on intra-specific genetic divergence, considering not only the effect of spatial distance alone but also its interaction with latitude. Doing so, we found that intra-specific genetic divergence was higher in pairwise comparisons that included at least one DNA sequence from a southern peninsula than in those where both sequences came from elsewhere in the continent.
The peninsulas of southern Europe are hotspots of species and genetic diversity (Murienne & Giribert 2009; Pinto et al. 2012; Geiger et al. 2014). Thus, when they are undersampled, intra-specific genetic divergence is underestimated more than would be expected from the mere reduction of the geographical scale. The low availability of DNA barcodes, or their reduced geographic coverage, is a main concern in DNA barcoding (Savolainen et al. 2005; Bergsten et al. 2012; Geiger et al. 2014; Dincă et al. 2015). Our results show that, to capture as much intra-specific genetic variability as possible, sequencing efforts should be concentrated in southern European genetic diversity hotspots (Murienne & Giribert 2009; Pinto et al. 2012; Dincă et al. 2015) where, paradoxically, the number of available DNA barcodes is lower.
The disproportionately strong effect of the Iberian samples is largely related to the distribution patterns of genetic diversity determined by Pleistocene glaciations (Hewitt 1996; Schmitt 2007). The southern European peninsulas were refugia that hosted a large number of plant and animal taxa when the ice sheet covered large parts of the continent. This was the case for our study group (insects associated with oaks and other species of broad-leaved trees). In the Iberian Peninsula, which is more geographically isolated than, for example, the Italian Peninsula, deciduous and evergreen oak forests were restricted to a few refugia close to the coast or on the south-facing slopes of some mountain systems (Koster 2005; Magri et al. 2007). When the ice retreated, only some of the haplotypes spread northwards. This “founder effect” is responsible for the higher species richness in the south and the genetic homogeneity of the recently colonized areas in the central and northern parts of the continent (Hewitt 1996; Taberlet, Fumagalli, Wurst-Saucy & Cosson 1998; Hewitt 1999). Among our study species, there are many examples of haplotypes shared by different central and northern European countries, especially Germany, the Netherlands, the United Kingdom, the Czech Republic, Austria and Finland. Similarly, a noteworthy study including hundreds of Lepidoptera species showed little intra-specific genetic variability between central (Austria) and northern (Finland) Europe (Huemer et al. 2014).
Owing to historical factors linked to the paleoclimate of the continent, we found monophyletic Iberian clades in 56% of our study species. This was not the case for the Italian Peninsula, which showed a higher genetic similarity with the territories to the north, suggesting that the Pyrenees act as a stronger geographical barrier than the Alps. Previous studies have reported similar results for organisms such as butterflies and freshwater fish (Geiger et al. 2014; Dincă et al. 2015). Some alpine butterflies, for example, show a higher intra-specific genetic distance between populations in the Alps and the nearby Pyrenees than between those in the Alps and Scandinavia (Dincă et al. 2015). However, it is well known that the Alps have a deep impact on the genetic variation of other groups of animals (Arntzen 2001; Cornetti et al. 2015; Leys, Keller, Räsänen, Gattolliat & Robinson 2016).
Four Iberian clades were retrieved as different OTUs by the GMYC single-threshold model, which confirmed its utility for species delimitation within poorly inventoried biogeographic regions in Europe. According to Fujisawa and Barraclough (2013), the GMYC single-threshold model is more reliable than the multiple-threshold one, which overestimates the number of OTUs. Moreover, ABGD and jMOTU retrieved results identical or very similar to those of the single-threshold GMYC, and the latter was considered the most reliable of the three in a comparison of performances (Ratnasingham & Herbert 2013).
The presence of different putative species in southern Europe constrains the efficacy of species identification on the basis of DNA barcoding (Derkarabetian & Hedin 2014; Geiger et al. 2014; Dincă et al. 2015; Fossen, Ekrem, Nilsson & Bergsten 2016). In the hypothetical case that there had not been any Iberian barcode in BOLD, in 7 out of the 15 species for which comparisons between Iberia and the rest of Europe were possible (Table 2) there would have been at least one haplotype that could not have been determined to the species level (EUIB K2P distance above 1%). Even including the Iberian barcodes available in BOLD before the present study, the same still happened in 3 species. The case of Tortricodes alternella is especially remarkable: of the 16 new haplotypes recorded in this study, none could be matched to any reference sequence in BOLD using the 1% threshold (Table S4). This lack of identification due to the absence of Iberian reference sequences is not exclusive to the study species, having been reported for other insect taxa as well (e.g. Cerambyx cerdo, Coleoptera) (Torres-Vila & Bonal 2019).
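The threshold-based identification discussed above can be illustrated with a minimal sketch. It computes the Kimura two-parameter (K2P) distance between two aligned sequences and reports a match only when the nearest reference lies within the 1% threshold. The function names (`k2p_distance`, `identify`) and the dictionary of references are hypothetical and chosen for illustration only; the sketch does not reproduce the BOLD identification engine or the exact pipeline used in this study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """K2P distance between two aligned sequences of equal length.

    Sites with gaps or ambiguous bases in either sequence are skipped.
    Returns math.inf when the sequences are too divergent for the
    K2P correction to be defined.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf  # saturated: the correction is undefined
    return -0.5 * math.log(term1 * math.sqrt(term2))


def identify(query, references, threshold=0.01):
    """Return (name, distance) of the nearest reference sequence.

    `references` is a hypothetical dict {species_name: aligned_sequence}.
    The name is None when the nearest reference exceeds the threshold
    (1% by default, as in the text), i.e. the query stays unidentified.
    """
    name, dist = min(
        ((n, k2p_distance(query, s)) for n, s in references.items()),
        key=lambda item: item[1],
    )
    return (name if dist <= threshold else None), dist
```

Under this scheme, a southern haplotype whose nearest reference barcode lies above the threshold yields no species-level identification, which is the situation described for the Iberian query sequences.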
Although the present data set shows a clear trend, we have to be cautious before generalizing, as it is restricted to a limited number of species of oak-feeding moths. The occurrence of putative cryptic species in southern Europe in other taxa (Geiger et al. 2014; Dincă et al. 2015) suggests that the pattern may be widespread, but further studies are needed to confirm it. Future large-scale DNA barcoding initiatives in Europe should cover latitudinal gradients, rather than large distances at the same latitude, to avoid neglecting genetic diversity hotspots like the Iberian Peninsula.