Discussion
Intense research on plant phylogenetics and phylogeography over the last two decades has uncovered several major biogeographical trends in the Mediterranean basin (García-Verdugo et al., 2021) and renewed our understanding of plant domestication (Purugganan, 2019). Following an initial focus on biogeographic refugia, recent studies have revealed the genetic imprints of past expansions and migration processes, some involving the entire Mediterranean basin (see reviews in Médail and Diadema, 2009; Nieto Feliner, 2011, 2014; Migliore et al., 2018; Vargas et al., 2018; Thompson, 2020; García-Verdugo et al., 2021). Our study improves the understanding of the phylogeography of Mediterranean plants by revealing a new historical scenario: the main gene pools of carob (i.e. the Carob Evolutionary Units, CEUs) originated from a biogeographic refugium probably located in southwest Morocco. Our results also highlight that carob domestication has mainly relied on the use of locally selected and disseminated varieties, albeit punctuated by long-distance, human-mediated westward dispersal events, which match the major cultural waves of the Romans and Arabs.
Evolutionary history of the carob tree
Our phylogeographic investigation allows us to reject the long-standing hypothesis that the carob tree was introduced in most of the Mediterranean. An Eastern Mediterranean or Southern Arabian origin followed by a human-mediated expansion was proposed by several authors, partly based on linguistic evidence from vernacular names, occurrences of Ceratonia siliqua and C. oreothauma in western Asia, and carob agricultural practices (reviewed in Ramón-Laca & Mabberley, 2004). However, genetic data from SSR and plastid markers, based on a thorough population sampling across the Mediterranean (Viruel et al., 2020), were better explained by a scenario in which current C. siliqua populations originated from two disjunct refugia after the Last Interglacial, ca. 116 ka ago. SSR data revealed introgression in the Central Mediterranean and Northern Morocco, but the strong west vs. central-east pattern in plastid data indicated a low human influence on the main current patterns of genetic diversity and structure of the carob tree across the Mediterranean. The comprehensive review of carob fossil data by Viruel et al. (2020) did not support an eastern origin of C. siliqua either. Instead, the fossil record shows a mostly continuous presence of Ceratonia around the palaeo-Mediterranean Sea since the Oligocene, with a progressive decline starting ca. 20 Ma.
Compared to SSR data, RADseq allows bridging phylogenetic and population genetic inferences (Parchman et al., 2018). Here, the inclusion of Ceratonia oreothauma, the sister species of C. siliqua, despite their divergence around 6.4 Ma (Viruel et al., 2020), corroborates our previous conclusion on the importance of the western Mediterranean in the history of the carob. Our SVDquartets phylogenetic reconstruction provides further resolution, pointing to southwest Morocco as closest to the ancestral population of C. siliqua (Figure 2). This origin had already been suggested by the slightly higher genetic diversity revealed for both nuclear (SSR) and plastid data in Viruel et al. (2020). However, in our previous study, coalescent-based models tested by an approximate Bayesian computation approach supported a two-refugia hypothesis to explain the west-east split in the genetic diversity structure of the carob tree. Here, our phylogenetic and population genomic inferences support a different scenario. According to the SVDquartets phylogenetic reconstruction based on nuclear genome-wide diversity, the carob tree followed two routes of migration from south Morocco: one northward that reached western North Africa and south Spain (NM, SES and SWS CEUs) and another eastward that gave rise to the central-eastern CEUs. Both mitochondrial and plastid data extracted from RADseq data also support the existence of the ancestral pool in southwest Morocco. Specifically, eastern mtDNA and pDNA haplotypes are present in the northern part of SM (Imouzzer Ida Ou Tanane area), suggesting that this is the most credible source for the eastern populations. In contrast to our previous study, our new scenario explains the west-east split in carob genetic diversity by two migration routes from an ancestral population situated in south Morocco.
As shown in our previous study, species distribution modelling (SDM) indicates that both the Last Interglacial (LIG) and the Last Glacial Maximum were periods of contraction for this species during the Pleistocene (Viruel et al., 2020). Moreover, SDM predicted that some areas on the North African and South European Atlantic coasts could have been continuously suitable over the last 130 ka. Southwest Morocco has been identified as a biogeographic refugium and even as a diversification cradle for several taxonomic groups (e.g. Médail and Quézel, 1999; Médail et al., 2001; Ortiz et al., 2009; Martínez-Freiría et al., 2017; Bobo-Pinilla et al., 2018; Villa-Machío et al., 2018; Klesser et al., 2021). Although Mediterranean phylogeographic studies have focused mostly on glacial refugia, three recent studies have highlighted South and West Morocco as a refugium for plant populations during the LIG (Villa-Machío et al., 2018; Bobo-Pinilla et al., 2018; Viruel et al., 2020), where an overall pronounced climate continentality could have been buffered by the proximity of the ocean.
Footprints of domestication on the current genetic structure of the carob tree across the Mediterranean
Although disentangling the history of cultivated plants is complex, our phylogeographic investigation of the carob tree sheds light on the history of agriculture. Our previous study based on SSR data (Viruel et al., 2020) suggested that local domestication events from wild populations were the most likely scenario. The RADseq data presented here depict a strong east-west genome-wide differentiation that cannot be explained by a domestication based solely on translocations and/or human-mediated dispersals from east to west. Agricultural practices in the carob tree are based on propagation by grafting (Zohary, 2002), although seeds could also have been transported. In either case, domestication based only on westward propagation of cultivars from the east would have maintained the maternal (eastern) haplotypes in the Western Mediterranean. Instead, our results indicate that the dispersal of selected, vegetatively propagated varieties between remote geographical areas was not the main force of domestication in the carob tree.
The use of genomic data at the infraspecific level has made it possible to identify footprints of domestication in crop models where PCR-based molecular markers had previously failed. In the case of date palm, genomic data revealed that human-mediated dispersal imprints were superimposed on a previous phylogeographical structure (Gros-Balthazard et al., 2017; Flowers et al., 2019). In the carob tree, despite a moderate differentiation (Gst = 11%), genome-wide diversity is structured into three main genetic sources: SM, SWS + SES and NEM + SEM (Figure 2, and Fig. S5 in supplementary material). Although this pattern does not suggest translocation of eastern domesticated varieties into South Morocco or the Iberian Peninsula, such translocation is consistent with the patterns found in geographically intermediate groups (NM, CM and NEM). These groups are less differentiated, which is explained by high rates of admixture (Fig. 2). To untangle the role of human-mediated dispersals in these strong genetic admixtures, we used allele-frequency based models to estimate the intensity and origin of dispersal events throughout the evolutionary tree of the carob (Fig. 3): Treemix recovers westward migrations that mostly originated from SEM, or from the central-eastern CEUs (SEM, NEM and CM). These translocations match the beginning of carob agriculture in the East and its dispersal by Greeks, Romans and later by Arabs in historical times (Ramón-Laca and Mabberley, 2004; Viruel et al., 2020). They may have contributed to the genetically admixed pool used locally for cultivation, as observed in North Morocco (NM).
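To make the differentiation statistic quoted above concrete, the following minimal sketch computes a multilocus Nei's Gst, (Ht - Hs) / Ht, from per-population allele frequencies at biallelic SNPs. It is an illustration only and not the pipeline used in this study: the function name `nei_gst`, the input layout and the equal weighting of populations are assumptions, and no sample-size correction is applied.

```python
def nei_gst(freqs_by_locus):
    """Multilocus Nei's G_ST from per-population allele frequencies.

    freqs_by_locus: list of loci; each locus is a list holding the frequency
    of one allele of a biallelic SNP in each population (equal weights,
    hypothetical input layout chosen for this sketch).
    """
    sum_hs = 0.0  # within-population expected heterozygosity, summed over loci
    sum_ht = 0.0  # total expected heterozygosity, summed over loci
    for pop_freqs in freqs_by_locus:
        n_pops = len(pop_freqs)
        if n_pops == 0:
            raise ValueError("each locus needs at least one population")
        # H_S: mean of 2p(1-p) across populations
        sum_hs += sum(2 * p * (1 - p) for p in pop_freqs) / n_pops
        # H_T: 2p(1-p) computed from the mean allele frequency
        p_bar = sum(pop_freqs) / n_pops
        sum_ht += 2 * p_bar * (1 - p_bar)
    if sum_ht == 0:
        return 0.0  # no variation at all: report 0 rather than divide by zero
    return (sum_ht - sum_hs) / sum_ht


# Usage: two loci scored in three hypothetical populations
example = [[0.10, 0.30, 0.60], [0.45, 0.50, 0.55]]
print(round(nei_gst(example), 3))
```

Summing Ht and Hs over loci before taking the ratio, rather than averaging per-locus ratios, keeps nearly monomorphic loci from dominating the estimate.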
The second footprint of domestication was observed in CM, which, among the areas considered in our study, is the one where cultivated varieties (either local or imported selections) are most widespread (Di Guardo et al., 2019). This CEU is characterized by a slightly lower genetic diversity and a small excess of heterozygosity, whereas all other CEUs showed a deficit. We detected a genetic group of individuals without admixture in CM, corresponding to the monumental carobs of the Ragusa district (Sicily, South Italy). Without being clones, these individuals, harvested without interruption for centuries, are genetically very close to each other and form a lineage within CM. The genetic patterns of these ancient CM individuals have not been observed in other CEUs, supporting again the idea that diffusion of selected genotypes at the local scale, rather than long-distance dispersal, played a major role in the domestication of carob. Despite this pattern, we did not detect any candidate loci under selection due to domestication pressures, which could be explained by the limitations of our method and sampling or by a low effect of domestication on the carob genome. Compared to other perennial crop species for which candidate and adaptive loci have been found by whole genome sequencing as well as RADseq (Cornejo et al., 2018; Alves-Pereira et al., 2020; Groppi et al., 2021), a relatively lower impact of selection is likely in carob. Domestication leading to fine-tuning of gene expression patterns rather than genome-wide evolution, as observed in olive (Gros-Balthazard et al., 2019), may be almost undetectable by a reduced-representation genomics approach such as RADseq.
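The contrast between an excess and a deficit of heterozygosity mentioned above can be illustrated with the inbreeding coefficient Fis = 1 - Ho / He, which is negative when observed heterozygosity exceeds the Hardy-Weinberg expectation and positive when it falls short. The sketch below is a single-locus illustration under stated assumptions: the function name `fis_from_counts` and the genotype-count input are hypothetical and do not reproduce the estimators used for Table 1.

```python
def fis_from_counts(n_hom_ref, n_het, n_hom_alt):
    """Single-locus F_IS = 1 - Ho/He from genotype counts at a biallelic SNP."""
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotypes supplied")
    ho = n_het / n                        # observed heterozygosity
    p = (2 * n_hom_ref + n_het) / (2 * n) # frequency of the reference allele
    he = 2 * p * (1 - p)                  # expected heterozygosity (Hardy-Weinberg)
    if he == 0:
        return 0.0  # monomorphic locus: F_IS is undefined, report 0
    return 1 - ho / he


# Usage: a negative value indicates heterozygote excess, a positive value a deficit
print(fis_from_counts(10, 80, 10))  # excess of heterozygotes
print(fis_from_counts(40, 20, 40))  # deficit of heterozygotes
```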
Conservation of genetic diversity within Carob Evolutionary Units (CEUs)
Knowing the structure of genome-wide diversity is essential for preserving the genetic resources of cultivated species and for future breeding (Purugganan, 2019). We used an integrative approach combining geographic and genetic differentiation to characterize evolutionary units for Ceratonia siliqua across the Mediterranean. In a survey including 1020 samples, seven CEUs were identified as the best solution to minimize intra-group variance and obtain homogeneous groups that do not overlap geographically. Four genetic clusters, identified within carob tree populations based on a thorough sampling across the Mediterranean using nuclear SSR and SNP data, are contained within the seven CEUs (Figure 2): South Morocco (SM), Iberian Peninsula (SES, SWS), Central Mediterranean (CM, NM) and Eastern Mediterranean (NEM, SEM). These four genetic clusters exhibit moderate introgression in the West and East CEUs, but high levels of admixture in the Central Mediterranean (CM, NM), more intense in NM. RADseq data further resolved this genetic structure across the Mediterranean by identifying seven genetic clusters (Fig. 2 A, D), which in some cases fully matched a CEU (e.g. SM, SES) or two CEUs (SWS, SES), whereas in other cases a mixture of more than one genetic cluster was observed (e.g. CM). These data permit a better interpretation of the genetic diversity patterns between CEUs and are thus important for the future design of ex situ conservation. Our results suggest that moderate genetic diversity is uniformly distributed across CEUs (Table 1). Only a slightly higher genetic diversity was estimated in Western CEUs (SWS, SES, SM) based on SSR loci. Although Central CEUs (CM, NM) are highly admixed, this admixture did not entail an increase in genetic diversity compared to non-admixed clusters.
Conservation of genetic resources for the carob tree should capture the genetic diversity found across the Mediterranean by preserving material from western and eastern CEUs, prioritizing the most differentiated CEUs: SM, SES + SWS, and SEM. CM, which contains three genetic groups and for which carob cultivars have been well characterized, specifically in Italy, should benefit from further investigation of carob evolution under domestication.
Acknowledgments
This study is part of the DYNAMIC project supported by the French national agency of research (ANR-14-CE02-0016) and benefited from equipment and services from the molecular biology facility (SCBM) at IMBE (Marseille, France). All bioinformatics and simulations were done on the High-Performance Computing Cluster from the Pytheas informatic facility (OSU Institut Pytheas Aix Marseille Univ, INSU-CNRS UMS 3470). J.V. benefited from a Postdoc Fellowship funded by DYNAMIC and a Marie Skłodowska-Curie Individual Fellowship (704464 - YAMNOMICS - MSCA-IF-EF-ST). The authors thank the following people and institutions for their help in completing the sampling: Annette Patzelt (Oman Botanic Garden), Minas Papadopulos (Department of Forests of the Republic of Cyprus), Zahra Djabeur (Oran University), Nabil Benghanem (Tizi-Ouzou University), Gianluigi Bacchetta (Cagliari University), Sonja Yakovlev (Paris-Sud University), Errol Vela (CIRAD), Maria Panitsa (Patras University), and the services of Junta de Andalucia.
Author contributions
H.S., A.B., F.M., S.L.M., M.B.K., L.O., G.N.F. and J.V. conceived and planned the study and collected samples. F.L.M. performed the DNA extraction and quality assessment. J.V., A.B. and V.F. performed curation and analysis of microsatellite data. A.B. performed RADseq data curation and SNP filtering. A.B. and J.V. provided the analyses, tables and figures. A.B., J.V., G.N.F. and F.M. interpreted the results. A.B. and J.V. drafted the manuscript. J.V., G.N.F., S.L.M. and M.D.G. edited the manuscript. J.V., G.N.F. and A.B. wrote the final manuscript. H.S. was in charge of funding acquisition and project administration. All authors read and approved the final version.
Data Availability Statement
Full information on population sampling and the microsatellite data are available in Viruel et al. (2020) and deposited in DRYAD (https://doi.org/10.5061/dryad.k7m020r). Raw RADseq reads are deposited at NCBI under Bioproject accession (######). Assemblies from ipyrad, data files and R scripts of analyses are available at Zenodo (##############).
References
Alexander, D. H., Novembre, J., & Lange, K. (2009). Fast model-based estimation of ancestry in unrelated individuals. Genome research, 19(9), 1655-1664. https://doi.org/10.1101/gr.094052.109
Alves‐Pereira, A., Clement, C. R., Picanço‐Rodrigues, D., Veasey, E. A., Dequigiovanni, G., Ramos, S. L. F., Baldin Pinheiro, J., Pereira de Souza, A. & Zucchi, M. I. (2020). A population genomics appraisal suggests independent dispersals for bitter and sweet manioc in Brazilian Amazonia. Evolutionary applications, 13, 342-361.
Baumel, A., Mirleau, P., Viruel, J., Bou Dagher Kharrat, M., La Malfa, S., Ouahmane, L., … & Médail, F. (2018). Assessment of plant species diversity associated with the carob tree (Ceratonia siliqua, Fabaceae) at the Mediterranean scale. Plant Ecology and Evolution, 151, 185–193. https://doi.org/10.5091/plecevo.2018.1423
Borrell, J.S., Wang, N., Nichols, R.A., & Buggs, R.J. (2018). Genetic diversity maintained among fragmented populations of a tree undergoing range contraction. Heredity, 121, 304-318.
Besnard, G., Terral, J. F., & Cornille, A. (2018). On the origins and domestication of the olive: a review and perspectives. Annals of botany, 121(3), 385-403. https://doi.org/10.1093/aob/mcx145
Bobo-Pinilla, J., Peñas de Giles, J., López-González, N., Mediavilla, S., & Martínez-Ortega, M. M. (2018). Phylogeography of an endangered disjunct herb: long-distance dispersal, refugia and colonization routes. AoB Plants, 10, ply047
Boivin, N. L., Zeder, M. A., Fuller, D. Q., Crowther, A., Larson, G., Erlandson, J. M., Denham, T. , & Petraglia, M. D. (2016). Ecological consequences of human niche construction: Examining long-term anthropogenic shaping of global species distributions. Proceedings of the National Academy of Sciences, 113(23), 6388-6396.
Brassesco, M. E., Brandão, T. R., Silva, C. L., & Pintado, M. (2021). Carob bean (Ceratonia siliqua L.): A new perspective for functional food. Trends in Food Science & Technology.
Caruso, M., La Malfa, S., Pavlíček, T., Frutos Tomás, D., Gentile, A., & Tribulato, E. (2008). Characterisation and assessment of genetic diversity in cultivated and wild carob (Ceratonia siliqua L.) genotypes using AFLP markers. The Journal of Horticultural Science and Biotechnology, 83(2), 177-182.
Chavent, M., Kuentz-Simonet, V., Labenne, A., & Saracco, J. (2018). ClustGeo: an R package for hierarchical clustering with spatial constraints. Computational Statistics, 33(4), 1799-1822.
Chen, C., Qi, Z.C., Xu, X.H., Comes, H.P., Koch, M.A., Jin, X.J., Fu, C.X. & Qiu, Y.X. (2014). Understanding the formation of Mediterranean–African–Asian disjunctions: evidence for Miocene climate-driven vicariance and recent long-distance dispersal in the Tertiary relict Smilax aspera (Smilacaceae). New Phytologist, 204, 243–255.
Chifman, J., & Kubatko, L. (2014). Quartet inference from SNP data under the coalescent model. Bioinformatics, 30(23), 3317-3324.
Cornejo, O. E., Yee, M. C., Dominguez, V., Andrews, M., Sockell, A., Strandberg, E., Livingstone, D. III, Stack, C., Romero, A., Umaharan, P., Royaert, S., Tawari, N. R., Ng, P., Gutierrez, O., Phillips, W., Mockaitis, K., Bustamante, C. D., & Motamayor, J. C. (2018). Population genomic analyses of the chocolate tree, Theobroma cacao L., provide insights into its domestication process. Communications Biology, 1, 167. https://doi.org/10.1038/s42003-018-0168-6
de Medeiros, B.A.S., & Farrell, B.D. (2018). Whole-genome amplification in double-digest RADseq results in adequate libraries but fewer sequenced loci. PeerJ, 6, e5089. https://doi.org/10.7717/peerj.5089
de Medeiros, B.A.S. (2019). Matrix Condenser v.1.0. Available at: https://github.com/brunoasm/matrix_condenser/
Désamoré, A., Laenen, B., Devos, N., Popp, M., González‐Mancebo, J. M., Carine, M. A., & Vanderpoorten, A. (2011). Out of Africa: north‐westwards Pleistocene expansions of the heather Erica arborea. Journal of Biogeography, 38(1), 164-176.
Di Guardo, M., Scollo, F., Ninot, A., Rovira, M., Hermoso, J. F., Distefano, G., La Malfa S. & Batlle, I. (2019). Genetic structure analysis and selection of a core collection for carob tree germplasm conservation and management. Tree Genetics & Genomes, 15(3), 1-14.
Eaton, D. A., & Overcast, I. (2020). ipyrad: Interactive assembly and analysis of RADseq datasets. Bioinformatics, 36(8), 2592-2594.
Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047-3048.
Flowers, J.M., Hazzouri, K.M., Gros-Balthazard, M., Mo, Z., Koutrumpa, K., Perrakis, A., Ferrand, S., Khierralah, H.S.M., Fuller, D.Q., Aberlenc, F., Fournaraki, C., Purugganan, M.D., 2019. Cross-species hybridization and the origin of North African date palms. Proc. Natl. Acad. Sci. U. S. A. 116, 1651–1658. https://doi.org/10.1073/pnas.1817453116
Frichot, E., & François, O. (2015). LEA: An R package for landscape and ecological association studies. Methods in Ecology and Evolution, 6(8), 925-929.
Fuller, D. Q., Willcox, G., & Allaby, R. G. (2011). Cultivation and domestication had multiple origins: arguments against the core area hypothesis for the origins of agriculture in the Near East. World Archaeology, 43(4), 628-652.
García‐Verdugo, C., Mairal, M., Tamaki, I., & Msanda, F. (2021). Phylogeography at the crossroad: Pleistocene range expansion throughout the Mediterranean and back‐colonization from the Canary Islands in the legume Bituminaria bituminosa. Journal of Biogeography ….
Gosselin, T. (2020). radiator: RADseq Data Exploration, Manipulation and Visualization using R. R package version 1.1.9. https://thierrygosselin.github.io/radiator/. https://doi.org/10.5281/zenodo.3687060
Goudet, J. (2005). Hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5, 184-186.
Groppi, A, Liu, S, Cornille, A, Decroocq, S, Bui, Q T, Tricon, D, & Decroocq, V (2021) Population genomics of apricots unravels domestication history and adaptive events. Nature communications, 12(1), 1-16. https://doi.org/10.1038/s41467-021-24283-6
Gros-Balthazard, M., Galimberti, M., Kousathanas, A., Newton, C., Ivorra, S., Paradis, L., Vigouroux, Y., Carter, R., Tengberg, M., Battesti, V., Santoni, S., Falquet, L., Pintaud, J.-C.C., Terral, J.-F.F., Wegmann, D., 2017. The Discovery of Wild Date Palms in Oman Reveals a Complex Domestication History Involving Centers in the Middle East and Africa. Curr. Biol. 27, 2211–2218. https://doi.org/10.1016/j.cub.2017.06.045
Gros‐Balthazard, M., Besnard, G., Sarah, G., Holtz, Y., Leclercq, J., Santoni, S., Wegmann D., Glémin S., Khadari, B. (2019). Evolutionary transcriptomics reveals the origins of olives and the genomic changes associated with their domestication. The Plant Journal, 100(1), 143-157. https://doi.org/10.1111/tpj.14435
Gruber, B., Unmack, P. J., Berry, O. F., & Georges, A. (2018). dartr: An r package to facilitate analysis of SNP data generated from reduced representation genome sequencing. Molecular Ecology Resources, 18(3), 691-699.
Hodel, R.G.J., Chen S, Payton A.C., McDaniel S.F., Soltis P., & Soltis D.E. (2017). Adding loci improves phylogeographic resolution in red mangroves despite increased missing data: comparing microsatellites and RAD-Seq and investigating loci filtering. Sci Rep. 7:17598. https://doi.org/10.1038/s41598-017-16810-7.
Hipp, A. L., Manos, P. S., Hahn, M., Avishai, M., Bodénès, C., Cavender‐Bares, J., … & Valencia‐Avalos, S. (2020). Genomic landscape of the global oak phylogeny. New Phytologist, 226, 1198-1212.
Jombart, T., Devillard, S., & Balloux, F. (2010). Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics, 11, 94. https://doi.org/10.1186/1471-2156-11-94
Jombart, T., & Ahmed, I. (2011). adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics, 27(21), 3070-3071.
Kamvar, Z. N., Tabima, J. F., & Grünwald, N. J. (2014). Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ, 2, e281. https://doi.org/10.7717/peerj.281
Klesser, R., Husemann, M., Schmitt, T., Sousa, P., Moussi, A., & Habel, J. C. (2021). Molecular biogeography of the Mediterranean Buthus species complex (Scorpiones: Buthidae) at its southern Palaearctic margin. Biological Journal of the Linnean Society, 133, 166-178.
La Malfa, S., Currò, S., Douglas, A. B., Brugaletta, M., Caruso, M., & Gentile, A. (2014). Genetic diversity revealed by EST-SSR markers in carob tree (Ceratonia siliqua L.). Biochemical Systematics and Ecology, 55, 205-211.
Martínez-Freiría, F., Crochet, P. A., Fahd, S., Geniez, P., Brito, J. C., & Velo-Antón, G. (2017). Integrative phylogeographical and ecological analysis reveals multiple Pleistocene refugia for Mediterranean Daboia vipers in north-west Africa. Biological Journal of the Linnean Society, 122, 366-384.
Médail, F., Quezel, P., Besnard, G., & Khadari, B. (2001). Systematics, ecology and phylogeographic significance of Olea europaea L. ssp. maroccana (Greuter & Burdet) P. Vargas et al., a relictual olive tree in south-west Morocco. Botanical Journal of the Linnean Society, 137(3), 249-266.
Médail, F., & Diadema, K. (2009). Glacial refugia influence plant diversity patterns in the Mediterranean Basin. Journal of biogeography, 36(7), 1333-1345
Meyer R.S., Duval A.E., & Jensen H.R. (2012). Patterns and processes in crop domestication: an historical review and quantitative analysis of 203 global food crops. New Phytologist, 196, 29–48.
Migliore J., Baumel A., Leriche A., Juin M., & Médail F. (2018). Surviving glaciations in the Mediterranean region: an alternative to the long-term refugia hypothesis. Botanical Journal of the Linnean Society, 187, 537–549.
Minh, B. Q., Schmidt, H. A., Chernomor, O., Schrempf, D., Woodhams, M. D., von Haeseler, A., & Lanfear, R. (2020). IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Molecular Biology and Evolution, 37, 1530-1534. https://doi.org/10.1093/molbev/msaa015
Nieto Feliner, G. (2014). Patterns and processes in plant phylogeography in the Mediterranean Basin. A review. Perspectives in Plant Ecology, Evolution and Systematics, 16, 265–278.
Nieto Feliner, G. (2011). Southern European glacial refugia: a tale of tales. Taxon, 60, 365–372.
Ortiz MA, Tremetsberger K, Stuessy T, Terrab A, García-Castaño JL, Talavera S. 2009. Phylogeographic patterns in Hypochaeris sect. Hypochaeris (Asteraceae, Lactuceae) of the western Mediterranean. Journal of Biogeography 36:1384–1397
Ortiz, E.M. 2019. vcf2phylip v2.0: convert a VCF matrix into several matrix formats for phylogenetic analysis. DOI:10.5281/zenodo.2540861
Parchman, T. L., Jahner, J. P., Uckele, K. A., Galland, L. M., & Eckert, A. J. (2018). RADseq approaches and applications for forest tree genetics. Tree Genetics & Genomes, 14(3), 1-25.
Pickrell, J., & Pritchard, J. (2012). Inference of population splits and mixtures from genome-wide allele frequency data. Nature Precedings, 1-1.
Purugganan, M. D. (2019). Evolutionary insights into the nature of plant domestication. Current Biology, 29(14), R705-R714
Quézel, P., & Médail, F. (2003). Écologie et biogéographie des forêts du bassin Méditerranéen. Paris, France: Elsevier Editions.
Rambaut, A., & Drummond, A. J. (2012). FigTree version 1.4.0.
Ramón-Laca, L. & Mabberley, D.J. (2004). The ecological status of the carob-tree (Ceratonia siliqua, Leguminosae) in the Mediterranean. Botanical Journal of the Linnean Society, 144, 431–436.
Silliman, K. (2019). Population structure, genetic connectivity, and adaptation in the Olympia oyster (Ostrea lurida) along the west coast of North America. Evolutionary applications, 12(5), 923-939.
Swofford, D. L. (2018). PAUP* (*Phylogenetic Analysis Using PAUP). Version 4a161.
Thompson, J. D. (2020). Plant Evolution in the Mediterranean: Insights for Conservation. Oxford University Press, USA.
Vargas, P., Fernández‐Mazuecos, M., & Heleno, R. (2018). Phylogenetic evidence for a Miocene origin of Mediterranean lineages: species diversity, reproductive traits and geographical isolation. Plant Biology, 20, 157-165.
Villa-Machío, I., Fernández de Castro, A. G., Fuertes-Aguilar, J., & Nieto Feliner, G. (2018). Out of North Africa by different routes: phylogeography and species distribution model of the western Mediterranean Lavatera maritima (Malvaceae). Botanical Journal of the Linnean Society, 187, 441-455.
Viruel J, Le Galliot N, Pironon S, Nieto Feliner G, Suc JP, Lakhal-Mirleau F, Juin M, Selva M, Bou Dagher Kharrat M, Ouahmane L, Malfa S, Diadema K, Sanguin H, Médail F, Baumel A (2020) A strong east–west Mediterranean divergence supports a new phylogeographic history of the carob tree (Ceratonia siliqua, Leguminosae) and multiple domestications from native populations. Journal of Biogeography 47, 460-471
Viruel J, Haguenauer A, Juin M, Mirleau F, Bouteiller D, Boudagher‐Kharrat M, Ouahmane L, La Malfa S, Médail F, Sanguin H, Nieto Feliner, G, & Baumel A (2018). Advances in genotyping microsatellite markers through sequencing and consequences of scoring methods for Ceratonia siliqua (Leguminosae). Applications in Plant Sciences, 6, e01201.
Warschefsky EJ, von Wettberg EJ (2019). Population genomic analysis of mango (Mangifera indica) suggests a complex history of domestication. New Phytologist, 222, 2023-2037. https://doi.org/10.1111/nph.15731
Whitlock, M. C., & Lotterhos, K. E. (2015). Reliable detection of loci responsible for local adaptation: inference of a null model through trimming the distribution of F ST. The American Naturalist, 186(S1), S24-S36.
Winter, D. J. (2012). MMOD: an R library for the calculation of population differentiation statistics. Molecular ecology resources, 12(6), 1158-1160.
Zeder, M. A. (2008). Domestication and early agriculture in the Mediterranean Basin: Origins, diffusion, and impact. Proceedings of the national Academy of Sciences, 105(33), 11597-11604.
Zohary D. (2002). Domestication of the carob (Ceratonia siliqua L.). Israel Journal of Plant Sciences, 50, 141-15.
Zohary, D., & Hopf, M. (2012). Domestication of plants in the Old World: The origin and spread of cultivated plants in West Asia, Europe and the Nile Valley. Oxford, UK: Oxford University Press.
Supporting information
Tab. S1: Statistics from four assemblies of RADseq data (36 samples) generated with ipyrad, varying the clustering threshold from 0.90 to 0.96 (proportion of sequence similarity).
Tab. S2: Pairwise Gst differentiation among CEUs based on RADseq data. Values above the overall Gst (11%) are in bold.
Fig. S1: Design of Carob Evolutionary Units (CEUs) using ClustGeo, which considers Euclidean genetic and geographical distances. 17 SSRs and 15 SNP markers from microsatellite loci were used. The Ward dendrogram of 56 carob populations with a partition in K=7 clusters (A) was obtained with a normalized proportion α of explained inertia of 0.2 for the geographic distance and 0.8 for the genetic distance (B). C) Neighbor-joining tree based on pairwise genetic differentiation (Gst, SSR markers) among the seven clusters. See main text for acronyms of CEUs.
Fig. S2: Genome-wide diversity structure of 350 carob trees based on 10,012 RADseq loci with an overall missing data rate of 64%. (A) PCA scatter plot of 350 carob RADseq genotypes. (B) Neighbor-joining tree of pairwise Gst differentiation among seven Carob Evolutionary Units (CEUs).
Fig. S3: Per-locus FST distribution (1 SNP per locus), with 56 loci identified as outliers by OUTFLANK due to their unexpectedly high FST differentiation (FDR < 0.05). The blue line is the inferred neutral distribution.
Fig. S4: Map of mtDNA and pDNA haplogroups from 14 and 21 RADseq loci respectively obtained for 190 carob trees. West and East haplogroups match strictly for both organelle data sets except for South Morocco (SM).
Fig. S5: Population genetic structure of the carob according to RADseq. A) Genetic admixture plots for 190 carob trees from K=2 to K=7 ancestral populations obtained with the snmf method (LEA package) performed on 3,557 unlinked SNPs. The West and East lineages refer to organellar haplogroups (Fig. S4). B) Cross-entropy criterion suggesting two optimal solutions with K=5 or K=7.
Fig. S6: Per-locus FST distribution (1 SNP per locus). The OUTFLANK method did not detect any outlier (FDR < 0.05). The blue line is the inferred neutral distribution.
Figure legends
FIGURE 1: Bioinformatic pipeline to extract and filter SNPs from RADseq data for the carob tree.
FIGURE 2: Population genetic structure of the carob tree. A) SVDquartets tree of seven genetically and geographically homogeneous groups (CEUs) based on RADseq markers. Genetic admixture plots are based on four ancestral populations for SSR markers (1020 genotypes, 17 loci) and on seven ancestral populations for RADseq markers (190 genotypes, 3557 neutral and unlinked SNPs). B) & C) PCA scatterplots of RADseq genotypes (the first three components account for 15.2% of the variance). D) Map of genetic admixture based on RADseq markers and seven ancestral populations.
FIGURE 3: Evolutionary history of the carob tree reconstructed with Treemix. Maximum likelihood trees obtained without (A) and with (B) gene flow events, explaining 96% and 99% of the variance, respectively. The color of the arrows indicates the migration weight, which is the fraction of ancestry derived from the migration edge.
TABLE 1: Estimates of genetic diversity based on microsatellites (17 SSR loci) and genome-wide markers (3557 SNPs) for seven Ceratonia siliqua units (CEUs). Within each CEU, samples were split into groups according to their origin (cultivated, seminatural or natural habitats).