Introduction
The genomics revolution has spurred unprecedented growth in the sequencing and assembly of whole genomes in a wide variety of model and non-model organisms (Ellegren, 2014). While this has fueled the development of large genomic diversity panels for studies into the genetic basis of adaptive traits, reliance on a single well-assembled reference genome within a species or across a set of closely related congeners poses significant limitations on genetic and evolutionary inferences (Sherman & Salzberg, 2020). The challenge is particularly acute when working with large, structurally diverse, hybrid or heterozygous genomes, for which low coverage and biases in variant calling may result when mapping short read sequences against a divergent reference genome.
The genus Populus (poplars, cottonwoods, and aspens) has emerged as the leading model in tree ecological genomics and biotechnology, including development of the reference genome assembly for Populus trichocarpa, the first tree to undergo whole genome sequencing (Tuskan et al., 2006). In recent years, the whole genomes of Populus euphratica, Populus tremula and P. tremuloides, Populus alba var. pyramidalis and Populus alba have also been published (Lin et al., 2018; Y. J. Liu, Wang, & Zeng, 2019; J. Ma et al., 2019; T. Ma et al., 2013). However, high genetic heterozygosity and limited application of third-generation sequencing technology have constrained the quality of many of these genome assemblies, which often remain highly fragmented into thousands of scaffolds (Ambardar, Gupta, Trakroo, Lal, & Vakhlu, 2016).
The availability of multiple highly contiguous, well-assembled Populus reference genomes would greatly facilitate accurate inferences of synteny, recombination, and chromosomal origins (Lin et al., 2018). Diverse well-assembled reference genomes would also provide a fundamental tool for functional genomics, genetic engineering, and molecular breeding in this economically important genus (L. Zhang et al., 2019). They would also improve phylogenomic analyses of the Populus pan-genome (Pinosio et al., 2016; L. Zhang et al., 2019), removing the need to rely on reference-guided mapping and variant calling based solely on the P. trichocarpa reference. Recent advances in approaches to whole genome sequencing, including chromosome conformation capture (Hi-C) (van Berkum et al., 2010) and long-read sequencing, offer a means to go beyond fragmented draft genomes and generate nearly comprehensive de novo assemblies (El-Metwally, Ouda, & Helmy, 2014).
Populus tomentosa, also known as Chinese white poplar, is indigenous and widely distributed across large areas of China (An et al., 2011). It was also the first tree species planted in large-scale artificial plantations in China. Like other white poplars, P. tomentosa has become an important model for genetic research on trees (An et al., 2011), but at present no genome sequence is available and the origin, evolution and genetic architecture of the P. tomentosa genome are unclear. It has been proposed that P. tomentosa is a distinct species in the Populus section (Dickmann & Isebrands, 2001). However, the origin of P. tomentosa has remained controversial. Although P. tomentosa was proposed to contain two genetic types with different maternal parents (D. Wang, Wang, Kang, & Zhang, 2019), suggestions of a hybrid origin were based on a limited set of molecular markers and an incomplete collection of provenance materials. Thus, its ancestry and genome structure remain unclear. Our study adds to knowledge of the species by providing a much greater understanding of genomic architecture and structural composition following inferred interspecies hybridization.
Here, we present de novo assemblies for P. tomentosa (clone GM15) generated by the combined application of PacBio, Illumina and Hi-C sequencing technologies. We provide two high-quality haplotype-resolved assemblies for all chromosomes, whose phylogenetic affinities demonstrate the hybrid origin of this species. By also incorporating phylogenetic analyses of chloroplast genomes, we deduced that the ancestors of P. tomentosa are P. adenopoda (female parent) and P. alba var. pyramidalis (male parent). Furthermore, we uncovered extensive structural variations across the genome. These findings help to elucidate the mechanisms of speciation in Populus and expand our understanding of its genomic biology.