Sample collection and genome sequencing
The Qatar date fruit biobank is a collection of date fruit samples from across the date palm-growing region, including Qatar, United Arab Emirates (UAE), Iran, Saudi Arabia, Egypt, Pakistan, Libya, Tunisia, United States of America (USA), Morocco, Jordan, Sudan, Oman and Spain (Stephan et al., 2018). We used 199 date fruit samples from the collection, aiming to include the most important commercial cultivars as well as some lesser-known varieties (Supplementary file 1). DNA was extracted from the fruit, and sequencing libraries were constructed from total DNA (Mathew et al., 2015) and sequenced on Illumina HiSeq 2500/4000 instruments using 150 bp reads. All sequencing steps were performed as described in Thareja et al. (2018).