Genome-wide analysis of dry (tamar) date palm fruit color
Shameem Younuskunju1,5, Yasmin A. Mohamoud1, Lisa Sara Mathew4, Klaus F. X. Mayer5,6, Karsten Suhre3 and Joel A. Malek1,2*
1Genomics Laboratory, Weill Cornell Medicine-Qatar, Doha, 24144, Qatar.
2Department of Genetic Medicine, Weill Cornell Medicine-Qatar, Doha, 24144, Qatar.
3Department of Physiology, Weill Cornell Medicine-Qatar, Doha, 24144, Qatar.
4Clinical Genomics Laboratory, Sidra Medicine, P.O Box 26999, Doha, Qatar.
5School of Life Sciences, Technical University of Munich, 85354, Munich, Germany.
6Plant Genome and Systems Biology, Helmholtz Center Munich, 85764, Munich, Germany.
*Corresponding Author:
Joel A. Malek
e-mail: jom2042@qatar-med.cornell.edu
tel: +974-4492-8420
Abstract
Date palm (Phoenix dactylifera) fruit is an economically and culturally significant crop in the Middle East and North Africa. There are hundreds of different commercial cultivars producing dates with distinctive shapes, colors, and sizes. Genetic studies of some date palm traits have been performed, including sex determination, sugar content, and fresh fruit color. In this study, we used genome sequences and image data of 199 dry date fruit (Tamar) samples collected from 14 countries to identify genetic loci associated with the color of this fruit stage. We find loci across multiple linkage groups (LG) associated with the dry fruit color phenotype. We recover the previously identified VIR genotype associated with yellow or red fresh fruit color and identify new associations with the lightness and darkness of dry fruit. This study adds resolution to our understanding of the date palm fruit color phenotype, especially at the most commercially important Tamar stage.
Introduction
Date palm (Phoenix dactylifera) is one of the oldest and most economically important fruit crops in the Middle East and North Africa (Chao & Krueger, 2007; Weiss, Zohary, & Hopf, 2012). While there are thousands of cultivars or varieties, likely only a few hundred are commercially important (Zaid & Arias-Jimenez, 1999). These cultivars produce fruit (dates) with distinctive shapes, colors, and sizes. Fruit development and ripening involve many complex biological processes, and color changes of the fruit are closely associated with the ripening stage (Abbas & Ibrahim, 1998). Dates have five development stages: Hababauk, Kimri, Khalal, Rutab, and Tamar (Al-Mssallem et al., 2013; Siddiq & Greiby, 2013). Hababauk and Kimri are the first two development stages, during which fruit skin color is whitish-green. In the Khalal stage, dates partially ripen and reach maximum size and weight. During this stage, fruit color changes from green to yellow or red depending on the cultivar. Dates fully mature in the Rutab stage, and the color begins to change to brown. Tamar is the final stage of ripening, during which fruit water content falls below 25%, sugar content increases to 70 to 80%, and the color turns dark brown. During the fruit ripening process, amino acids act as building blocks for the synthesis of key intermediates and end products (Seymour, Taylor, & Tucker, 2012), and enrichment of both free amino acids and color- or flavor-conferring phenylpropanoids was observed in the early ripening stage of dates, as in other fruits (Diboun et al., 2015).
Dates are classified as climacteric fruit, like other commercial fruit such as apples, bananas, and peaches, in which ethylene functions as a key regulator of fruit ripening (Abbas & Ibrahim, 1998; Al-Qurashi & Awad, 2011; Serrano, Pretel, Botella, & Amoros, 2001). Fruit ripening is associated with an increase in respiration rate, a burst of ethylene production, oxidation processes, sugar accumulation, chlorophyll breakdown, pigment synthesis, and other processes (Barreveld, 1993; Osorio, Scossa, & Fernie, 2013; Stepanova & Alonso, 2005). Anthocyanins are a class of secondary metabolites synthesized in higher plants that play a crucial role in pigmentation (red, pink, purple, and blue) in fruit and vegetables (Kong et al., 2003; Kayesh et al., 2013). In anthocyanin biosynthesis, the R2R3 MYB transcription factors act as regulators (Xie et al., 2020). Previous studies from Hazzouri et al. (K. M. Hazzouri et al., 2015; Khaled M. Hazzouri et al., 2019) revealed that a retrotransposon insertion (named Ibn Majid) and a start codon mutation in an R2R3 MYB transcription factor encoded by the ortholog of the oil palm VIRESCENS gene (VIR gene) are likely the key genetic changes leading to the red and yellow color phenotype in dates at the Khalal stage (fresh fruit). Studies by Awad (2007) and Shareef and Al-Khayri (2020) showed that the color transition from yellow to brown (Khalal-to-Rutab) is associated with an increase in endogenous abscisic acid (ABA) concentration. A recent study by Elbar et al. (2022) provided evidence of a gradual increase in pulp ABA during ripening (Khalal-to-Rutab-to-Tamar). They also showed that the color transition from yellow to brown (Khalal-to-Rutab) was preceded by an arrest of xylem-mediated water transport into the fruit.
Studies of the genetic basis of date palm traits have increased significantly due to the crop's economic importance. Genome-wide association studies (GWAS) are a powerful method for mapping the association between genetic variation and phenotype (Cantor, Lange, & Sinsheimer, 2010; Korte & Farlow, 2013). Previous GWAS in date palm identified significant associations for fruit sugar content (Khaled M. Hazzouri et al., 2019; Malek et al., 2020) and fresh fruit color (Khaled M. Hazzouri et al., 2019), and confirmed previous findings of the sex determination region on LG12 (Khaled M. Hazzouri et al., 2019; Lisa S. Mathew et al., 2014). A goal of GWAS is to identify genomic regions or loci associated with variation in the phenotype of interest at genome-wide significance. Conducting GWAS in date palm poses challenges, including clonal propagation, the rarity of outbred populations, and geographical population structure. Population structure and cryptic relatedness can confound GWAS results and lead to false discoveries (Chen et al., 2016; Horton et al., 2012; Vilhjálmsson & Nordborg, 2013). Along these lines, multiple studies (Chaluvadi, Khanam, Aly, & Bennetzen, 2014; Flowers et al., 2019; Zehdi-Azouzi et al., 2015), including our own (L. S. Mathew et al., 2015), have confirmed two major subpopulations (Western and Eastern) in date palm. Indeed, our recent study reported that at least three and possibly four novel subpopulations contribute to the current date palm population (Mohamoud et al., 2019), making population structure an important consideration in any date palm genome-wide association study. Different algorithms have been developed to correct for population structure and to increase computational efficiency and statistical power in GWAS (H. M. Kang et al., 2008; Q. Wang, Tian, Pan, Buckler, & Zhang, 2014; Y. Zhang et al., 2018).
Fixed and random model Circulating Probability Unification (FarmCPU) is a statistically powerful and computationally efficient GWAS method that controls spurious associations (Liu, Huang, Fan, Buckler, & Zhang, 2016; J. Wang & Zhang, 2021). FarmCPU iterates between a fixed effect model and a random effect model, incorporating population structure and kinship as covariates, to reduce false positive and false negative associations. We therefore sought to apply this method to the challenge of GWAS in the highly structured date palm population.
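The confounding effect of population structure described above can be illustrated with a small simulation. The sketch below is not FarmCPU and is not the pipeline used in this study; it is a plain single-marker linear regression on simulated data, and the subpopulation sizes, allele frequencies, phenotype shift, and the helper name `marker_t_statistic` are all assumptions chosen for illustration. A marker with no causal effect differs in allele frequency between two subpopulations that also differ in mean phenotype, so a naive test reports a strong association that largely disappears once subpopulation membership is included as a covariate.

```python
import numpy as np


def marker_t_statistic(y, covariates, genotype):
    """OLS t-statistic of the genotype term in y ~ intercept + covariates + genotype."""
    n = y.shape[0]
    # Design matrix: intercept, optional covariates, then the tested marker last.
    X = np.column_stack([np.ones(n), covariates, genotype])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[-1] / np.sqrt(cov_beta[-1, -1])


# Hypothetical simulation settings (assumptions, not values from the study).
rng = np.random.default_rng(0)
n_per_pop = 200
subpop = np.repeat([0.0, 1.0], n_per_pop)  # 0 and 1 label two subpopulations

# Null marker: allele frequency differs by subpopulation, no effect on phenotype.
allele_freq = np.where(subpop == 0, 0.1, 0.9)
genotype = rng.binomial(2, allele_freq).astype(float)

# Phenotype depends only on subpopulation membership plus noise.
phenotype = 3.0 * subpop + rng.normal(0.0, 1.0, size=subpop.size)

# Naive test: no structure covariate.
no_covariates = np.empty((subpop.size, 0))
t_naive = marker_t_statistic(phenotype, no_covariates, genotype)

# Adjusted test: subpopulation membership as a fixed-effect covariate.
t_adjusted = marker_t_statistic(phenotype, subpop.reshape(-1, 1), genotype)

print(f"naive t-statistic:    {t_naive:.2f}")
print(f"adjusted t-statistic: {t_adjusted:.2f}")
```

In real data, subpopulation membership is not known in advance, so it is typically approximated by principal components of the genotype matrix or by a kinship matrix; methods such as FarmCPU build on this idea with a more elaborate iterative model.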
In this study, we conducted GWAS using 199 date palm samples to identify genetic loci and possible candidate genes significantly associated with color variation in Tamar stage (dry) fruit. The samples used in this study were collected from 14 countries and are highly diverse in country of origin and variety.
Materials and Methods