Data driven trait quantification across a maize diversity panel using hyperspectral leaf reflectance
  • Michael Tross,
  • Marcin Grzybowski,
  • Aime V Nishimwe,
  • Guangchao Sun,
  • Yufeng Ge,
  • James C Schnable
Michael Tross
University of Nebraska-Lincoln

Corresponding Author: [email protected]

Marcin Grzybowski
University of Nebraska-Lincoln
Aime V Nishimwe
University of Nebraska-Lincoln
Guangchao Sun
University of Nebraska-Lincoln
Yufeng Ge
University of Nebraska-Lincoln
James C Schnable
University of Nebraska-Lincoln

Abstract

Scoring plant phenotypes across large populations in multiple environments is a necessary precondition for using natural genetic diversity to build genotype-to-phenotype models, for studying genotype-by-environment interactions, and for breeding higher-yielding, more resilient cultivars. Here we explore data-driven approaches using latent representations of leaf reflectance data collected from a large field experiment consisting of a subset of diverse maize lines drawn from the Wisconsin diversity panel (Mazaheri et al., 2019). In this experiment, two replicates of 752 inbred lines from the panel were grown under field conditions. An ASD spectrometer was used to measure the intensity of light reflected by leaves at 1 nm intervals from 350 to 2,500 nm, yielding 2,151 reflectance values per plot. Two dimensionality reduction approaches were evaluated on this dataset: conventional principal component analysis (PCA) and an autoencoder neural network. Ten principal components (PCs) were sufficient to summarize 99% of the variance in the dataset. An autoencoder comprising an encoder with three dense layers and a decoder with four dense layers was able to summarize variation within the dataset with a validation loss of 0.006 using 10 latent variables (LVs). A number of principal components and latent variables were correlated with several phenotypes quantified for a subset of the same field-grown research plots (Figure 2A; 2C). Chlorophyll, the major photosynthetic pigment in plant leaves, plays a substantial role in determining the overall pattern of reflectance for maize leaves. Chlorophyll abundance was significantly correlated with PC2 (R² = 0.31) (Figure 2B), which explained 11% of the total variance in the hyperspectral reflectance data.
However, the autoencoder-based summary of the same dataset appears to have captured variation in chlorophyll abundance within this field trial more accurately, with LV8 exhibiting an R² of 0.59 (Figure 2D) against ground-truth chlorophyll measurements. Both PCA and autoencoder-based dimensionality reduction captured a mix of variables that were heritable (i.e., a large proportion of total variance was attributable to differences between genotypes) and variables that were not. Two of the ten PCs evaluated exhibited H² values > 0.5, as did four of the ten latent variables (Figure 3A; 3B). Genome-wide association studies (GWAS) conducted using the high-heritability principal components and latent variables identified significant signals in 2 of 6 cases (Figure 4A; 4B). Further work is needed to evaluate the potential of using candidate genes underlying GWAS peaks to assign putative biological roles to latent variables estimated from raw sensor data by autoencoders or other dimensionality reduction approaches.
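
The heritability criterion described above (the share of total variance attributable to differences between genotypes) can be illustrated with a minimal sketch. The abstract does not state how heritability was estimated, so this example assumes a balanced one-way ANOVA on plot-level scores, which is only one of several possible estimators. The function name `broad_sense_heritability`, the line names, and the toy scores are hypothetical and are not taken from the study.

```python
from statistics import mean


def broad_sense_heritability(scores_by_line):
    """Plot-level heritability var_G / (var_G + var_E) from a balanced one-way ANOVA.

    scores_by_line maps each inbred line to its per-replicate scores for one
    PC or latent variable. Assumes every line has the same number of
    replicates (at least 2); this is an illustrative estimator only.
    """
    groups = list(scores_by_line.values())
    n_lines = len(groups)
    n_reps = len(groups[0]) if groups else 0
    if n_lines < 2 or n_reps < 2 or any(len(g) != n_reps for g in groups):
        raise ValueError("need a balanced design with >= 2 lines and >= 2 replicates")

    line_means = [mean(g) for g in groups]
    grand_mean = mean(line_means)

    # Mean squares between genotypes and within genotypes (residual).
    ms_genotype = n_reps * sum((m - grand_mean) ** 2 for m in line_means) / (n_lines - 1)
    ms_error = sum(
        (y - m) ** 2 for g, m in zip(groups, line_means) for y in g
    ) / (n_lines * (n_reps - 1))

    # Method-of-moments variance components; negative estimates are clamped to 0.
    var_g = max((ms_genotype - ms_error) / n_reps, 0.0)
    total = var_g + ms_error
    return var_g / total if total > 0 else 0.0


# Hypothetical toy data: two replicates per line, as in the field design.
consistent_lv = {"line_a": [1.0, 1.2], "line_b": [3.0, 3.2], "line_c": [5.0, 4.8]}
noisy_lv = {"line_a": [1.0, 3.0], "line_b": [3.0, 1.0], "line_c": [2.0, 2.0]}

h2_consistent = broad_sense_heritability(consistent_lv)
h2_noisy = broad_sense_heritability(noisy_lv)
print(h2_consistent > 0.5, h2_noisy > 0.5)
```

In this toy case the first variable differs mostly between lines and would pass a 0.5 heritability threshold, while the second varies only between replicates of the same line and would not.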