Recognizing and overcoming context dependency in the application of a
machine learning tool for counting stomata in Setaria versus maize
Abstract
Stomata, microscopic pores on leaf surfaces, regulate the uptake of
carbon dioxide and the simultaneous loss of water vapor by leaves. New
image acquisition and analysis methods are allowing high-throughput
phenotyping of stomatal patterning, which in turn has been applied to
better understand the genetic basis of variation in certain species.
However, training the underlying models takes considerable data and effort, and
their ability to accurately detect epidermal structures is constrained
to morphologies found within the training data. This issue of context
dependency, the inability to perform effectively in novel contexts, is
the main hurdle preventing widespread adoption of machine learning in
high-throughput phenotyping of intraspecific, interspecific, and
environmental variation. Here we show the limited ability of a Mask R-CNN
tool, which was previously trained and successfully applied to Zea mays,
to analyze images from a closely related grass, Setaria viridis. We then
demonstrate successful retraining of the tool to cope with the novel
diversity presented by this new species. Stomatal complexes in
optical tomography images of mature Setaria leaves were identified
accurately, as judged by comparison with expert raters (R² = 0.84). This study
highlights the challenge of context dependency for widespread
application of machine learning tools for phenotyping plant traits, even
in closely related species. At the same time, it provides a new
tool that can be applied to leverage Setaria as a model C4 species,
as well as a roadmap for translating a machine learning tool to
analyze stomatal patterning in new plant species.