Model transferability
To assess whether the model can make reliable predictions
beyond the scope of the NEON dataset, we tested the performance of
the combined model on novel locations and novel species using
independent data from the Botanical Information and Ecology Network
(BIEN) and a subset of TRY leaf trait datasets. These datasets include
trait data on LMA, N% and C% for 62 species, including 27 species
unavailable in training data (Figure S.1, supplement S1). The combined
model showed good transferability to other data sources (mean
R² = 0.54, 95% coverage = 91%, Figure S.6). The
inclusion of full phylogenetic relationships in addition to
environmental predictors yielded successful model transfer to species
not sampled at NEON sites (mean R² = 0.4, 95%
coverage = 94%, Figure 2). This was possible because phylogenetic
relationships allow parameter estimates for unsampled species to borrow
strength from closely related species included in the data (Evans et
al., 2016). The model is therefore suitable for large-scale application.
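The two transferability metrics reported above can be illustrated with a minimal sketch. It assumes that R² is the usual coefficient of determination and that 95% coverage is the fraction of held-out observations falling inside their 95% prediction intervals; the exact definitions used in the analysis may differ. The function names and toy values are hypothetical and are not taken from the BIEN or TRY data.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def interval_coverage(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(lo <= o <= hi for o, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Toy held-out trait values (made up for illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
lower = [0.95, 1.75, 3.05, 3.65]  # hypothetical lower 95% interval bounds
upper = [1.25, 2.05, 3.35, 3.95]  # hypothetical upper 95% interval bounds

r2 = r_squared(observed, predicted)
coverage = interval_coverage(observed, lower, upper)
```

A coverage close to the nominal 95% indicates that the predictive uncertainty is well calibrated for the new data, which complements the point accuracy summarized by R².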
Predicting large-scale trait variation
To understand large-scale variation in traits using our combined model
and to compare it to the phylogeny-only and environment-only approaches,
we integrated each of the three models with tree species abundance and
topographic data from ~30,000 Forest Inventory and
Analysis (FIA) plots (~1.2 million trees) and climate
data from DayMet (Figure S.7, S.8, S.9) to predict leaf traits across
the eastern US. Predictions from the phylogeny-only model produced
patterns similar to Swenson & Weiser (2010) (Figure S.10), suggesting
that our approach to representing species-only methods (and any
resulting deviations from them) is consistent with patterns previously
reported in the literature.
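As an illustration of how species-level predictions might be combined with plot abundance data, the sketch below computes an abundance-weighted mean trait for a single plot. This weighting scheme is an assumption made for illustration, not a description of the procedure used here. The function name, species labels, and values are hypothetical, and the per-species predictions are assumed to be already conditioned on the plot's climate and topography.

```python
def plot_level_trait(abundances, species_predictions):
    """Abundance-weighted mean of per-species predicted trait values.

    abundances: dict mapping species -> number of trees in the plot
    species_predictions: dict mapping species -> predicted trait value
    """
    total = sum(abundances.values())
    weighted = sum(n * species_predictions[sp] for sp, n in abundances.items())
    return weighted / total


# Toy plot with three hypothetical species (values made up for illustration).
abundances = {"sp_a": 6, "sp_b": 3, "sp_c": 1}
predicted_n_pct = {"sp_a": 2.0, "sp_b": 1.0, "sp_c": 3.0}

plot_n_pct = plot_level_trait(abundances, predicted_n_pct)
```

Under this assumption, a shift in either species composition or the per-species predictions changes the plot-level value, which is why the two sources of variation need to be separated.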
Predictions from the combined model show broad-scale patterns associated
with shifts in forest communities and large-scale climatic and
topographic patterns across latitudinal and altitudinal gradients
(Figure 3, S.11). In some cases, trait distributions shifted abruptly
between neighboring ecoregions due to a combination of shifts in local
environmental conditions and in community assembly (independent of the
environment). Changes in species composition may explain trait patterns
between the Mississippi Plains ecoregion and Southern Plains ecoregions
(Figure S.12). The Mississippi Plains ecoregion is characterized by
heavy disturbance from agricultural activities and forests are often
limited to riparian ecosystems favoring bottomland broadleaf species
(e.g., Celtis laevigata, Fraxinus pennsylvanica, Salix nigra), in contrast to needleleaf species (e.g., Juniperus virginiana, Pinus taeda, and Pinus
echinata) more common in the neighboring ecoregions (Coastal plains
mixed forests). Because broadleaf species are generally characterized
by higher N% and lower LMA, this change in community assembly
translates into predicted regional trait patterns. Other ecoregions, by
contrast, show little change in species composition relative to
neighboring ecoregions, suggesting that shifts in predicted patterns may
be attributed to how environmental gradients affect traits directly. This
seems to be the case for the mixed forests in the Appalachian region, where opposing
patterns are exhibited at higher altitudes in the Blue Ridge ecoregion
(lower N% and pigment concentrations; higher LMA and C%) compared to
piedmont and valleys in the neighboring ecoregions (higher N% and
pigments, lower LMA and C%) (Figure S.13).
Predictions from the combined model differed from the phylogeny- and
environment-only models, suggesting that phylogeny and environmental
drivers contain different information at large scales. Divergence from
the combined model varied across ecoregions (Figure 4). These
differences were complex, with regions exhibiting shifts of different
magnitudes and directions from either phylogeny- or environment-only
approaches (Figure 4, Figure S.14, Supplement 5). Overall, 80% of
ecoregion-trait-model combinations exhibited significant differences in
predicted traits between the combined and phylogeny- or environment-only
models (p < 0.0001 in paired t-tests). Accordingly,
predictions from the phylogeny-only and environment-only models differed
from each other (p < 0.0001 in paired t-tests) for 93% of
ecoregion-trait combinations, which highlights the distinct effects of
the environment and phylogeny on trait distributions and demonstrates
the importance of a combined modeling approach for prediction and
inference.
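The paired comparison described above can be sketched as follows. This is a minimal illustration that assumes predictions from two models are paired by plot within one ecoregion-trait combination. It computes only the paired t statistic and its degrees of freedom; a p-value would then come from the t distribution (for example via `scipy.stats.ttest_rel`, if SciPy is available). The function name and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by the standard error of the differences.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Toy predictions for five plots (made up for illustration only).
combined_pred = [2.1, 2.4, 1.9, 2.6, 2.2]
phylo_only_pred = [1.8, 2.0, 1.7, 2.1, 1.9]

t_stat, dof = paired_t_statistic(combined_pred, phylo_only_pred)
```

Because the test works on within-plot differences, it is sensitive to consistent shifts between models even when the overall spread of predictions across plots is large.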
Patterns of divergence from the combined model indicate how phylogenetic
and environmental effects vary biogeographically across the eastern US.
Regions with no significant divergence may signal conditions where the
environment affects traits by filtering for species better adapted to
local conditions, and traits are well estimated by both species’
averages and directly from the environment (as in the case of the
Mississippi Alluvial Plains region). For most ecoregions, the divergences of
the phylogeny- and environment-only models move in opposite directions
(i.e., positive divergence for one opposed to negative divergence for
the other). Significant divergence for the phylogeny-only model
indicates that continental-scale species averages fail to correctly
represent trait values at finer scales, because local environmental
conditions and/or competition shift the trait values away from the
species mean.
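The notion of divergence direction can be made concrete with a small sketch. It assumes divergence is summarized per ecoregion as the mean difference between the combined prediction and each single-source prediction; this sign convention, the function name, and the toy values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean


def mean_divergence(rows):
    """Per-ecoregion mean of (combined - phylogeny-only, combined - environment-only).

    rows: iterable of (ecoregion, combined, phylo_only, env_only) predictions.
    """
    by_region = defaultdict(lambda: ([], []))
    for region, combined, phylo_only, env_only in rows:
        by_region[region][0].append(combined - phylo_only)
        by_region[region][1].append(combined - env_only)
    return {r: (mean(p), mean(e)) for r, (p, e) in by_region.items()}


# Toy plot-level predictions for three hypothetical ecoregions.
rows = [
    ("A", 1.9, 2.2, 1.7),
    ("A", 2.0, 2.3, 1.8),
    ("B", 2.4, 2.1, 2.6),
    ("C", 2.0, 2.2, 2.1),
]

divergence = mean_divergence(rows)
# True where the two single-source models diverge in opposite directions.
opposite = {r: dp * de < 0 for r, (dp, de) in divergence.items()}
```

In this toy example, ecoregions A and B show opposite-signed divergences, while in C both single-source models deviate from the combined model in the same direction.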
Areas where the combined model predicts higher N% than the
phylogeny-only model (blue shading in Figure 4a) suggest the presence of
environmental effects that increase N% above species means. Conversely,
orange-shaded areas in Figure 4a indicate environmental effects that lower
N% below species means. These patterns may reflect regulatory
mechanisms controlled by the environment, where leaves adjust allocation
to proteins, pigments, or structural compounds to balance photosynthetic
capacity, toughness, and chemical defense within ranges constrained by
species life history (Weih & Karlsson, 2001; Tjoelker et al., 2001;
Crous et al., 2019; Albert et al., 2010). In contrast, divergence
between the combined and environment-only models (Figure 4b) could be
explained by (i) environmental effects that have a phylogenetic signal
not captured by the environmental variables included in our analysis;
and/or (ii) stochastic factors (e.g., disturbance history and dispersal
limitation) that have resulted in species distribution patterns that are
decoupled from current environmental conditions (Burns & Strauss, 2012;
McIntyre et al., 1999). Testing mechanistic hypotheses for the above
divergence patterns is beyond the scope of our study. Nevertheless,
merely identifying these biogeographic patterns is a novel step towards
better understanding trait distributions and is only possible using
modeling approaches that combine phylogenetic and environmental
information.