Overview of Models
We modeled the joint multivariate distribution of the eight leaf traits
(the response variables) using three approaches: (1) an
environment-only model with climate and topography as fixed effects;
(2) a phylogeny-only model with species as random effects, with
covariances among the random effects structured by the phylogenetic tree
for all woody species detected in the FIA database for the eastern US;
and (3) a combined model including both environmental and phylogenetic
effects. Here, we briefly summarize the modeling framework; full details
are in Appendix S2. All approaches used the same joint-multilevel
Bayesian framework and were evaluated using 5-fold cross-validation. The
joint structure of the model allows all traits to be modeled
simultaneously, accounting for the correlation structure among traits
embedded in the LES and making it possible to condition predictions on
known trait values of one or more individuals at a site (Wilkinson et
al., 2020).
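To illustrate how a joint model can condition on known trait values, the following minimal sketch applies the standard conditional distribution of a multivariate normal when one trait has been measured. The function name, the three-trait setup, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the fitted models.

```python
def condition_on_known_trait(mean, cov, known_idx, known_value):
    """Conditional mean and covariance of the remaining traits of a
    multivariate normal, given one observed trait (standard MVN result)."""
    var_k = cov[known_idx][known_idx]
    others = [i for i in range(len(mean)) if i != known_idx]
    # How far the observed trait sits from its mean, in variance units
    shift = (known_value - mean[known_idx]) / var_k
    cond_mean = [mean[i] + cov[i][known_idx] * shift for i in others]
    cond_cov = [
        [cov[i][j] - cov[i][known_idx] * cov[known_idx][j] / var_k
         for j in others]
        for i in others
    ]
    return cond_mean, cond_cov


# Hypothetical standardized means and covariances for three traits
mean = [0.0, 0.0, 0.0]
cov = [
    [1.0, 0.6, -0.3],
    [0.6, 1.0, -0.2],
    [-0.3, -0.2, 1.0],
]

# Trait 0 was measured one standard deviation above its mean
cond_mean, cond_cov = condition_on_known_trait(mean, cov, known_idx=0, known_value=1.0)
print(cond_mean)
print(cond_cov)
```

In this toy case the expected values of the unmeasured traits shift in the direction of their covariance with the measured trait, and their conditional variances shrink, which is the sense in which known trait values can inform predictions for the others.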
Environmental effects were fitted using generalized additive models
(GAMs) to account for non-linear relationships. We estimated the smooth
terms with thin plate regression splines, using the brms R package
(Bürkner, 2017). Phylogenetic relationships among species were modeled
by including species as a random effect whose correlation structure was
estimated from cross-species cophenetic distances (Paradis et al.,
2019). This structure allows parameter estimates for rare or unsampled
taxa to borrow strength from widely sampled species (de Villemereuil &
Nakagawa, 2014).
We used multivariate normal families and weakly informative priors in
all cases (Appendix S2).
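As a rough illustration of how cophenetic distances can be turned into a correlation structure among species, the sketch below assumes an ultrametric tree and a Brownian-motion model of trait evolution, under which the correlation between two tips equals the shared fraction of their root-to-tip path. Both assumptions, the function name, the species labels, and the distances are hypothetical; the conversion actually used is described in Appendix S2.

```python
def phylo_correlation(dist):
    """Convert a cophenetic distance matrix into a correlation matrix.

    Assumes an ultrametric tree and Brownian motion: the largest pairwise
    distance equals twice the tree height, so the shared fraction of the
    root-to-tip path for tips i and j is 1 - d_ij / max(d).
    """
    max_d = max(max(row) for row in dist)
    return [[1.0 - d / max_d for d in row] for row in dist]


# Hypothetical three-species example: sp_a and sp_b are close relatives,
# sp_c diverged at the root
species = ["sp_a", "sp_b", "sp_c"]
dist = [
    [0.0, 2.0, 10.0],
    [2.0, 0.0, 10.0],
    [10.0, 10.0, 0.0],
]

corr = phylo_correlation(dist)
for name, row in zip(species, corr):
    print(name, [round(value, 2) for value in row])
```

Closely related species receive a high correlation, so a sparsely sampled species inherits information from its well-sampled relatives, while species that split at the root are treated as independent.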
To reduce collinearity and the number of climate predictors, we ran a
separate PCA for each climate variable (net radiation, precipitation,
vapor pressure, maximum temperature, and minimum temperature) on its
monthly averages from 1985 to 2015. We used the first component of each
PCA to represent each climate variable in the environment-only and
combined models. To quantify uncertainty in model accuracy, we used the
95% prediction interval of the Bayesian R² (Gelman et al., 2018). To
reduce computational costs by ~15x without affecting accuracy (Table
S.1), we used the model from a single cross-validation fold to make
predictions across FIA and to test model generalizability. Additional
methodological details are in Appendices S1-S3. Code
for reproducing analyses is available on Zenodo
(https://zenodo.org/badge/latestdoi/353383665).
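The accuracy summary described above can be sketched as follows. This is a minimal illustration of the residual-based Bayesian R², computed once per posterior draw as the variance of the fitted values divided by the sum of that variance and the residual variance, followed by a 95% interval over draws. The function names and the toy data are hypothetical, and the interval uses simple linear interpolation between order statistics rather than any particular package's quantile rule.

```python
import statistics


def bayes_r2_draws(y, pred_draws):
    """Return one R^2 value per posterior draw:
    var(fitted) / (var(fitted) + var(residuals))."""
    r2 = []
    for pred in pred_draws:
        resid = [obs - p for obs, p in zip(y, pred)]
        var_fit = statistics.variance(pred)
        var_res = statistics.variance(resid)
        r2.append(var_fit / (var_fit + var_res))
    return r2


def quantile(values, q):
    """Empirical quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    frac = pos - low
    return ordered[low] + frac * (ordered[high] - ordered[low])


# Hypothetical observed trait values and three posterior draws of predictions
y = [1.0, 2.0, 3.0, 4.0]
pred_draws = [
    [1.0, 2.0, 3.0, 4.0],   # perfect fit
    [2.5, 2.5, 2.5, 2.5],   # constant prediction, explains nothing
    [1.5, 2.0, 3.0, 3.5],   # intermediate fit
]

r2 = bayes_r2_draws(y, pred_draws)
lower, upper = quantile(r2, 0.025), quantile(r2, 0.975)
print(r2)
print(lower, upper)
```

With real model output there would be thousands of draws rather than three, and the interval would summarize how much the explained variance varies across the posterior.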