Algorithm Selection
Previous efforts to model ecosystems in space have largely emphasized biotic, abiotic, or functional response variables (Box 1, Table S1) to predict ecosystem-level variation (Table S2). Our case study differs in that we give biotic and abiotic variables equal emphasis. Furthermore, we aim to “assemble and predict together” (sensu Ferrier and Guisan 2006; see case study overview). Given these conditions, we sought candidate ESPM algorithms that can model biotic and abiotic ecosystem constituents simultaneously and predict both the individual and the shared responses of these variables. In addition, we fit ESPM with in situ (i.e., recorded in the field) training data, thereby better capturing the range of biotic and abiotic complexity that characterizes ecosystems. Under this integrative strategy, ecosystems are treated as an emergent function (sensu Nieto-Lugilde et al 2018) of biotic-abiotic co-occurrence and patterns of local concordance.
Our initial challenge was to define criteria for selecting an algorithm from the numerous examples (Table 2) applied in spatial biodiversity modelling. We screened algorithms according to their flexibility (e.g., regarding data inputs), implementation (e.g., ease of application), analytical properties, and performance (Table 3). Overall, we emphasized algorithms that can accommodate pools of presence/absence or abundance records of both biotic and abiotic variables as training data, employ species traits, and allow for interactions among predictors. We also prioritized algorithms that can predict individual and joint responses in space. To assist with our selection, we drew on recent review articles (Nieto-Lugilde et al 2018, Norberg et al 2019) and assessments of individual model algorithms (e.g., Warton et al 2015, Wilkinson et al 2021) to identify spatial algorithms with relatively high predictive power. Candidate algorithms that met our requirements included joint species distribution modelling, implemented with Hierarchical Modelling of Species Communities (HMSC; Ovaskainen et al 2017); generalized dissimilarity modelling (Mokany et al 2022); and probabilistic bioregion modelling (Hill et al 2020). Of these three algorithms, we selected HMSC for our case study because recent reviews demonstrate its overall flexibility and moderate to high predictive performance for spatial biodiversity modelling (Warton et al 2015, Zhang et al 2018, Norberg et al 2019). Furthermore, the availability of a comprehensive methodological guide with tutorials in R (Ovaskainen and Abrego 2020), training modules, and direct communication (Ovaskainen, Tikhonov pers comm) facilitated implementation for our purpose.
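For orientation, the distinction between individual and joint responses can be illustrated with a generic latent variable formulation of a joint species distribution model for presence/absence data. This is a simplified sketch in our own notation, not the full HMSC specification (which can additionally include, for example, trait effects on the regression coefficients). We assume that y_{ij} is the observed presence or absence of variable j (biotic or abiotic) at site i, x_i is the vector of predictors at site i, β_j holds the variable-specific regression coefficients, η_i ~ N(0, I) are site-level latent factors, and λ_j are the corresponding variable-specific loadings.

```latex
\begin{align}
  y_{ij} &= \mathbb{1}\left(z_{ij} > 0\right), \\
  z_{ij} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}_j
            + \boldsymbol{\eta}_i^{\top}\boldsymbol{\lambda}_j
            + \varepsilon_{ij},
            \qquad \varepsilon_{ij} \sim \mathcal{N}(0, 1), \\
  \Omega_{jj'} &= \boldsymbol{\lambda}_j^{\top}\boldsymbol{\lambda}_{j'} .
\end{align}
```

In this sketch, the term x_i^T β_j describes the individual response of each variable to the predictors, whereas the latent term induces the residual association matrix Ω, which summarizes the shared (joint) responses among variables that the predictors do not explain.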