Definition of the Climatic Niche limits for common ancestors in
the hominin tree
The hominin phylogenetic tree was obtained by combining the Primate (and
human) phylogenetic information published in recent
papers49-51. We started by using the six climatic
variables, representing the limits (minima and maxima) in temperature,
precipitation and NPP. Since these variables are highly correlated to
each other, we reduced covariation among variables by performing a
Principal Component Analysis (PCA) on climatic variables associated with
each hominin species occurrence in the fossil record. Then, we extracted
the PC scores and used them as a multivariate dataset for the
phylogenetic ridge regression.
We used the PC scores estimated by RRphylo at each node and back
transformed the scores in climatic variables (MinTemp, MinPrec, min NPP,
MaxTemp, MaxPrec, max NPP). For each node in the hominin tree, we mapped
geographically the areas associated with its corresponding climatic
estimates (i.e. the limits of the climatic envelope). The resulting map
thus represents the geographic areas estimated to be climatically
suitable for occupation by each ancestor in the tree. To account for
uncertainty around the ages of individual nodes in the hominin
phylogeny, we repeated the entire procedure at each node over 100
replicates by using the 100 alternative phylogenies generated fromswapONE function embedded in the RRphylo package. This function
randomly changes the tree topology and branch lengths although it is
possible to keep specific clades monophyletic. However, we accounted for
a few, well-supported, monophyletic clades which are present in the
hominin tree. In particular, in swapping the tree tips, we kept H.
ergaster and H. erectus as sister species. We similarly kept
monophyletic the clade subtending to the four australopithecines in the
tree, the clade representing the genus Homo , and the clade
including H. heidelbergensis , H. neanderthalensis andH. sapiens . Eventually, we recorded the number of times a given
geographical cell counts as climatically suitable out of the 100
replicates. For each cell, estimated habitat quality thus ranges between
0 (never suitable irrespective of the phylogenetic tree used for
climatic reconstructions at the tree nodes) to 1 (100 times suitable).
Since the inclusion of particular taxa in the data may alter
significantly the result of PCA ordination52 we
repeated the swap procedure leaving one species at random out of the
tree for each replicate.
To measure the association between climatic suitability and the presence
of human species, we calculated the Area Under receiver-operator Curve
(AUC) averaging over the 100 replicates. AUC theoretically ranges from 0
to 1. However, since random sampling points, (pseudoabsences) are not
real absences, AUC cannot reach 153, as the maximum
AUC value depends on the actual (unknown) area of distribution of the
species. To obtain a null distribution of AUC values and assess
significance for the real AUC, for each node in the tree we sampled 100
times as many point occurrences as with the real data (i.e. the fossil
occurrences of the species descending from that node) within the
biogeographic domain of the species, and calculated the random AUC.
Climatic tolerance values at the tree node represent the estimated
tolerance limits for hominin ancestors. Since these values are
estimated, rather than observed, to assess the association between
fossil localities and habitat quality for each ancestral species
estimates we selected the fossil occurrences of its descendants,
provided they are not included in a descending node which was itself
tested. For instance, the EHS ancestor was tested by selecting the
fossil occurrence of H. habilis , H. erectus and H.
ergaster , but not H. sapiens , H. neanderthalensis andH. sapiens which were considered only descendant to the MHS
ancestor. To account for sampling differences between the hominin
species we analysed, we repeated the AUC computation after sampling
randomly no more than 100 occurrences per species at each replicate.
We used the evolutionary rates provided by RRphylo to apply the
function search.shift 23 which tests whether
individual clades evolved at significantly different rates as compared
to the rest of the tree. The function compares the rates attached to
each branch descending from a particular node to the rates for the
branches of the rest of the tree. The significance for the rate
difference is assessed by means of randomization. In the case of
multivariate data, as with this particular study, the multivariate rate
is computed as the 2-Norm (Euclidean) vector of the rates of individual
variables.
To look for possible evolutionary trends in climatic tolerances over
time we used the function search.trend 54 in theRRphylo R package23. In search.trend ,
evolutionary rates and phenotypes (including the phenotypic estimates at
the nodes) are regressed against their age and the resulting slopes
compared to slopes randomly generated under the Brownian motion model of
evolution.