Abstract
Identifying the main environmental drivers of SARS-CoV-2
transmissibility in the population is crucial for understanding current
and potential future outbreaks of COVID-19 and other infectious
diseases. To address this problem, we concentrate on the basic reproduction
number R0, which is insensitive to testing
coverage and represents transmissibility in the absence of social
distancing and in a completely susceptible population. While many
variables may influence R0, high
correlation among these variables may obscure interpretation of the
results. Consequently, we combine Principal Component Analysis
with feature selection methods from several regression-based approaches
to identify the main demographic and meteorological drivers behind
R0. We robustly find that a country’s
wealth/development (GDP per capita or Human Development Index) is by far
the most important predictor of R0, probably because it is a
good proxy for the overall contact frequency in a population. This main
effect is modulated by built-up area per capita (crowdedness in indoor
space), onset of infection (likely related to increased awareness of
infection risks), net migration, unhealthy lifestyle/living conditions
(including pollution), seasonality, and possibly BCG vaccination
prevalence. We also show that several variables that significantly
correlate with transmissibility either do not directly influence
R0 or affect it differently than a
naïve analysis would suggest.
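
As an illustration only, the general idea of combining Principal Component Analysis with regression-based feature selection can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: the data are synthetic, a lasso fitted by coordinate descent stands in for the unspecified regression-based approaches, and all names (standardize, pca_scores, lasso_cd, alpha) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): 200 units, 6 predictors.
# Predictors 0-2 share one hidden factor, 3-4 share another, 5 is pure noise.
n = 200
latent = rng.normal(size=(n, 2))
X = np.column_stack(
    [latent[:, 0] + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [latent[:, 1] + 0.1 * rng.normal(size=n) for _ in range(2)]
    + [rng.normal(size=n)]
)
# Synthetic response driven mostly by the first hidden factor.
y = 2.0 * latent[:, 0] + 0.5 * latent[:, 1] + 0.1 * rng.normal(size=n)


def standardize(a):
    """Center each column and scale it to unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


def pca_scores(X_std):
    """PCA via SVD: returns scores, loadings (rows of Vt), variance ratios."""
    U, S, Vt = np.linalg.svd(X_std, full_matrices=False)
    scores = U * S                      # equals X_std @ Vt.T
    explained = S**2 / np.sum(S**2)
    return scores, Vt, explained


def lasso_cd(Z, y, alpha, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Z b||^2 + alpha*||b||_1."""
    n_obs, p = Z.shape
    b = np.zeros(p)
    col_sq = (Z**2).sum(axis=0) / n_obs
    for _ in range(n_iter):
        for j in range(p):
            r = y - Z @ b + Z[:, j] * b[j]          # partial residual
            rho = Z[:, j] @ r / n_obs
            b[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return b


# Step 1: decorrelate the predictors with PCA.
scores, Vt, explained = pca_scores(standardize(X))

# Step 2: sparse regression on the (standardized) component scores.
coef = lasso_cd(standardize(scores), y - y.mean(), alpha=0.2)

# Step 3: read the loadings of the dominant component to see which
# original predictors it represents.
top_pc = int(np.argmax(np.abs(coef)))
print("explained variance ratios:", np.round(explained, 3))
print("lasso coefficients on PCs:", np.round(coef, 3))
print("loadings of selected PC  :", np.round(Vt[top_pc], 3))
```

Because the component scores are mutually uncorrelated, a coefficient on one component is not distorted by the others, which is the motivation for decorrelating before selecting. The loadings then map a selected component back to the original, correlated variables.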