Abstract
Identifying the main environmental drivers of SARS-CoV-2
transmissibility in the population is crucial for understanding current
and potential future outbreaks of COVID-19 and other infectious
diseases. To address this problem, we concentrate on the basic reproduction
number R0, which is not sensitive to testing coverage and represents
transmissibility in the absence of social distancing and in a completely
susceptible population. While many variables may influence R0, high
correlation among these variables can obscure the interpretation of the
results. Consequently, we combine Principal Component Analysis
with feature selection methods from several regression-based approaches
to identify the main demographic and meteorological drivers behind R0.
We robustly find that a country’s wealth/development (GDP per capita or
Human Development Index) is by far the most important R0 predictor,
probably because it is a good proxy for the overall contact frequency in a
population. This main effect is modulated by built-up area per capita
(crowdedness in indoor space), onset of infection (likely related to
increased awareness of infection risks), net migration, unhealthy
lifestyle and living conditions (including pollution), seasonality, and possibly BCG
vaccination prevalence. We also show that several variables that
correlate significantly with transmissibility either do not directly
influence R0 or affect it differently than a naive analysis would suggest.
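
The idea of combining Principal Component Analysis with regression to handle correlated predictors can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the variable names (`gdp`, `hdi`, `unrelated`) and the helper functions are hypothetical, and plain least squares stands in for the several regression-based feature selection methods mentioned above.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca(X, n_components):
    """PCA via SVD of the standardized data.

    Returns component scores, loadings (one row per component),
    and the fraction of variance explained by each kept component.
    """
    Z = standardize(X)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Z @ Vt[:n_components].T, Vt[:n_components], explained[:n_components]

def ols(X, y):
    """Least-squares fit with intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Synthetic data (assumption): two highly correlated "wealth" indicators
# and one unrelated variable; the response depends only on latent wealth.
rng = np.random.default_rng(0)
n = 200
wealth = rng.normal(size=n)
gdp = wealth + 0.1 * rng.normal(size=n)
hdi = wealth + 0.1 * rng.normal(size=n)
unrelated = rng.normal(size=n)
X = np.column_stack([gdp, hdi, unrelated])
r0 = 2.0 + 0.8 * wealth + 0.1 * rng.normal(size=n)

# Replace the correlated predictors by uncorrelated principal components,
# then regress the response on the component scores.
scores, loadings, explained = pca(X, n_components=2)
coef = ols(scores, r0)

print("explained variance ratio:", explained)
print("PC1 loadings (gdp, hdi, unrelated):", loadings[0])
print("regression coefficients (intercept, PC1, PC2):", coef)
```

In this toy setting the first component loads mainly on the two correlated wealth indicators, so regressing on component scores attributes the effect to a single "wealth" axis instead of splitting it unstably between `gdp` and `hdi`.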