Deciphering the evolutionary mechanisms of SARS-CoV-2: Absence of ORF8
protein and its potential advantage in the emergence of viral lineages
Lauro Velazquez-Salinas
Affiliation
Plum Island Animal Disease Center,
Agricultural Research Service, USDA, Greenport, NY 11944, USA
Corresponding author:
Lauro Velazquez-Salinas:
lauro.velazquez@usda.gov
To The Editor,
To date (2/5/2023), nearly three years after the official declaration of
the COVID-19 pandemic, a total of 754,018,841 cases and 6,817,478
deaths have been officially reported to the World Health Organization
(WHO) (https://covid19.who.int/). During this time, the global
population has witnessed the devastating effects of this pandemic, which
can undoubtedly be considered the public health event of the century. In my
opinion, understanding the mechanisms associated with the evolution of
SARS-CoV-2 is the most important aspect of understanding the course of
this pandemic. In this respect, the millions of viral sequences
generated during the pandemic (1) have significantly aided in the
tracking of the pandemic, resulting in the identification of multiple
variants of concern.
At the start of the pandemic, the evolutionary patterns of SARS-CoV-2
and the potential role of natural selection in the emergence of new
lineages were heavily debated (2). However, now several years into the
pandemic, it is evident that SARS-CoV-2 has a tremendous ability to
continuously evolve into new lineages with increased viral fitness (3),
highlighting the significance of natural selection in this process (4,
5, 6).
Over the course of the pandemic, more than 1300 lineages have been
identified under the Pango lineage classification (7). This
diversification has distributed the virus across multiple phylogenetic
clades, which have alternated in dominance over the course of the
pandemic (Figure 1A). Furthermore, multiple variants of concern (VOC)
and variants of interest (VOI) have been identified from these clades,
with the clade GRA comprising the VOC Omicron, which became dominant in
2022 (Figure 1B). In this
sense, the role of natural selection in the evolutionary history of
SARS-CoV-2 can be evidenced using multiple evolutionary algorithms based
on the detection of natural selection at specific sites (MEME) or branch
level (aBSREL and BUSTED) in a codon-based phylogenetic framework (8).
As a result, the presence of multiple branches evolving under positive
selection (Figure 1C) is consistent with the emergence and the dominance
of different VOC (Figure 1B). Furthermore, the detection of these
branches is associated with the discovery of specific residues under
positive selection, which in some cases have been linked with increased
infectivity (spike-452) (9), antibody resistance (spike-371) (10),
generation of potential cleavage sites (spike-856) (11),
transmissibility (Nucleocapsid-13) (12), and immune evasion (NSP6-105)
(13). These associations support the inferences produced by multiple
evolutionary algorithms and highlight the relevance of natural selection
during the evolution of this pathogen.
In addition, the strength of natural selection was evaluated over the
course of the evolution of different phylogenetic clades by
contrasting dN/dS (ratio of non-synonymous to synonymous
substitutions) signatures between internal nodes and leaf nodes using
the evolutionary algorithm RELAX (14). Overall, strong evidence of
positive selection was found at internal nodes in comparison to the leaf
nodes (dN/dS = 1.5576 vs 0.5025, respectively). Evidence of relaxation
of natural selection at leaf nodes (K = 0.73, p < 0.05;
K = relaxation/intensification parameter) was inferred, with
diversifying selection being the main evolutionary force affected by the
relaxed selection (Figure 1C). In light of these results, it is possible
to suggest that the emergence of the lineages associated with different
phylogenetic clades is initiated by a strong signal of diversifying
selection (dN/dS= 411.436), which tends to relax during the time of
circulation of different lineages. However, despite this relaxation in
the diversifying selection, the strength of this evolutionary force may
be considered strong enough (dN/dS= 82.176) to improve the viral
fitness, leading to the emergence of new subvariants, as is the case for
the currently dominant VOC Omicron (15). Moreover, considering the current
dominance of the VOC Omicron and the lack of clinical cases produced by
other VOC, the strength of selection between the GRA clade and the rest
of the clades was compared (Figure 1D). Interestingly, evidence of
selection intensification was observed (K=1.89, p<0.05),
affecting mainly the strength of diversifying selection during the
evolution of lineages associated with the GRA clade. The large difference
in the strength of diversifying selection between GRA and the rest of the
clades clearly explains the dominance of the Omicron variant over the
rest of the VOC. While more research is needed to understand the factors
favoring the diversification of this variant, some evidence indicates
that vaccine-breakthrough or antibody-resistant mutations may be the
main mechanism associated with this process (16).
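As an illustration of how the relaxation/intensification parameter K
reported above can be read, the following minimal Python sketch assumes
the standard RELAX parameterization, in which the dN/dS value of each
selection class on the test branches equals the corresponding reference
value raised to the power K. Treating internal nodes as the reference
set and leaf nodes as the test set is an assumption made here for
illustration, and the function names are hypothetical; this is not the
analysis pipeline used to produce Figure 1.

```python
import math


def relax_transform(omega_reference: float, k: float) -> float:
    """Apply the assumed RELAX relationship: omega_test = omega_reference ** k."""
    return omega_reference ** k


def implied_k(omega_reference: float, omega_test: float) -> float:
    """Solve omega_test = omega_reference ** k for k (both omegas must be > 0 and != 1)."""
    return math.log(omega_test) / math.log(omega_reference)


# dN/dS values for the diversifying-selection class quoted in the text.
omega_internal = 411.436  # internal nodes (assumed reference set)
omega_leaf = 82.176       # leaf nodes (assumed test set)

# Leaf-node value predicted from the rounded K = 0.73 reported in the text.
predicted_leaf = relax_transform(omega_internal, 0.73)

# K implied by the two reported dN/dS values.
k_from_values = implied_k(omega_internal, omega_leaf)

print(f"predicted leaf dN/dS: {predicted_leaf:.2f}")
print(f"implied K: {k_from_values:.3f}")
```

Under these assumptions, K < 1 pulls dN/dS toward 1 (relaxation) and
K > 1 pushes it away from 1 (intensification). Raising 411.436 to the
power 0.73 gives a value of about 81, close to the 82.176 reported for
leaf nodes; the small gap is what one would expect from rounding K to
two decimals.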
However, beyond these evolutionary mechanisms, additional adaptive
strategies should be considered as part of this complex evolutionary
equation. In this regard, the article published by Colson et al., 2022
(21) is an interesting example of another potential evolutionary
strategy used by SARS-CoV-2 to persist during this pandemic. In the
study, the authors described the emergence of the subvariant
Marseille-4B (B.1.160 Pangolin lineage; GH clade) circulating in France
between September 2020 and March 2021. Remarkably, this variant, which
carries a nonsense mutation that inactivates the ORF8 gene, whose protein
product elicits a strong immune response during SARS-CoV-2 infection
(22), became the dominant variant in December 2020 after the
disappearance of multiple Marseille-A lineages. This observation led the
authors to propose an intriguing hypothesis about the potential adaptive
advantage of Marseille-4B after losing the ORF8 protein.
This hypothesis is supported by the dN/dS values, which suggest
that the ORF8 gene may be subject to positive selection (Figure 2A).
Furthermore, the presence of the predicted CTL and B cell epitopes (6,
23) in this protein raises the possibility that the absence of ORF8
might have helped Marseille-4B avoid both antibody and T cell
recognition.
On the other hand, some important aspects of the function
of ORF8 must be considered before supporting the hypothesis proposed by
Colson et al. The ORF8 protein may also contribute to the immune evasion
properties of SARS-CoV-2, as highlighted by its ability to
downregulate the expression of MHC-I or to inhibit IFN-β
or IFN-γ (24). This may explain why SARS-CoV-2 viruses lacking a
functional ORF8 gene did not become the dominant variant during the
pandemic, since the majority of these infections were associated with
healthy immunocompetent individuals, a theory that is shared by others
(27). Although the prevalence of viral lineages lacking ORF8 during the
pandemic can be considered low, an increase in this phenotype was
observed during 2020-2021 (Figure 2B), a situation mainly associated with
the emergence of the VOC Alpha (Figures 2C and 2D). According to my
estimation, the prevalence of Omicron phenotypes lacking ORF8 was as low
as 0.017%. However, considering the emergence of the Alpha variant and
the subvariant Marseille-4B, the lack of ORF8 protein in newly emerging
lineages should be monitored. Together, these observations illustrate the importance
of the hypothesis proposed by Colson et al. as a potential mechanism of
adaptation of SARS-CoV-2. Future research is needed to understand the
relevance of this finding.
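The prevalence estimate discussed above depends on flagging genomes
whose ORF8 coding sequence is interrupted. This letter does not describe
how that was done, so the following Python sketch is only one possible
approach under stated assumptions: ORF8 coding sequences have already
been extracted in frame, and a sequence is counted as lacking ORF8 when
an in-frame stop codon appears before its final codon. The function
names and the toy sequences are hypothetical and are not real viral
data.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(cds: str) -> bool:
    """Return True if an in-frame stop codon occurs before the final codon."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is excluded because a stop there is the normal terminator.
    return any(codon in STOP_CODONS for codon in codons[:-1])


def prevalence_percent(cds_list: list[str]) -> float:
    """Percentage of sequences carrying a premature in-frame stop codon."""
    if not cds_list:
        return 0.0
    truncated = sum(has_premature_stop(cds) for cds in cds_list)
    return 100.0 * truncated / len(cds_list)


# Toy in-frame sequences (hypothetical, for illustration only).
toy_sequences = [
    "ATGAAATTTTAA",  # intact: the only stop is the final codon
    "ATGTAATTTTAA",  # premature stop at the second codon
    "ATGAAAGGGTAA",  # intact
    "ATGCCCTTTTAG",  # intact
]

toy_prevalence = prevalence_percent(toy_sequences)
print(f"{toy_prevalence:.1f}% of toy sequences carry a premature stop")
```

This check only captures nonsense mutations. Deletions, frameshifts, or
loss of the start codon would need additional rules, so a real survey
would likely combine several criteria.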