Deciphering the evolutionary mechanisms of SARS-CoV-2: Absence of ORF8 protein and its potential advantage in the emergence of viral lineages
Lauro Velazquez-Salinas
Affiliation
Plum Island Animal Disease Center, Agricultural Research Service, USDA, Greenport, NY 11944, USA
Corresponding author:
Lauro Velazquez-Salinas: lauro.velazquez@usda.gov
To the Editor,
To date (2/5/2023), nearly three years after the official declaration of the COVID-19 pandemic, a total of 754,018,841 cases and 6,817,478 deaths have been officially reported to the World Health Organization (WHO) (https://covid19.who.int/). During this time, the global population has witnessed the devastating effects of this pandemic, which can undoubtedly be considered the public health event of the century. In my opinion, understanding the mechanisms associated with the evolution of SARS-CoV-2 is the most important aspect of understanding the course of this pandemic. In this respect, the millions of viral sequences generated during the pandemic (1) have significantly aided in tracking its course, resulting in the identification of multiple variants of concern.
At the start of the pandemic, the evolutionary patterns of SARS-CoV-2 and the potential role of natural selection in the emergence of new lineages were heavily debated (2). However, now several years into the pandemic, it is evident that SARS-CoV-2 has a tremendous ability to continuously evolve into new lineages with increased viral fitness (3), highlighting the significance of natural selection in this process (4, 5, 6).
Over the course of the pandemic, more than 1,300 lineages have been identified under the Pango lineage classification (7). This has resulted in the diversification of the virus into multiple phylogenetic clades, which have alternated in dominance, in terms of the share of infections they account for, during this public health event (Figure 1A). Furthermore, multiple variants of concern (VOC) and variants of interest (VOI) have been identified from these clades, with the clade GRA comprising the VOC Omicron, which became dominant in 2022 (Figure 1B). In this sense, the role of natural selection in the evolutionary history of SARS-CoV-2 can be evidenced using multiple evolutionary algorithms that detect natural selection at specific sites (MEME) or at the branch level (aBSREL and BUSTED) in a codon-based phylogenetic framework (8). As a result, the presence of multiple branches evolving under positive selection (Figure 1C) is consistent with the emergence and dominance of different VOC (Figure 1B). Furthermore, the detection of these branches is associated with the discovery of specific residues under positive selection, which in some cases have been linked with increased infectivity (spike-452) (9), antibody resistance (spike-371) (10), generation of potential cleavage sites (spike-856) (11), transmissibility (nucleocapsid-13) (12), and immune evasion (NSP6-105) (13). These associations support the inferences produced by the evolutionary algorithms and highlight the relevance of natural selection during the evolution of this pathogen.
The strength of natural selection was also evaluated over the course of the evolution of the different phylogenetic clades by contrasting dN/dS (ratio of non-synonymous to synonymous substitutions) signatures between internal nodes and leaf nodes using the evolutionary algorithm RELAX (14). Overall, strong evidence of positive selection was found at internal nodes in comparison with leaf nodes (dN/dS = 1.5576 vs 0.5025, respectively). Evidence of relaxation of natural selection at leaf nodes was inferred (K = 0.73, p < 0.05; K = relaxation/intensification parameter), with diversifying selection being the main evolutionary force affected by the relaxed selection (Figure 1C). In light of these results, it is possible to suggest that the emergence of the lineages associated with different phylogenetic clades is initiated by a strong signal of diversifying selection (dN/dS = 411.436), which tends to relax during the period of circulation of the different lineages. However, despite this relaxation, diversifying selection may still be considered strong enough (dN/dS = 82.176) to improve viral fitness, leading to the emergence of new subvariants, as is the case for the currently dominant VOC Omicron (15). Moreover, considering the current dominance of the VOC Omicron and the lack of clinical cases produced by other VOC, the strength of selection in the GRA clade was compared with that in the rest of the clades (Figure 1D). Interestingly, evidence of selection intensification was observed (K = 1.89, p < 0.05), mainly affecting the strength of diversifying selection during the evolution of lineages associated with the GRA clade. The large difference in the strength of diversifying selection between GRA and the rest of the clades clearly explains the dominance of the Omicron variant over the rest of the VOC.
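As a reading aid for the K parameter, the short Python sketch below checks how the reported values relate to one another. It assumes the relationship used in the published description of RELAX, in which each dN/dS rate class on the test branches equals the corresponding reference class raised to the power K. It also assumes that 411.436 (internal nodes) and 82.176 (leaf nodes) are the diversifying-selection class from the same model fit, which the text implies but does not state outright. The function and constant names are illustrative only and are not part of any RELAX implementation.

```python
import math


def omega_under_relax(omega_reference: float, k: float) -> float:
    """dN/dS on the test branches implied by omega_test = omega_reference ** k."""
    return omega_reference ** k


def implied_k(omega_reference: float, omega_test: float) -> float:
    """K implied by a pair of dN/dS values under the same relationship."""
    return math.log(omega_test) / math.log(omega_reference)


# Values quoted in the letter (assumed here to be the diversifying-selection class).
OMEGA_INTERNAL = 411.436  # internal nodes, treated as the reference
OMEGA_LEAF = 82.176       # leaf nodes, treated as the test set
K_REPORTED = 0.73         # K < 1 indicates relaxation, K > 1 intensification

predicted_leaf = omega_under_relax(OMEGA_INTERNAL, K_REPORTED)
k_from_omegas = implied_k(OMEGA_INTERNAL, OMEGA_LEAF)

print(f"leaf dN/dS predicted from K = {K_REPORTED}: {predicted_leaf:.1f}")
print(f"K implied by the two reported dN/dS values: {k_from_omegas:.3f}")
```

Under these assumptions, the predicted leaf value is close to the reported 82.176, and the implied K rounds to the reported 0.73, so the quoted figures are mutually consistent. The same relationship explains why K = 1.89 for the GRA clade is read as intensification: an exponent above 1 pushes a dN/dS class that is already above 1 further from neutrality.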
While more research is needed to understand the factors favoring the diversification of this variant, some evidence indicates that vaccine-breakthrough or antibody-resistant mutations may be the main mechanism associated with this process (16).
However, beyond these evolutionary mechanisms, additional adaptive strategies should be considered as part of this complex evolutionary equation. In this regard, the article published by Colson et al., 2022 (21) is an interesting example of another potential evolutionary strategy used by SARS-CoV-2 to persist during this pandemic. In the study, the authors described the emergence of the subvariant Marseille-4B (B.1.160 Pangolin lineage; GH clade), which circulated in France between September 2020 and March 2021. Remarkably, this variant carries a nonsense mutation that inactivates the ORF8 gene, which encodes a protein that elicits a strong immune response during SARS-CoV-2 infection (22), and it became the dominant variant in December 2020 after the disappearance of multiple Marseille-A lineages. This observation led the authors to propose an intriguing hypothesis about the potential adaptive advantage gained by Marseille-4B after losing the ORF8 protein.
This hypothesis is supported by dN/dS values, which suggest that the ORF8 gene may be subject to positive selection (Figure 2A). Furthermore, the presence of predicted CTL and B cell epitopes (6, 23) in this protein raises the possibility that the absence of ORF8 might have helped Marseille-4B avoid both antibody and T cell recognition.
On the other hand, some important aspects of the function of ORF8 must be considered before supporting the hypothesis proposed by Colson et al. The ORF8 protein may also contribute to the ability of SARS-CoV-2 to counteract host antiviral responses, as highlighted by its ability to downregulate the expression of MHC-I and to inhibit IFN-β or IFN-γ (24). This may explain why SARS-CoV-2 viruses lacking a functional ORF8 gene did not become the dominant variant during the pandemic, since the majority of these infections were associated with healthy, immunocompetent individuals, a theory that is shared by others (27). Although the prevalence of viral lineages lacking ORF8 during the pandemic can be considered low, an increase in this phenotype was observed during 2020-2021 (Figure 2B), a situation mainly associated with the emergence of the VOC Alpha (Figures 2C and 2D). By my estimate, the prevalence of Omicron phenotypes lacking ORF8 was as low as 0.017%. However, considering the emergence of the Alpha variant and the subvariant Marseille-4B, the lack of ORF8 protein in newly emerging lineages should be monitored. Together, these observations illustrate the importance of the hypothesis proposed by Colson et al. as a potential mechanism of adaptation of SARS-CoV-2. Future research is needed to understand the relevance of this finding.
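For readers who wish to carry out this kind of monitoring, the following Python sketch shows one simple way to estimate the prevalence of sequences with an ORF8 nonsense mutation. It is not the procedure used for the estimates above, which is not described in this letter. It assumes that the ORF8 coding sequences have already been extracted in frame from aligned genomes, and the sequences in the example are short, invented placeholders rather than real ORF8 data. It flags only in-frame premature stop codons, so deletions and frameshifts that also abolish ORF8 would require additional checks.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(cds: str) -> bool:
    """True if an in-frame stop codon occurs before the last complete codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore trailing bases that do not form a codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is expected to be the natural stop, so it is excluded.
    return any(codon in STOP_CODONS for codon in codons[:-1])


def prevalence_percent(cds_list: list[str]) -> float:
    """Percentage of coding sequences carrying a premature stop codon."""
    if not cds_list:
        return 0.0
    truncated = sum(has_premature_stop(cds) for cds in cds_list)
    return 100.0 * truncated / len(cds_list)


# Invented placeholder sequences (NOT real ORF8), one of which has an early stop.
TOY_SEQUENCES = [
    "ATGAAATTTTAA",  # intact: ATG AAA TTT TAA
    "ATGTAATTTTAA",  # premature stop at the second codon
    "ATGAAACTTTGA",  # intact: ATG AAA CTT TGA
    "ATGAAATTTTAA",  # intact
]

print(f"prevalence of premature stops: {prevalence_percent(TOY_SEQUENCES):.1f}%")
```

Applied per lineage and per collection period, a count of this kind would give the type of prevalence trend summarized in Figure 2B, subject to the limitations noted above.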