Legends
Figure 1. Phylogenomics of SARS-CoV-2 showing the relevance of diversifying selection during the evolution of this pathogen. A) Spatial dynamics of the different phylogenetic clades during the evolution of SARS-CoV-2 (based on the GISAID classification). B) Circulation dynamics of the variants of concern (VOC) and variants of interest (VOI) during the pandemic. C) Phylogenetic analysis conducted by maximum likelihood using the general reversible model, showing the multiple clades predicted during the evolution of SARS-CoV-2 (GISAID classification). The tree was constructed using 40 representative sequences. Branches under diversifying selection were identified with the algorithm aBSREL (p < 0.0001) (17) (blue branches marked with a black asterisk). Evidence of diversifying selection on the tree was also confirmed with the algorithm BUSTED (p < 0.0001) (18). Sites under positive selection were identified with the algorithm MEME (p < 0.05) (19). These sites were mapped to the different branches of the tree with BUSTED, considering an empirical Bayes factor > 10. Evidence of relaxation of diversifying selection during the evolution of distinct lineages was obtained by comparing the strength of selection between internal nodes and leaf nodes using the algorithm RELAX. D) Evidence of intensified diversifying selection in the GRA clade (which comprises the VOC Omicron) relative to the rest of the clades was obtained with the algorithm RELAX. Representative sequences and statistics on the number of sequences associated with the multiple clades, VOC, and VOI were obtained from the GISAID database (20).
Figure 2. Prevalence of SARS-CoV-2 phenotypes lacking ORF8 protein. A) To place the evolutionary dynamics of ORF8 in context relative to the rest of the SARS-CoV-2 proteins during the pandemic, the sequences used to build the tree shown in Figure 1C were analyzed with the evolutionary algorithm SLAC (28). B) Using the GISAID database, the overall proportion of lineages lacking ORF8 protein was quantified. Based on the information shown in Figures 1A and 1B, the proportion of lineages (C) and of VOC or VOI (D) lacking ORF8 protein in each phylogenetic clade was estimated.
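The per-clade proportions described for Figure 2C and 2D can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `proportion_lacking_orf8`, the `(clade, has_orf8)` record layout, and the example records are all hypothetical, and it assumes that each lineage has already been labelled as carrying or lacking an intact ORF8.

```python
from collections import defaultdict


def proportion_lacking_orf8(records):
    """Return, for each clade, the fraction of lineages lacking ORF8.

    `records` is an iterable of (clade, has_orf8) pairs. This layout is an
    assumption made for illustration, not a GISAID export format.
    """
    totals = defaultdict(int)   # lineages seen per clade
    lacking = defaultdict(int)  # lineages without ORF8 per clade
    for clade, has_orf8 in records:
        totals[clade] += 1
        if not has_orf8:
            lacking[clade] += 1
    return {clade: lacking[clade] / totals[clade] for clade in totals}


# Made-up records, used only to show the calculation (not real data).
example_records = [
    ("GRA", True),
    ("GRA", False),
    ("GRY", False),
    ("GRY", False),
    ("GH", True),
]
proportions = proportion_lacking_orf8(example_records)
```

The same function gives the overall proportion described for panel B if every record is assigned a single shared clade label.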