We, the students of MICI5029/5049, a Graduate Level Molecular Pathogenesis Journal Club at Dalhousie University in Halifax, NS, Canada, hereby submit a review of the following BioRxiv preprint:

Keir M. Balla, Marlen C. Rice, James A. Gagnon, Nels C. Elde. Discovery of a prevalent picornavirus by visualizing zebrafish immune responses. BioRxiv 823989; doi: http://dx.doi.org/10.1101/823989

We will adhere to the Universal Principled (UP) Review guidelines proposed in:

Krummel M, Blish C, Kuhns M, Cadwell K, Oberst A, Goldrath A, Ansel KM, Chi H, O’Connell R, Wherry EJ, Pepper M; Future Immunology Consortium. (2019) Universal Principled Review: A Community-Driven Method to Improve Peer Review. Cell. 2019 Dec 12;179(7):1441-1445. doi: https://doi.org/10.1016/j.cell.2019.11.029

SUMMARY: The revolution in metagenomics has revealed the ubiquity of viruses in the environment. Viruses outnumber hosts, and only a small fraction cause disease. One way to identify potentially pathogenic viruses specifically associated with a host is to select for engagement of antiviral immune responses. Here, Balla KM, et al. report on the development of a transgenic zebrafish line that produces green fluorescent protein (GFP) in response to the antiviral zebrafish type I interferon (IFN) protein, IFNφ. They observed spontaneous GFP expression in a minority of zebrafish only days after hatching. They employed RNA sequencing and 5’-RACE on the GFP-expressing fish to identify the complete genome of ZfPV, a new picornavirus distantly related to known viruses. By conducting bioinformatic analyses on publicly available sequence data, they identified ZfPV in seemingly asymptomatic fish in labs worldwide. They observed higher viral load in clonal CG2 zebrafish that have a single core MHC haplotype. They documented infection of the GI tract, as well as other tissues, but the natural history of infection remains to be determined. 
They confirmed authentic IFN responses in the GFP+ zebrafish by identifying increased expression of numerous interferon stimulated genes (ISGs).

OVERALL ASSESSMENT:

STRENGTHS: In general, the data presented adequately supports the authors’ conclusions. The manuscript is well written and the data is clearly presented. Attention is paid to controls and appropriate statistical tests are applied to demonstrate significance. The authors make good use of pre-existing datasets to add strength to their findings. The manuscript has a strong element of discovery, with the serendipitous identification of a new zebrafish picornavirus. The discovery methods described in this manuscript may be employed in other model organisms.

WEAKNESSES: The primary weakness of the study was that the authors did not clearly establish causality. The conclusion could be strengthened by isolating the candidate picornavirus (using well-established methods for human picornaviruses), transmitting it to naïve zebrafish, and demonstrating replication and ISG production, which could be accomplished using RT-qPCR. To be clear, further characterization of the natural history of infection and host determinants is not required, but some demonstration of an infectious agent is necessary to support the authors’ conclusions. Other minor weaknesses are detailed below.

DETAILED U.P. ASSESSMENT:

OBJECTIVE CRITERIA (QUALITY)

Quality: Experiments (1–3 scale) SCORE = 1

Figure by figure, do experiments, as performed, have the proper controls?

The experiments are properly controlled throughout.

Are specific analyses performed using methods that are consistent with answering the specific question?

The bioinformatic analysis methods are largely consistent with standards in the field. Generating phylogenies with different methods (Fig. 2 – supplement 2) was considered a strength of the study.

One minor point is that the sequencing depth of non-zebrafish reads is relatively low. 
There were 175 million starting reads, but only 8% of these (14 million) were non-zebrafish. Most of these remaining reads mapped to bacteria or were unknown, so for the purpose of discovering novel viruses these experiments only had the power to detect abundant viruses. This was clearly sufficient to identify ZfPV, but higher sequencing depth would have given a fuller view of the whole virome. In this regard, it would be helpful if the authors could address potential limitations of their analysis in more detail.

Is there the appropriate technical expertise in the collection and analysis of data presented?

The experimentation and collection of data were generally of high quality throughout.

In the Methods section, it would be helpful if the authors could clarify their reasoning behind pooling 30 GFP positive fish to harvest RNA for sequencing.

Do analyses use the best-possible (most unambiguous) available methods quantified via appropriate statistical comparisons?

For the most part, appropriate statistical comparisons are made throughout.

In Figure 4, RT-qPCR data was displayed as relative ZfPV levels on a log scale y-axis by normalizing to the lowest non-zero amount of ZfPV RNA. This is common practice in the field. However, the unpaired t-test assumes equal variances, which may not be the case here. The analysis could be strengthened by determining whether the data is normally distributed (e.g. Shapiro-Wilk test). If the data is not normally distributed, then a non-parametric test (e.g. Wilcoxon rank-sum test) could be performed.

Are controls or experimental foundations consistent with established findings in the field? A review that raises concerns regarding inconsistency with widely reproduced observations should list at least two examples in the literature of such results. 
Addressing this question may occasionally require a supplemental figure that, for example, re-graphs multi-axis data from the primary figure using established axes or gating strategies to demonstrate how results in this paper line up with established understandings. It should not be necessary to defend exactly why these may be different from established truths, although doing so may increase the impact of the study and discussion of discrepancies is an important aspect of scholarship.

The experimental foundations of the study are generally solid.

FIG 1: As a minor point, there was discussion in the group about the merits of the dual promoter construct driving eGFP from both the cmlc2 (cardiac) promoter and the IFN-inducible ISG15 promoter. There was concern that using the same fluorescent protein for both promoters made it impossible to determine IFN-inducible gene expression in cardiac tissue. However, the counterargument was that the levels of eGFP in the cardiac tissue could provide a useful reference for IFN-inducible eGFP expression throughout the fish, acting as a normalization control for fluorescence images. Some additional rationale describing the merits of the expression construct would clarify this point.

FIG 5: There was also discussion about the strength of the evidence for a virus-specific response to ZfPV infection, by comparison to M. marinum infection. Does the fact that 12 or 13 genes are induced in response to ZfPV infection, but not M. marinum infection, truly represent a specific signature? The authors may choose to soften this description until more data becomes available in support of a ZfPV-specific host response signature.

Quality: Completeness (1–3 scale) SCORE = 2.5

Does the collection of experiments and associated analysis of data support the proposed title- and abstract-level conclusions? 
Typically, the major (title- or abstract-level) conclusions are expected to be supported by at least two experimental systems.

Overall, the data generally supports the conclusions as stated in the title and abstract, but there was significant discussion in this room full of microbiologists-in-training of the merits of in silico discovery of a new virus vs. discovery methods that adhere to Koch’s postulates or Falkow’s molecular Koch’s postulates. There was consensus that isolation of the virus and experimental inoculation of naïve zebrafish and characterization of responses would add great value to the study. RT-qPCR could be performed to detect viral transcripts and ISGs in the same infected fish. Such experiments would provide greater confidence that ZfPV is truly the etiologic agent. Subsequent experiments to fully characterize immune responses in the fish and host genetic determinants are beyond the scope of this first report, but establishing causality is essential.

For Figure 3, there was some discussion about the extent to which the findings of ZfPV RNA in different tissues could be influenced by dissection technique.

In the absence of these experiments, the authors should acknowledge these limitations and mention future research directions in the Discussion.

Are there experiments or analyses that have not been performed but if “true” would disprove the conclusion (sometimes considered a fatal flaw in the study)? In some cases, a reviewer may propose an alternative conclusion and abstract that is clearly defensible with the experiments as presented, and one solution to “completeness” here should always be to temper an abstract or remove a conclusion and to discuss this alternative in the discussion section.

See above re: virus isolation and infection of naïve animals.

Furthermore, there was some concern about independent validation of the RNA-seq experiments. Can ZfPV RNA be recovered from GFP-negative animals? 
The RNA-seq data could be validated on samples of RNA from individual fish using RT-qPCR and virus gene-specific primers. What does it mean if ZfPV can be recovered from GFP-negative animals?

Quality: Reproducibility (1–3 scale) SCORE = 2

Figure by figure, were experiments repeated per a standard of 3 repeats or 5 mice per cohort, etc.?

There was considerable discussion about whether the sample sizes in Figure 4 were sufficient to support the conclusions, but the group was not able to come to consensus on this point.

Is there sufficient raw data presented to assess rigor of the analysis?

Yes, although the uploading of raw data is still in progress at this time. The dataset could be strengthened by making the video publicly available.

Are methods for experimentation and analysis adequately outlined to permit reproducibility?

The bioinformatics methods are clearly described, but the scripts should be made available on GitHub.

There should be a description of how the genome viewer analysis was performed: was it done for a representative set of samples as a sanity check, or was it done systematically?

If a “discovery” dataset is used, has a “validation” cohort been assessed and/or has the issue of false discovery been addressed?

Independent confirmation of RNA-seq data in individual fish by RT-qPCR is lacking.

Quality: Scholarship (1–4 scale but generally not the basis for acceptance or rejection) SCORE = 1

Has the author cited and discussed the merits of the relevant data that would argue against their conclusion?

Yes.

Has the author cited and/or discussed the important works that are consistent with their conclusion and that a reader should be especially familiar when considering the work?

In general, the work of others was appropriately cited in the manuscript, although there was consensus that the Discussion could be strengthened by more clearly describing how this study differs from the findings of the Altan E, et al. paper:

Altan E, Kubiski SV, Boros A, Reuter G, Sadeghi M, Deng X, Creighton EK, Crim MJ, Delwart E. 2019. A Highly Divergent Picornavirus Infecting the Gut Epithelia of Zebrafish (Danio rerio) in Research Institutions Worldwide. Zebrafish 16:291–299. doi:10.1089/zeb.2018.1710

Specific (helpful) comments on grammar, diction, paper structure, or data presentation (e.g., change a graph style or color scheme) go in this section, but scores in this area should not be significant bases for decisions.

Overall, the writing was lean and straightforward, and written in a way to engage a broad audience.

MORE SUBJECTIVE CRITERIA (IMPACT)

Impact: Novelty/Fundamental and Broad Interest (1–4 scale) SCORE = 1

A score here should be accompanied by a statement delineating the most interesting and/or important conceptual finding(s), as they stand right now with the current scope of the paper. A “1” would be expected to be understood for the importance by a layperson but would also be of top interest (have lasting impact) on the field.

Most important finding, easily understandable by a layperson, was the actual discovery of ZfPV.
Likely to improve integrity of future zebrafish infection/innate immunity studies.
Highlights method to discover new viruses in existing datasets. This would be further strengthened by a candid analysis of the limits of in silico discovery methods.

How big of an advance would you consider the findings to be if fully supported but not extended? It would be appropriate to cite literature to provide context for evaluating the advance. However, great care must be taken to avoid exaggerating what is known comparing these findings to the current dogma (see Box 2). 
Citations (figure by figure) are essential here.

Yes, there was broad agreement that the current manuscript is a significant advance without further extension.

Impact: Extensibility (1–4 or N/A scale) SCORE = N/A

Has an initial result (e.g., of a paradigm in a cell line) been extended to be shown (or implicated) to be important in a bigger scheme (e.g., in animals or in a human cohort)?

This criterion is only valuable as a scoring parameter if it is present, indicated by the N/A option if it simply doesn’t apply. The extent to which this is necessary for a result to be considered of value is important. It should be explicitly discussed by a reviewer why it would be required. What work (scope and expected time) and/or discussion would improve this score, and what would this improvement add to the conclusions of the study? Care should be taken to avoid casually suggesting experiments of great cost (e.g., “repeat a mouse-based experiment in humans”) and difficulty that merely confirm but do not extend (see Bad Behaviors, Box 2).

N/A
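
Illustrative note on the Figure 4 statistics comment: for small groups, a non-parametric Wilcoxon rank-sum comparison can be computed exactly by enumeration, with no assumption of normality or equal variances. The following is a minimal Python sketch of that idea, not the authors' analysis. The function names (midranks, rank_sum_test) and the relative ZfPV values are hypothetical placeholders, not data from the manuscript.

```python
from itertools import combinations
from math import comb


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Exact two-sided Wilcoxon rank-sum test by full enumeration.

    Returns (rank sum of group_a, p-value). Only practical for small
    groups, because every possible split of the pooled ranks is visited.
    """
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n = len(group_a), len(group_a) + len(group_b)
    observed = sum(ranks[:n_a])
    expected = n_a * (n + 1) / 2  # mean rank sum under the null hypothesis
    deviation = abs(observed - expected)
    # Count splits whose rank sum is at least as far from the null mean.
    as_extreme = sum(
        1
        for split in combinations(ranks, n_a)
        if abs(sum(split) - expected) >= deviation - 1e-9
    )
    return observed, as_extreme / comb(n, n_a)


# Hypothetical relative ZfPV RNA levels for two groups (placeholders only).
line_a = [1.0, 3.2, 8.5, 12.0]
line_b = [150.0, 420.0, 980.0, 2300.0]
rank_sum, p_value = rank_sum_test(line_a, line_b)
print(f"rank sum = {rank_sum}, exact two-sided p = {p_value:.4f}")
```

With four fully separated values per group, the two-sided p-value is 2/70 (about 0.029), which is the smallest value this test can return at that sample size. This bears on the discussion of whether the sample sizes in Figure 4 are sufficient. Because the test uses ranks only, the result is the same whether or not the values are log-transformed.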