4.3. Machine learning-based prediction
The masking effect of the abundant bacterial community associated with
the copepod diet and ambient water column should not hinder the
detection of core OTUs, as evidenced by previous studies [1, 2]. The
QIIME2 core_abundance algorithms used in the present study did not
predict a single bacterial s-OTU (data not shown). Hence, we used
machine learning approaches to detect important core sub-OTUs specific
to copepod genera.
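Classifier performance in this section is summarised by the area under
the ROC curve (AUC). As a reminder of what an AUC of 1.00 implies, the
following minimal sketch computes AUC as the fraction of (target genus,
other genus) sample pairs that the classifier ranks correctly. The
function name roc_auc and all scores are hypothetical illustrations and
are not data or code from this study.

```python
from itertools import product


def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for samples of the target copepod genus, 0 for all others.
    scores: classifier score that each sample belongs to the target genus.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores (assumption): not taken from the present analysis.
labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]    # every target sample outranks the rest
imperfect = [0.9, 0.2, 0.7, 0.3, 0.8, 0.1]  # some pairs are ranked wrongly

auc_perfect = roc_auc(labels, perfect)
auc_imperfect = roc_auc(labels, imperfect)
```

An AUC of 1.00 therefore means that every sample of the target genus
received a higher score than every sample of the other genera; any
wrongly ordered pair lowers the value.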
From our SML classifier results, the important s-OTUs predicted in
Calanus spp. and Pleuromamma spp. were found to have high prediction
accuracy (AUC = 1.00). We therefore discuss the important s-OTUs
predicted for these two copepod genera. To begin with, among the
important s-OTUs predicted in Calanus spp. by both SML models (RFC and
GBC), Gammaproteobacteria was the dominant member (14 and 9 s-OTUs from
RFC and GBC, respectively), followed by Alphaproteobacteria with six
and three s-OTUs from RFC and GBC, respectively. This
observation was similar to an earlier study where Gammaproteobacteria
and Alphaproteobacteria were reported as core OTUs in Calanus
finmarchicus [2]. Also, within the
Gammaproteobacteria, seven (RFC) and five (GBC) s-OTUs representing
Acinetobacter (Moraxellaceae) were predicted as important s-OTUs
in the present study. This result was similar to an earlier study in
which Moraxellaceae was reported to be closely associated with
Calanus finmarchicus [10]. Moreover, four
s-OTUs of Acinetobacter (Moraxellaceae) were also reported as
core OTUs in Calanus finmarchicus [2].
In addition, three s-OTUs belonging to Vibrio shilonii were predicted
as important s-OTUs in Calanus spp. by both SML classifiers (RFC and
GBC). Comparably, four OTUs of Vibrionaceae (three OTUs of Vibrio sp.
and one similar to Vibrio harveyi) were observed in Calanus
finmarchicus [2].
In the present SML analysis, one genus, Bradyrhizobium (order
Rhizobiales), was predicted as an important s-OTU in Pleuromamma spp.
by the GBC classifier. Moreover, in the present ANCOM analysis,
Bradyrhizobium was found to be in a high percentile within Pleuromamma
spp. Bradyrhizobium is known to carry the nifH gene and usually occurs
in seawater [50]; the GBC model also predicted this genus as an
important s-OTU in Calanus spp. However, Bradyrhizobiaceae was also
found to be the most abundant OTU, i.e. 79 of the total 137 sequences,
in the negative control of a similar analysis [1]. Therefore, further
investigation is required before a meaningful conclusion can be drawn
about Bradyrhizobium.
Moreover, in a previous study, the order Vibrionales was predicted as a
core bacterial member (based on presence/absence) in Pleuromamma spp.
[1]. The genus Pseudoalteromonas has also been reported to occur in
high abundance in Pleuromamma spp. [11]. In the present analysis,
however, GBC predicted five s-OTUs of Pseudoalteromonas as important
s-OTUs in Pleuromamma spp., whereas RFC predicted two s-OTUs of
Pseudoalteromonas as important s-OTUs in Acartia spp., Calanus spp.,
and Centropages sp. (Figure 4e). This observation is consistent with
reports of Pseudoalteromonas as a constant and stable OTU in Acartia
sp. [38], Calanus sp. [2], and Centropages sp. [10]. It is therefore
not advisable to consider Pseudoalteromonas specific to a single
copepod genus.
In the present study, the GBC model predicted three s-OTUs of
Alteromonas and two s-OTUs of Marinobacter as important s-OTUs in
Pleuromamma spp., and the ANCOM analysis also showed that the
proportion of the genus Marinobacter was high in Pleuromamma spp.
Comparably, both Alteromonas and Marinobacter were reported to appear
commonly in Pleuromamma spp. [11]. Although the abundance of the genus
Sphingomonas was low, it was reported to appear consistently in
Pleuromamma spp. [11], and our analysis predicted this genus as an
important s-OTU of Pleuromamma spp. (GBC) (Figure 4f).
In the present study, the GBC model predicted Limnobacter as an
important s-OTU in Pleuromamma spp., and the ANCOM analysis also
showed that the proportion of the genus Limnobacter was high in
Pleuromamma spp. Moreover, in a previous study, Limnobacter was
reported to occur in high abundance and to be unique to copepods
(Pleuromamma spp.) [11].
Also, the genus Methyloversatilis was reported to be low in abundance
in Pleuromamma spp., whereas the present GBC model predicted this
genus as an important s-OTU in Pleuromamma spp. (Figure 4f).
The order Pseudomonadales was reported as a core member in Pleuromamma
spp. [1]. Our GBC model predicted the bacterial genus Enhydrobacter
(Pseudomonadales) as an important s-OTU in Pleuromamma spp. (Figure
4f). In addition, the ANCOM analysis showed Enhydrobacter to be in a
high percentile in Pleuromamma spp. However, Enhydrobacter was also
reported to be high in proportion in calanoid copepods [6]. Another
important s-OTU predicted in Pleuromamma spp. by our GBC model was
Desulfovibrio, and the ANCOM analysis also showed that the proportion
of the genus Desulfovibrio was high in Pleuromamma spp.
HTCC2207 (Gammaproteobacteria) was predicted as an important s-OTU in
Calanus spp. by both SML models. Our ANCOM analysis also showed
HTCC2207 to be in a high percentile in Calanus spp. HTCC2207 is usually
more abundant in seawater and has been reported to be present in a few
Acartia longiremis, Calanus finmarchicus, and Centropages hamatus with
full guts [10]. Because this bacterium carries a proteorhodopsin gene
and is free-living in the water column [51], its detection in the
copepod gut is probably due to food ingestion.
Sediminibacterium (Chitinophagaceae) was reported to be in low
abundance but regularly present in Pleuromamma spp. [11]. In the
present analysis, however, the RFC model predicted Sediminibacterium
as an important s-OTU in Acartia spp., Calanus spp., and Temora spp.
(Figure 4e and f), whereas the GBC model predicted Sediminibacterium
as an important s-OTU in Acartia spp. and Temora spp. (Figure 4).
Chitinophagaceae was reported to be associated with calanoid copepods
in the North Atlantic Ocean [6]. Earlier studies showed that the genus
Photobacterium (phylum Proteobacteria) was abundant in Pleuromamma
spp. [11], Centropages sp. [11], and Calanus finmarchicus [2]. Herein,
Photobacterium was detected as an important s-OTU in Calanus spp. by
the RFC model only. Furthermore, in the present analysis,
Nitrosopumilus was predicted as an important s-OTU in Acartia spp. and
Temora spp. by both SML models, and this genus was also reported to be
high in percentage in Acartia spp. and Temora spp. [38].
Further, RFC predicted Pelomonas as an important s-OTU in Acartia
spp., Centropages sp., and Calanus spp. However, in an earlier study,
Pelomonas was ruled out from the core OTUs in Calanus spp. [2].
Moreover, GBC predicted two s-OTUs of RS62 and one s-OTU of
Planctomyces as important s-OTUs in Acartia spp. and Temora spp. RS62
belongs to the order Burkholderiales, and although this order was
reported to be abundant, its abundance varied between individual
copepods (Acartia spp. and Temora spp.) [38]. Burkholderiales was also
reported as a main copepod-associated community [9]. In the present
study, Comamonas (order Burkholderiales) was predicted as an important
s-OTU in Acartia spp. and Temora spp. by both SML models.
About 25 taxa detected by the RFC approach were also found to be in a
high percentile in the ANCOM analysis. Among them, eight s-OTUs, i.e.
Anaerospora, Micrococcus, Micrococcus luteus, Vibrio shilonii, and
Methylobacteriaceae, were predicted as important s-OTUs in Calanus
spp. for the first time in our report (Figure 4e). From the 28 taxa
detected by the GBC model, four s-OTUs, i.e. Phaeobacter,
Acinetobacter johnsonii, Vibrio shilonii, and Piscirickettsiaceae,
were predicted as important s-OTUs in Calanus spp. for the first time
in our report (Figure 4f). Also, seven s-OTUs, i.e. Marinobacter,
Limnobacter, Methyloversatilis, Desulfovibrio, Enhydrobacter,
Sphingobium, Alteromonas, and Coriobacteriaceae, were predicted as
important s-OTUs in Pleuromamma spp. for the first time by the GBC
model.
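
The cross-check between classifier-predicted s-OTUs and ANCOM results
used throughout this section amounts to a set comparison. The sketch
below illustrates it for Pleuromamma spp. using only the genera named in
this section; the two sets are illustrative subsets and not the complete
GBC or ANCOM result tables, and the helper split_by_support is a
hypothetical name introduced here.

```python
# Genera named in this section as GBC-important s-OTUs in Pleuromamma spp.
# (illustrative subset, not the full result table).
gbc_important = {
    "Bradyrhizobium", "Pseudoalteromonas", "Alteromonas", "Marinobacter",
    "Sphingomonas", "Limnobacter", "Methyloversatilis", "Enhydrobacter",
    "Desulfovibrio",
}

# Genera named in this section as high in the ANCOM analysis for
# Pleuromamma spp. (illustrative subset).
ancom_high = {
    "Bradyrhizobium", "Marinobacter", "Limnobacter", "Enhydrobacter",
    "Desulfovibrio",
}


def split_by_support(classifier_taxa, ancom_taxa):
    """Separate classifier-important taxa by whether ANCOM also flags them."""
    both = sorted(classifier_taxa & ancom_taxa)
    classifier_only = sorted(classifier_taxa - ancom_taxa)
    return both, classifier_only


both, gbc_only = split_by_support(gbc_important, ancom_high)
print("Supported by GBC and ANCOM:", ", ".join(both))
print("Predicted by GBC only:", ", ".join(gbc_only))
```

Taxa supported by both approaches are arguably the stronger candidates
for genus-specific core s-OTUs, while taxa predicted by a single
classifier only merit closer inspection.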