2.5 | Bioinformatics analyses
Single-end reads were assigned to samples based on their unique
barcodes and then truncated by removing the barcode and primer
sequences. Raw reads were quality-filtered with Cutadapt (Martin, 2011)
under specific filtering conditions to obtain high-quality clean reads.
These reads were compared with the reference databases (Silva and
Unite) (Kõljalg et al., 2013; Quast et al., 2013) using the UCHIME
algorithm (Edgar et al., 2011) to detect chimeric sequences, which were
then removed to generate the final clean reads. Sequence analysis was
performed using Uparse software (v7.0.1001) (Edgar, 2013). Sequences
with a similarity higher than 97% were assigned to the same operational
taxonomic unit (OTU).
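
The 97% similarity rule can be illustrated with a minimal sketch. This is not the Uparse implementation: it is a simplified greedy, abundance-sorted clustering in which identity is assumed to be 1 minus the edit distance divided by the longer sequence length, and the function names (edit_distance, identity, cluster_otus) are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Assumed identity measure: 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def cluster_otus(reads, threshold=0.97):
    """Greedy clustering: each unique sequence joins the first centroid
    whose identity exceeds the threshold, otherwise it founds a new OTU.

    Returns a dict mapping centroid sequence -> number of reads in the OTU.
    """
    abundance = Counter(reads)
    # Most abundant sequences are considered first, so they become centroids;
    # ties are broken alphabetically to keep the result deterministic.
    ordered = sorted(abundance, key=lambda s: (-abundance[s], s))
    otus = {}
    for seq in ordered:
        for centroid in otus:
            if identity(seq, centroid) > threshold:
                otus[centroid] += abundance[seq]
                break
        else:
            otus[seq] = abundance[seq]
    return otus
```

Calling cluster_otus on a list of read strings returns one entry per OTU, keyed by its representative (centroid) sequence. The quadratic comparison loop is only practical for small inputs; production tools rely on indexed heuristics instead.
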
A representative sequence for each OTU was selected for further
annotation. Taxonomic information for the bacterial representative
sequences was annotated against the Silva132 SSU rRNA database (Quast
et al., 2013) using the Mothur algorithm. For the fungal representative
sequences, taxonomic information was annotated against the Unite (v7.2)
database (Kõljalg et al., 2013) using the BLAST algorithm as implemented
in QIIME software (version 1.9.1).
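
The barcode-based read assignment at the start of this workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: barcodes are assumed to be of equal length, to sit at the 5' end directly before the primer, and to match exactly. The function name demultiplex and the barcode, primer, and sample names in the usage note are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, primer):
    """Assign reads to samples by 5' barcode, then strip barcode and primer.

    reads    : iterable of read sequences (strings)
    barcodes : dict mapping barcode sequence -> sample name (hypothetical)
    primer   : primer sequence expected directly after the barcode
    Reads that match no barcode + primer prefix are discarded.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            prefix = barcode + primer
            if read.startswith(prefix):
                # Keep only the insert that follows the barcode and primer.
                by_sample[sample].append(read[len(prefix):])
                break  # a read is assigned to at most one sample
    return dict(by_sample)
```

For example, with barcodes = {"AACC": "rhizo", "GGTT": "bulk"} and primer = "GTGCCAGC", a read beginning with "AACCGTGCCAGC" is assigned to "rhizo" with those 12 bases removed. Real data would normally need mismatch-tolerant matching, which this sketch omits.
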
In total, we obtained 947,987 and 958,264 bacterial high-quality
sequences from rhizosphere and non-rhizosphere samples, respectively.
These reads were clustered into 23,160 and 37,278 OTUs, respectively. For fungi, we
obtained 961,649 and 928,290 high-quality sequences from rhizosphere and
non-rhizosphere samples, respectively, which were clustered into 11,695
and 17,424 OTUs (Table S1).