Comparison of short and long-read metabarcoding sequencing: challenges
and solutions for plastid read removal and microbial community
exploration of seaweed samples
Erwan Legeay
Genomer Platform, FR2424, Station Biologique de Roscoff, Sorbonne Université, CNRS, 29680 Roscoff, France & Adaptation and Diversity in the Marine Environment (UMR 7144), Station Biologique de Roscoff, Sorbonne Université, CNRS, 29680 Roscoff, France
Abstract
Short-read metabarcoding analysis is the gold standard for accessing
partial 16S and ITS genes with high read quality. With the advent of
long-read sequencing, the amplification of full-length target genes is
possible but with low read accuracy. Moreover, the amplification of 16S
rDNA genes in seaweed or plant samples results in a large proportion of
plastid reads, which are directly or indirectly derived from
cyanobacteria. Primers designed not to amplify plastid sequences are
available for short-read sequencing, while Oxford Nanopore Technologies
(ONT) offers adaptive sampling, a unique way to remove reads in real time. In
this study, we compare three options to address the plastid read issue:
removing plastid reads with adaptive sampling, using optimized primers
with Illumina MiSeq technology, and sequencing large numbers of reads
with Illumina NovaSeq technology with universal primers. We showed that
adaptive sampling using default settings of the MinKNOW software was
ineffective for plastid depletion. We also demonstrated with a mock
community that the SAMBA workflow provided the most accurate taxonomic
assignment at the bacterial genus level compared to the IDTAXA and
KRAKEN2 pipelines, but many false positives were generated at the species
level. Although NovaSeq sequencing with universal primers stood out for
studying the algal bacterial community due to its deep coverage, the
inclusion of eukaryotes and bacteria in the same sequencing run, and the
low error rate, the combination of Illumina and ONT sequencing helped us
explore the fungal diversity and allowed for the retrieval of taxonomic
information for genera poorly represented in the sequence databases.