Choosing a metabarcoding approach
There is no perfect metabarcoding sample-labelling approach, and
choosing the right one for a given study or lab should be an informed
trade-off of pros and cons balanced against the needs of the study.
Within metabarcoding studies, those needs can range widely.
Metabarcoding studies range from those that look for one or a few taxa
within sample units (e.g. Bohmann et al. 2018) to studies that look for
many taxa within sample units (e.g. Seersholm et al. 2018), and sample
numbers can range from tens (e.g. Elbrecht et al. 2017), to hundreds
(e.g. Rodgers et al. 2017; Galan et al. 2017) or even thousands
(e.g. Schnell et al. 2018; Ji et al. 2020). The research
question and experimental set-up can require taxonomic identifications
to be made within individual samples
(e.g. Coghlan et al. 2012), while in other studies the goal is taxonomic
identification from pools of individual samples or from a number of
samples within, for example, a geographic location
(e.g. Grealy et al. 2016; Schnell et al. 2018). Sample types
can range from bulk specimen samples consisting of high quality DNA from
pools of entire organisms
(e.g. Tang et al. 2015) to environmental samples in which DNA from target
organisms can be fragmented and scarce (e.g. Stat et al. 2017).
Furthermore, studies differ in how many metabarcoding primer sets are
used, from only one (e.g. Bohmann et al. 2011; Drinkwater et al. 2018)
to several (e.g. De Barba et al. 2014; Drummond et al. 2015; Zhang et
al. 2018). The budget for a metabarcoding project will also differ
between studies, as will whether the metabarcoding primers are to be
used in future studies. Lastly, some applications of metabarcoding,
such as biosecurity or forensics, will necessitate a ‘high bar’ for
data fidelity and controls.
Many combinations of the above metabarcoding study parameters exist,
and as this article shows, the significance of the pros and cons of the
metabarcoding approaches will differ among them. For example, while the
tagged PCR approach (Fig. 2D) might be more sensitive to low-abundance
templates, the one-step PCR approach offers a quick turnaround
(Fig. 2B). However, the one-step approach comes at the cost of buying
long fusion primers and is only worthwhile if the metabarcoding primers
are to be used again.
When choosing a metabarcoding approach, the need for future multiplexing
of the metabarcoding primers should be considered, that is, the use of
several metabarcoding primer sets that target different markers and
taxonomic groups in individual PCR reactions to simultaneously screen
for many taxonomic groups within the same reaction, thereby keeping
costs and workload to a minimum (e.g. De Barba et al. 2014). For this,
the nucleotide-tagged primers in the tagged PCR approach should
theoretically be the most applicable, whereas the long additions to the
metabarcoding primers in the one-step and two-step PCR approaches will
be far less conducive to multiplexing due to the extensive sequence
homology.
Lastly, whatever metabarcoding strategy is chosen, it should be clear
from the present article that one should not change workflows within an
experiment. Moreover, there is some justified concern within the
metabarcoding community that the nuances in metabarcoding workflows
make inter-lab comparison difficult
(e.g. Murray et al. 2015; Zizka et al. 2019; Blackman et al. 2019).
Perspectives
All metabarcoding strategies can generate robust data. However, like all
laboratory workflows, if they are not executed well or are inappropriate
for the application, they may lead to flawed data. We stress that
although PCR is a relatively simple method, metabarcoding is not, and
there are many traps in metabarcoding workflows that can trip up new
users. Here, we have presented an
overview of the three main metabarcoding strategies for assessment of
biodiversity on Illumina sequencing platforms, and the downstream
consequences for the resulting data with regard to cross-contamination
risk, PCR amplification efficiency, chimera formation, tag-jumping,
index-misassignment, cost, and workload. In doing so we wish to enable
researchers and practitioners to make an informed choice of which
metabarcoding strategy is best suited for their specific study.
Ultimately, the aim is to avoid generating unusable data and wasting a
considerable amount of time and money or, even worse, drawing wrong
conclusions from flawed data.
Metabarcoding of environmental DNA has some commonalities with the field
of ancient DNA, in which target DNA of low quality and quantity is also
sought amongst non-target (and potentially more abundant) templates.
In the early days of ancient DNA studies, PCR-based techniques
(including amplifying already amplified DNA to enhance signals) were
used, which caused authentication issues, as amplification of modern
templates was mistaken for true ancient signals. This was followed by
urgent calls for precautions to ensure reliability and authenticity of
ancient DNA sequences
(e.g. Cooper & Poinar 2000; Pääbo et al. 2004). As in the field of
ancient DNA, the take-home message should be that metabarcoding
is becoming a self-critical and self-correcting field in which technical
reliability is promoted and rewarded, with the long-term benefit of
uptake by stakeholders who will employ metabarcoding for environmental
management. Reputational setbacks resulting from practitioners not
executing their metabarcoding workflows well will likely resonate
across a variety of biomonitoring, forensic and biosecurity applications.
We thus stress the importance of being informed about the pros and cons
of the chosen metabarcoding approach with regard to cross-contamination
risk, PCR amplification efficiency, chimera formation, tag-jumping,
index-misassignment, cost, and workload, and of including appropriate
quality assurance and quality control measures. This will help ensure
that the generated data support informed data analysis and
interpretation. We therefore advocate that metabarcoding publications
should include detailed information about the metabarcoding strategy and
how its challenges have been taken into account in the laboratory, data
processing, and interpretation of results. Furthermore, it may be
appropriate to eventually develop a set of metabarcoding guidelines
similar to the MIQE guidelines for qPCR (Bustin et al. 2009),
ultimately further increasing the power and reliability of
metabarcoding.