Choosing a metabarcoding approach
There is no such thing as a perfect metabarcoding sample-labelling approach, and choosing the right one for a given study or lab should be an informed trade-off in which the pros and cons are weighed against the needs of the study. Within metabarcoding studies, those needs can range widely.
Metabarcoding studies range from those that look for one or a few taxa within sample units (e.g. Bohmann et al. 2018) to studies that look for many taxa within sample units (e.g. Seersholm et al. 2018), and sample numbers can range from tens (e.g. Elbrecht et al. 2017) to hundreds (e.g. Rodgers et al. 2017; Galan et al. 2017) or even thousands (e.g. Schnell et al. 2018; Ji et al. 2020). The research question and experimental set-up can require taxonomic identifications to be made within individual samples (e.g. Coghlan et al. 2012), while in other studies the goal is taxonomic identifications from pools of individual samples or from a number of samples within, for example, a geographic location (e.g. Grealy et al. 2016; Schnell et al. 2018). Sample types can range from bulk specimen samples consisting of high-quality DNA from pools of entire organisms (e.g. Tang et al. 2015) to environmental samples in which DNA from target organisms can be fragmented and scarce (e.g. Stat et al. 2017). Furthermore, studies differ in how many metabarcoding primer sets are used, from only one (e.g. Bohmann et al. 2011; Drinkwater et al. 2018) to several (e.g. De Barba et al. 2014; Drummond et al. 2015; Zhang et al. 2018). The budget for a metabarcoding project will also differ between studies, as will whether the metabarcoding primers are to be used in future studies. Lastly, some applications of metabarcoding, such as biosecurity or forensics, will necessitate a ‘high bar’ for data fidelity and controls.
A multitude of combinations of the above metabarcoding study parameters exist, and as shown in this article, the significance of the pros and cons of the metabarcoding approaches will differ with them. For example, while the tagged PCR approach (Fig. 2D) might be more sensitive to low-abundance templates, the one-step PCR approach offers a quick turnaround (Fig. 2B). However, the one-step approach requires buying long fusion primers, an expense that is only worthwhile if the metabarcoding primers are to be used again.
When choosing a metabarcoding approach, the need for future multiplexing of the metabarcoding primers should be considered. Multiplexing here means using several metabarcoding primer sets that target different markers and taxonomic groups in individual PCR reactions, so that many taxonomic groups are screened simultaneously within the same reaction and costs and workload are kept to a minimum (e.g. De Barba et al. 2014). For this, the nucleotide-tagged primers in the tagged PCR approach should theoretically be the most applicable, whereas the long additions to the metabarcoding primers in the one-step and two-step PCR approaches will be far less conducive to multiplexing due to their extensive sequence homology.
Lastly, whatever metabarcoding strategy is chosen, it should be clear from the present article that one should not change workflows within an experiment. Moreover, there is some justified concern within the metabarcoding community that the nuances in metabarcoding workflows make inter-lab comparison difficult (e.g. Murray et al. 2015; Zizka et al. 2019; Blackman et al. 2019).
Perspectives
All metabarcoding strategies can generate robust data. However, like all laboratory workflows, if they are not executed well or are inappropriate for the application, they may lead to flawed data. We emphasise that although PCR is a relatively simple method, metabarcoding is not, and there are many traps in metabarcoding workflows that can trip up new users. Here, we have presented an overview of the three main metabarcoding strategies for assessment of biodiversity on Illumina sequencing platforms, and the downstream consequences for the resulting data with regard to cross-contamination risk, PCR amplification efficiency, chimera formation, tag-jumping, index misassignment, cost, and workload. In doing so, we wish to enable researchers and practitioners to make an informed choice of which metabarcoding strategy is best suited to their specific study. Ultimately, the aim is to avoid the worst-case scenarios: generating unusable data and wasting a considerable amount of time and money or, even worse, drawing wrong conclusions from flawed data.
Metabarcoding of environmental DNA has some commonalities with the field of ancient DNA, in which target DNA of low quality and quantity is also sought amongst non-target (and potentially more abundant) templates. In the early days of ancient DNA studies, PCR-based techniques (including amplifying already amplified DNA to enhance signals) were used, which caused authentication issues, as amplification of modern templates was mistaken for true ancient signals. This was followed by urgent calls for precautions to ensure the reliability and authenticity of ancient DNA sequences (e.g. Cooper & Poinar 2000; Pääbo et al. 2004). As in the field of ancient DNA, the take-home message should be that metabarcoding is becoming a self-critical and self-correcting field in which technical reliability is promoted and rewarded, with the long-term benefit of uptake by stakeholders who will employ metabarcoding for environmental management. Reputational setbacks resulting from practitioners not executing their metabarcoding workflows well will likely resonate across a variety of biomonitoring, forensic and biosecurity applications.
We thus stress the importance of being informed about the pros and cons of the chosen metabarcoding approach with regard to cross-contamination risk, PCR amplification efficiency, chimera formation, tag-jumping, index misassignment, cost, and workload, and of including appropriate quality assurance and quality control measures. This will help ensure that the generated data facilitate informed data analysis and interpretation. We therefore advocate that metabarcoding publications should include detailed information about the metabarcoding strategy and how its challenges have been taken into account in the laboratory, data processing, and interpretation of results. Furthermore, it may be appropriate to eventually develop a set of metabarcoding guidelines similar to the MIQE guidelines for qPCR (Bustin et al. 2009), ultimately further increasing the power and reliability of metabarcoding.