Genomic Observatories: a framework for harmonised high-throughput
barcode sequencing of island arthropods
The biodiversity, ecology, and evolution of island arthropod communities
can be studied at unprecedented scales and resolution through the
individual or joint application of (i) wocDNA metabarcoding, (ii)
barcode reference libraries, (iii) multiplex barcoding and (iv) image
analyses. Harmonisation across the first three approaches can also
provide a common data currency, facilitating comparisons and
synthetic analyses across independent studies. By incorporating the
universally accepted arthropod barcode region of the mitochondrial
cytochrome oxidase subunit I (COI) gene into wocDNA metabarcoding
(Andújar et al., 2018; Elbrecht et al., 2019), the COI barcode region
can act as a directly comparable species tag across studies,
transcending potential taxonomic assignment errors within individual
studies. The Genomic Observatory concept, within which HTS serves as a
core tool for biodiversity assessment (Arribas et al., 2021a), provides
a solid foundation for implementing genome-based inventory and
monitoring of insular arthropod biodiversity.
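
As a minimal sketch of the "species tag" idea, the snippet below derives an identifier from the barcode sequence itself, so that the same haplotype receives the same tag in any study regardless of the taxonomic name attached to it. The function name `sequence_tag`, the choice of a SHA-1 digest, and the toy records are illustrative assumptions, not part of any published workflow; it also assumes that sequences from different studies cover exactly the same region and are trimmed identically.

```python
import hashlib


def sequence_tag(seq: str) -> str:
    """Return an identifier that depends only on the barcode sequence.

    Hypothetical helper: whitespace is removed and case is normalised so that
    formatting differences between studies do not change the tag.
    """
    normalised = "".join(seq.split()).upper()
    return hashlib.sha1(normalised.encode("ascii")).hexdigest()


# Toy records (invented for illustration): the same sequence carries
# different taxonomic labels in two independent studies.
study_a = {"label": "morphospecies 12", "seq": "ACTTTATATTTTATTTTTGG"}
study_b = {"label": "Coleoptera indet.", "seq": "actttatattttatttttgg"}

# The labels disagree, but the sequence-derived tags match.
same_tag = sequence_tag(study_a["seq"]) == sequence_tag(study_b["seq"])
```

Exact matching of this kind is deliberately strict; in practice, sequencing error and differences in amplicon length would need to be handled before tags are compared.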
Harmonised HTS data generation and bioinformatic workflows for general
arthropod inventory and assessment are emerging (e.g. Arribas et al.,
2022; Creedy et al., 2021; Srivathsan et al., 2021). However, further
development is needed to provide an inclusive range of sampling protocols
that can capture important fractions of arthropod biodiversity on islands
(see Montgomery et al., 2021 for a review). For terrestrial fractions of
arthropod biodiversity, these can be developed as submodules within the
recently proposed framework of Arribas et al. (2022), taking advantage
of their proposed downstream submodules for the processing and
sequencing of samples.
In addition to the need for harmonised data generation protocols, there
are other generic obstacles facing Genomic Observatories that must be
addressed to establish an efficient island Genomic Observatories Network. One
important challenge is to ensure that metabarcode data conform to
Findable, Accessible, Interoperable and Reusable (FAIR) Data Principles
(Wilkinson et al., 2016), such that new wocDNA metabarcode and multiplex
barcoding data sets can be cross-referenced to previous work. In the
same way that cross-referencing sequence reads to barcode sequence
repositories can assign taxonomy and clarify species origins, additional
cross-referencing to a metabarcode sequence repository would help to
reveal how community similarity is structured over a range of
spatial scales. The GEOME (Genomic Observatories Metadatabase; Deck et
al., 2017; Riginos et al., 2020) initiative offers a useful
platform for FAIR data archival practices. GEOME also
facilitates DNA data sharing through the deposition of raw genetic data
to the Sequence Read Archive (SRA, www.ncbi.nlm.nih.gov/sra), while
maintaining persistent links to standard-compliant metadata held in the
GEOME database. Achieving seamless cross-referencing among de
novo wocDNA metabarcode sequences, multiplex barcoding sequences and
repositories of both barcode sequences and wocDNA metabarcode sequences
has the potential to dramatically extend the scope and reach of such
data.
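
To illustrate how cross-referencing metabarcode sequences could inform community similarity, the following sketch compares two sets of sequences by exact match and computes a Jaccard index. The function names and the toy sequences are assumptions made for illustration only; a real comparison would rely on harmonised, denoised reads of the same barcode region and would typically draw the sequences from a repository rather than from in-memory sets.

```python
def shared_sequences(site_a, site_b):
    """Return the sequences recorded at both sites (exact matches only)."""
    return set(site_a) & set(site_b)


def jaccard_similarity(site_a, site_b):
    """Jaccard index: shared sequences divided by all distinct sequences."""
    a, b = set(site_a), set(site_b)
    union = a | b
    if not union:
        # Two empty communities: similarity is defined here as 0.0.
        return 0.0
    return len(a & b) / len(union)


# Toy communities (invented sequences, not real barcodes).
site_1 = {"ACGTAC", "ACGTTT", "GGGTAC"}
site_2 = {"ACGTAC", "GGGTAC", "TTTTAC", "CCCTAC"}

shared = shared_sequences(site_1, site_2)       # two sequences in common
similarity = jaccard_similarity(site_1, site_2)  # 2 shared / 5 distinct
```

Repeating such pairwise comparisons across sites sampled at increasing distances is one simple way to examine how similarity changes with spatial scale.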