Fast-tracking bespoke DNA reference database generation from museum
collections for biomonitoring and conservation
Abstract
Despite recent advances in high-throughput DNA sequencing technologies,
a lack of locally relevant DNA reference databases may limit the
potential for DNA-based monitoring of biodiversity for conservation and
biosecurity applications. Museums and national collections represent a
compelling source of authoritatively identified genetic material for DNA
database development, yet obtaining DNA barcodes from long-stored
specimens may be difficult due to sample degradation. We demonstrate a
sensitive and efficient laboratory and bioinformatic process for
generating DNA barcodes from hundreds of invertebrate specimens
simultaneously via the Illumina MiSeq system. Using this process, we
recovered full-length (334) or partial (105) COI barcodes from 439 of
450 (98 %) national collection-held invertebrate specimens. This
included full-length barcodes from 146 specimens that gave low DNA
yields and no visible PCR bands, and that produced as few as
a single sequence per specimen, demonstrating the high sensitivity of the
process. In many cases, the most abundant sequences per
specimen were not the correct barcodes, necessitating the development of
a taxonomy-informed process for identifying correct sequences among the
sequencing output. The recovery of only partial barcodes for some taxa
indicates a need to refine certain PCR primers. Nonetheless, our
approach represents a highly sensitive, accurate, and efficient method
for targeted reference database generation, providing a foundation for
DNA-based assessments and monitoring of biodiversity.
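
To illustrate why the most abundant sequence cannot simply be taken as the barcode, the following minimal Python sketch selects, for one specimen, the best-supported candidate sequence whose taxonomic match agrees with the specimen's museum identification. It is an illustration under stated assumptions, not the pipeline used in this study: the Candidate record, its matched_taxon field (taken here to be the taxon of a candidate's best reference-database match), and the pick_barcode function are all hypothetical names introduced for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One candidate sequence recovered for a specimen (hypothetical record)."""
    sequence: str       # candidate barcode sequence
    reads: int          # number of reads supporting this sequence
    matched_taxon: str  # assumed: taxon of the best reference match, e.g. an order


def pick_barcode(candidates: list[Candidate], expected_taxon: str) -> Optional[Candidate]:
    """Return the best-supported candidate consistent with the expected taxon.

    Candidates whose taxonomic match disagrees with the specimen's
    identification (e.g. contaminants) are ignored, however abundant.
    Returns None when no candidate is consistent.
    """
    consistent = [
        c for c in candidates
        if c.matched_taxon.lower() == expected_taxon.lower()
    ]
    if not consistent:
        return None
    # Among taxonomically consistent candidates, prefer the most reads.
    return max(consistent, key=lambda c: c.reads)


# Toy data: the most abundant sequence is a contaminant, and the
# taxonomically consistent sequence is supported by a single read.
toy = [
    Candidate("AAAA", 900, "Fungi"),
    Candidate("CCCC", 1, "Coleoptera"),
]
chosen = pick_barcode(toy, "Coleoptera")
```

In this toy case the single-read sequence is selected over the abundant contaminant, mirroring the situation described above in which a specimen may yield as few as one correct sequence. A real workflow would also need to handle ties, partial-length sequences, and uncertain reference matches, none of which are modelled here.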