
wholeskim: Utilizing genome skims for taxonomically annotating ancient DNA metagenomes
  • Lucas Elliott,
  • Frédéric Boyer,
  • Téo Lemane,
  • Inger Alsos,
  • Eric Coissac
Lucas Elliott
UiT The Arctic University of Norway, Tromsø, Norway

Corresponding Author: [email protected]

Frédéric Boyer
Université Grenoble Alpes
Téo Lemane
CEA Genoscope National Sequencing Center
Inger Alsos
University Centre in Svalbard
Eric Coissac
Université Grenoble Alpes

Abstract

Inferring community composition from shotgun sequencing of environmental DNA depends heavily on the completeness of the reference databases used to assign taxonomic information, as well as on the pipeline used. While the number of complete, fully assembled reference genomes is increasing rapidly, their taxonomic coverage is generally too sparse to build reference databases that span all or most of the target taxa. Low-coverage whole-genome sequencing, or genome skimming, provides a cost-effective and scalable alternative source of genome-wide information in the interim. Because a genome skim lacks the coverage needed to assemble large contigs of nuclear DNA, much of its utility for taxonomic annotation lies in its short-read form. However, previous methods have not been able to fully leverage the data in this format. We demonstrate the utility of wholeskim, a pipeline for indexing the k-mers present in genome skims and subsequently querying these indices with short DNA reads. Wholeskim expands on the functionality of kmindex, a tool that uses Bloom filters to efficiently index and query billions of k-mers. Applied to a collection of thousands of plant genome skims, wholeskim is the only software able to index and query the skims in their unassembled form. We also explore the effects of the taxonomic and genomic completeness of the reference database on the accuracy and sensitivity of read assignment.
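As a rough illustration of the general idea summarized in the abstract (indexing the k-mers of unassembled skim reads in a Bloom filter, then querying the index with a short read), the following Python sketch may help. It is not wholeskim or kmindex code: the names `BloomFilter`, `index_skim`, and `shared_fraction`, the filter size, the number of hash functions, and the scoring by fraction of shared k-mers are all assumptions made for this example. It also ignores reverse complements for brevity.

```python
import hashlib


def kmers(seq, k):
    """Yield each k-mer of a DNA string, skipping k-mers with non-ACGT bases."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


class BloomFilter:
    """Toy Bloom filter (hypothetical helper, not the kmindex data structure)."""

    def __init__(self, n_bits, n_hashes):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting BLAKE2b with the hash number.
        for seed in range(self.n_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May return a false positive, but never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def index_skim(reads, k, n_bits=1 << 20, n_hashes=3):
    """Index every k-mer of a skim's unassembled reads (assumed parameters)."""
    bf = BloomFilter(n_bits, n_hashes)
    for read in reads:
        for kmer in kmers(read, k):
            bf.add(kmer)
    return bf


def shared_fraction(read, bf, k):
    """Fraction of the query read's k-mers reported present in the index."""
    ks = list(kmers(read, k))
    if not ks:
        return 0.0
    return sum(km in bf for km in ks) / len(ks)


# Made-up reads standing in for one taxon's genome skim.
skim_reads = ["ACGTACGTTAGC", "TTAGCCGATNAC"]
index = index_skim(skim_reads, k=5)
# Every k-mer of this query occurs in the first skim read, so the fraction is 1.0.
print(shared_fraction("GTACGTTA", index, k=5))
```

In a sketch like this, a query read would be scored against one such index per reference skim, and the per-skim fractions compared to decide on an assignment; how wholeskim actually scores and assigns reads is not specified in the abstract.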
27 Aug 2024: Submitted to Molecular Ecology Resources
29 Aug 2024: Submission Checks Completed
29 Aug 2024: Assigned to Editor
29 Aug 2024: Review(s) Completed, Editorial Evaluation Pending
09 Sep 2024: Reviewer(s) Assigned