Bioinformatics: Basecalling, fastq clean up, and demultiplexing
The sequence signal data in multi-fast5 format were basecalled using
Guppy version 3.4.4. The resulting fastq files were adapter trimmed,
low-quality read ends were trimmed (-q 10), and reads shorter than 30
base pairs were removed using cutadapt version 2.5 (Martin, 2011). Cleaned
fastq files were mapped against the human reference genome assembly GRCh37
using BWA version 0.7.17 (Li, 2013), and sample targets were extracted
from the resulting BAM file using SAMtools version 0.1.19 (Li et al.,
2009).
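
The trimming, mapping, and target-extraction steps described above could be chained roughly as in the following sketch. It is illustrative only and is not the exact command set used in this study. The adapter sequence (ADAPTER_SEQ), the file names, and the target region are hypothetical placeholders, and the choice of the "bwa mem" algorithm is an assumption. The reference is assumed to have been indexed beforehand with "bwa index". The "samtools sort" call uses the output-prefix syntax of the 0.1.x series.

```python
import subprocess


def build_commands(raw_fastq, reference_fasta, region,
                   prefix="sample", adapter="ADAPTER_SEQ"):
    """Return a list of (argv, stdout_path) steps; stdout_path is None
    when the tool writes its own output file.

    All file names, the adapter, and the region are hypothetical.
    """
    trimmed = f"{prefix}.trimmed.fastq"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_prefix = f"{prefix}.sorted"
    sorted_bam = f"{sorted_prefix}.bam"
    target_bam = f"{prefix}.target.bam"
    return [
        # Adapter trimming, 3' quality trimming (-q 10), and removal of
        # reads shorter than 30 bases (-m 30).
        (["cutadapt", "-a", adapter, "-q", "10", "-m", "30",
          "-o", trimmed, raw_fastq], None),
        # Mapping; "bwa mem" is assumed here and writes SAM to stdout.
        (["bwa", "mem", reference_fasta, trimmed], sam),
        # SAM to BAM (-S marks SAM input in SAMtools 0.1.x).
        (["samtools", "view", "-bS", sam], bam),
        # SAMtools 0.1.x sort takes an output prefix, not a file name.
        (["samtools", "sort", bam, sorted_prefix], None),
        (["samtools", "index", sorted_bam], None),
        # Extract reads overlapping the (hypothetical) target region.
        (["samtools", "view", "-b", sorted_bam, region], target_bam),
    ]


def run_pipeline(steps):
    """Run each step in order, redirecting stdout where requested."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "wb") as handle:
                subprocess.run(argv, check=True, stdout=handle)
```

Calling run_pipeline(build_commands("reads.fastq", "GRCh37.fa", "7:100-200")) would execute the steps in order, provided the three tools are installed and the region name matches the contig naming of the reference in use.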