Abstract
Genomics data analysis requires efficient tools to address the vast
amount of data generated by current next-generation sequencing
technologies. Existing k-mer counting tools struggle to balance memory
overhead against the precision of the resulting counts. We designed a
method for counting high-frequency k-mers based on the Space Saving
algorithm and a novel hash table structure, which reduces memory
overhead by 46\% while maintaining high computational efficiency.
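
For orientation, the following is a minimal sketch of the textbook Space Saving algorithm applied to a k-mer stream. It is not the implementation described in this paper: it assumes a plain Python dictionary in place of the novel hash table structure and finds the minimum counter with a linear scan. The names `kmers`, `space_saving`, and `capacity` are hypothetical and chosen only for illustration.

```python
def kmers(seq, k):
    """Yield every overlapping substring of length k from seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def space_saving(stream, capacity):
    """Track at most `capacity` items with approximate counts.

    Returns (counts, errors). counts[x] is an upper bound on the true
    frequency of x, and counts[x] - errors[x] is a lower bound.
    """
    counts = {}
    errors = {}
    for item in stream:
        if item in counts:
            # Already monitored: increment its counter.
            counts[item] += 1
        elif len(counts) < capacity:
            # Free slot: start monitoring with an exact count.
            counts[item] = 1
            errors[item] = 0
        else:
            # Table full: evict a minimum-count item and let the
            # newcomer inherit that count as its possible overestimate.
            # (Linear scan here; an assumption made for brevity.)
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[item] = floor + 1
            errors[item] = floor
    return counts, errors


if __name__ == "__main__":
    counts, errors = space_saving(kmers("ACGACGACGTT", 3), capacity=3)
    for kmer, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(kmer, count, "max overestimate:", errors[kmer])
```

In this sketch the number of counters never exceeds `capacity`, so memory is bounded regardless of how many distinct k-mers appear, at the cost of possibly overestimated counts for items that entered the table by eviction.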