Abstract
Modern processors provide SIMD (single instruction, multiple data)
instructions, which process 16 bytes or more at once.
Recent advances have leveraged SIMD instructions to accelerate
parsing of common Internet formats such as JSON and base64. The two
major Web browser engines (WebKit and Blink) have adopted SIMD
algorithms for parsing HTML on 64-bit ARM processors. During HTML
parsing, they quickly identify specific characters with a strategy
called vectorized classification. We review their techniques and
compare them with a faster alternative. We measure a 20-fold performance
improvement in HTML scanning compared to traditional methods on recent
ARM processors. Our findings highlight the potential of SIMD-based
algorithms for optimizing Web browser performance.
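
As an illustration of vectorized classification, the following minimal sketch emulates in portable C what a single 16-lane table-lookup instruction (such as NEON `vqtbl1q_u8`) followed by a lane-wise comparison would compute. The target character set ('<', '&', carriage return, and NUL), the table layout, and the names `classify_block` and `find_special` are assumptions made for this example; they are not taken from WebKit, Blink, or the measurements reported here.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed target set: '<' (0x3C), '&' (0x26), '\r' (0x0D), '\0' (0x00).
   Their low nibbles (C, 6, D, 0) are all distinct, so slot i of the table
   can hold the one target byte whose low nibble is i. Unused slots hold
   0x00, whose low nibble (0) differs from their index, so they can never
   match a byte that selects them. */
static const uint8_t LOW_NIBBLE_TABLE[16] = {
    0x00, 0, 0, 0, 0, 0, 0x26, 0, 0, 0, 0, 0, 0x3C, 0x0D, 0, 0
};

/* Classify 16 bytes. Bit i of the result is set when block[i] is a target.
   With real SIMD, the loop body is one table lookup plus one equality
   comparison applied to all 16 lanes at once. */
static uint16_t classify_block(const uint8_t block[16]) {
    uint16_t mask = 0;
    for (int lane = 0; lane < 16; lane++) {
        uint8_t b = block[lane];
        uint8_t candidate = LOW_NIBBLE_TABLE[b & 0x0F]; /* table lookup */
        if (candidate == b) {                           /* lane-wise compare */
            mask |= (uint16_t)(1u << lane);
        }
    }
    return mask;
}

/* Return the index of the first target byte in data, or len if none. */
static size_t find_special(const uint8_t *data, size_t len) {
    size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        uint16_t mask = classify_block(data + i);
        if (mask != 0) {
            size_t lane = 0;
            while (((mask >> lane) & 1u) == 0) {
                lane++;
            }
            return i + lane;
        }
    }
    /* Scalar tail for the last (len % 16) bytes. */
    for (; i < len; i++) {
        if (LOW_NIBBLE_TABLE[data[i] & 0x0F] == data[i]) {
            return i;
        }
    }
    return len;
}
```

The sketch relies on each target byte having a unique low nibble, which lets one lookup and one comparison replace a chain of per-character tests. A character set without that property would need a different scheme, for example separate lookups on the low and high nibbles.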