Abstract
Volcanic ash provides information that can help in understanding the
evolution of volcanic activity during the early stages of a crisis, and
possible transitions towards different eruptive styles. Ash consists of
particles from a range of origins in the volcanic system and its
analysis can be indicative of the processes driving activity. However,
classifying ash particles into different types is not straightforward.
Diagnostic observations for particle classification are not standardized
and vary across samples. Here we explore the use of machine learning
(ML) to improve the classification accuracy and reproducibility. We use
a curated database of ash particles (VolcAshDB) to optimize and train
two ML-based models: an Extreme Gradient Boosting (XGBoost) model that
uses the measured physical attributes of the particles, and whose
predictions are interpreted with the SHAP method, and a Vision
Transformer (ViT) that classifies binocular, multi-focused particle
images. We find
that the XGBoost model achieves an overall macro F1-score of 0.77, and
that specific features of color (hue_mean) and texture (correlation)
are the most discriminative between particle types.
Classification using the particle images and the ViT is more accurate
(macro F1-score of 0.93), with performance across eruptive styles
ranging from 0.85 for dome explosions to 0.95 for phreatic and
subplinian events.
Notwithstanding the success of the classification algorithms, the
training dataset is limited in the number of particles, the range of
eruptive styles, and the number of volcanoes. Thus, the algorithms
should be tested further with additional samples, and classification
within a given volcano is likely to be more accurate than across
volcanoes.
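
The macro F1-score used above to report model performance is the
unweighted mean of the per-class F1-scores, so rare particle types count
as much as common ones. The following is a minimal sketch of that metric
in plain Python, not the evaluation code used in this work; the function
name macro_f1 and the class labels in the usage lines are hypothetical
and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1-scores."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative
    scores = []
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical, not taken from VolcAshDB)
true_types = ["juvenile", "juvenile", "lithic", "lithic"]
pred_types = ["juvenile", "lithic", "lithic", "lithic"]
print(round(macro_f1(true_types, pred_types), 3))
```

Because every class carries equal weight, a model that performs poorly on
an under-represented particle type is penalized more under this metric
than under plain accuracy.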