Enhancing debris flow warning via machine learning feature reduction and
model selection
Abstract
Machine learning can improve the accuracy of identifying mass movements
in seismic signals and extend early warning times. However, we lack a
thorough understanding of which seismic features are effective and of the
limitations of different machine learning models, especially for
debris flow warning. Here, we investigate the importance of seismic
features for binary debris flow classification using two
ensemble models: Random Forest (RF) and eXtreme Gradient Boosting (XGB).
We find that an established approach to training machine
learning models for debris flow classification, based on more than
seventy seismic signal features, may be affected by redundant input
information. These seismic features are derived from physical and
statistical knowledge of impact sources and are grouped into waveform,
spectrum, spectrogram, and network sets. Our results show that models
trained on only six selected seismic features perform similarly on the
binary debris flow classification task to published benchmark results
obtained with seventy features. We also test whether a model designed to
capture patterns in sequential data, rather than only the information
within a single window as the ensemble models do, performs better: the
Long Short-Term Memory (LSTM) algorithm does not improve binary debris
flow classification over RF and XGB. For the debris flow alarm task,
however, the LSTM model predicts debris flow initiation more consistently
and generates fewer false warnings. Our proposed framework simplifies
seismic signal-driven early warning for debris flows and provides an
appropriate workflow for identifying other mass movements.
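
The feature reduction idea summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic stand-ins generated with scikit-learn (seventy columns, six of them informative), the ranking uses the impurity-based importances of a Random Forest, which is only one possible importance measure, and the function name compare_full_vs_reduced and all parameter values are hypothetical. The sketch trains on all features, keeps the six highest-ranked ones, retrains, and reports the F1 score of both models on held-out samples.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split


def compare_full_vs_reduced(n_keep=6, seed=0):
    """Train an RF on all features, then on the n_keep most important ones."""
    # Synthetic stand-in for per-window seismic features (NOT real data):
    # 70 columns, of which 6 are informative and 20 are linear combinations.
    X, y = make_classification(
        n_samples=2000, n_features=70, n_informative=6,
        n_redundant=20, random_state=seed,
    )
    # Random split for illustration only; real seismic windows are ordered
    # in time and would need an event-wise or chronological split.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Model trained on the full feature set.
    full = RandomForestClassifier(n_estimators=200, random_state=seed)
    full.fit(X_tr, y_tr)

    # Rank features by impurity-based importance and keep the top n_keep.
    top_idx = np.argsort(full.feature_importances_)[::-1][:n_keep]

    # Model retrained on the reduced feature set only.
    reduced = RandomForestClassifier(n_estimators=200, random_state=seed)
    reduced.fit(X_tr[:, top_idx], y_tr)

    f1_full = f1_score(y_te, full.predict(X_te))
    f1_reduced = f1_score(y_te, reduced.predict(X_te[:, top_idx]))
    return top_idx, f1_full, f1_reduced


top_idx, f1_full, f1_reduced = compare_full_vs_reduced()
print("selected feature columns:", sorted(top_idx.tolist()))
print(f"F1 with 70 features: {f1_full:.3f}")
print(f"F1 with {len(top_idx)} features: {f1_reduced:.3f}")
```

The same comparison could be repeated with an XGB classifier in place of the Random Forest. Any agreement between the two scores on synthetic data says nothing about real seismic records; the sketch only shows the shape of the workflow.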