Abstract
High noise levels in teleseismic recordings of moderate-size earthquakes
have limited our ability to resolve their complexity.
We develop a machine-learning-based algorithm to separate noise and
earthquake signals that overlap in frequency. The multi-task
encoder-decoder model is built around a kernel pre-trained on local
(i.e., short-distance) earthquake data
\cite{yin_multitask_2022} and is modified by
continued learning with high-quality teleseismic data. We denoise
teleseismic P waves of deep $M_w$ 5.0+ earthquakes and use the denoised P waves
to estimate the source characteristics of these understudied earthquakes
with reduced uncertainties. We find that moment scales with duration as
$M_0\simeq \tau^{4.16}$, which results in a strong scaling of stress drop and
radiated energy with magnitude ($\sigma\simeq M_0^{0.2}$ and
$E_R \simeq M_0^{1.23}$).
The median radiation efficiency is 5\%, a low value
compared to shallow earthquakes. Overall, we show that deep earthquakes
have weak rupture directivity and few subevents, suggesting that a simple
model of a circular crack with radial rupture propagation is
appropriate. When accounting for their respective scaling with
earthquake size, we find no systematic depth variations of duration,
stress drop, or radiated energy within the 100--700~km depth range. Our
study supports the findings of
\citeA{poli_global_2016} using twice as many earthquakes and
extending the analysis to lower magnitudes.