Robert A Granat

and 9 more

We present a data-driven approach to clustering or grouping Global Navigation Satellite System (GNSS) stations according to their observed velocities, displacements or other selected characteristics. Clustering GNSS stations has the potential to identify useful scientific information, and is a necessary initial step in other analyses, such as detecting aseismic transient signals (Granat et al., 2013). Desired features of the data can be selected for clustering, including some subset of displacement or velocity components, uncertainty estimates, station location, and other relevant information. Based on those selections, the clustering procedure autonomously groups the GNSS stations according to a selected clustering method. We have implemented this approach as a Python application, allowing us to draw upon the full range of open-source clustering methods available in Python’s scikit-learn package (Pedregosa et al., 2011). The application returns the stations labeled by group as a table and as a color-coded KML file. It is designed to work with the GNSS information available from GeoGateway (Heflin et al., 2020; Donnellan et al., 2021) but is easily extensible. We focused on California and western Nevada. The results show partitions that follow faults or geologic boundaries, including those associated with recent large earthquakes and post-seismic motion. The San Andreas fault system is most prominent, reflecting Pacific-North American plate boundary motion. Deformation reflected as class boundaries is distributed north and south of the central California creeping section. For most models the southernmost San Andreas fault connects with the Eastern California Shear Zone (ECSZ) rather than continuing through the San Gorgonio Pass.
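As a rough illustration of the clustering step described above, the following minimal sketch groups stations by their horizontal velocities using scikit-learn's KMeans. It is not the authors' application: the function name `cluster_stations`, the choice of k-means, the feature scaling, and the synthetic velocity values are all assumptions made for the example, and the real tool can use other features and other scikit-learn clustering methods.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_stations(features, n_clusters=2, random_state=0):
    """Group stations by their feature rows (hypothetical helper).

    features: array of shape (n_stations, n_features), e.g. east and
    north velocity per station. Returns one integer label per station.
    """
    # Scale each feature so that no single component dominates the distance.
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    return model.fit_predict(scaled)


# Synthetic (made-up) east/north velocities in mm/yr for two groups of
# stations, standing in for stations on either side of a plate boundary.
rng = np.random.default_rng(0)
block_a = rng.normal(loc=[-20.0, 15.0], scale=0.5, size=(10, 2))
block_b = rng.normal(loc=[-2.0, 1.0], scale=0.5, size=(10, 2))
velocities = np.vstack([block_a, block_b])

labels = cluster_stations(velocities, n_clusters=2)
print(labels)
```

With real data, the feature matrix could also carry vertical velocities, uncertainties, or station coordinates, and the number of clusters would be varied to see which boundaries persist across models.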

John B. Rundle

and 4 more

We propose a new machine learning-based method for nowcasting earthquakes to image the time-dependent earthquake cycle. The result is a timeseries which may correspond to the process of stress accumulation and release. The timeseries is constructed by using Principal Component Analysis of regional seismicity. The patterns are found as eigenvectors of the cross-correlation matrix of a collection of seismicity timeseries on a coarse-grained regional spatial grid (pattern recognition via unsupervised machine learning). The eigenvalues of this matrix represent the relative importance of the various eigenpatterns. Using the eigenvectors and eigenvalues, we then compute the weighted correlation timeseries (WCT) of the regional seismicity. This timeseries has the property that the weighted correlation generally decreases prior to major earthquakes in the region, and increases suddenly just after a major earthquake occurs. As in a previous paper (Rundle and Donnellan, 2020), we find that this method produces a nowcasting timeseries that resembles the hypothesized regional stress accumulation and release process characterizing the earthquake cycle. We then address the problem of whether the timeseries contains information regarding future large earthquakes. For this we compute a Receiver Operating Characteristic and determine the decision thresholds for several future time periods of interest (optimization via supervised machine learning). We find that signals can be detected that can be used to characterize the information content of the timeseries. These signals may be useful in assessing present and near-future seismic hazard.
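The eigenpattern step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `eigenpatterns`, the use of a Pearson correlation matrix across grid cells, and the synthetic event counts are choices made for the example, and the sketch does not reproduce the paper's weighted correlation timeseries or its ROC analysis, whose exact definitions are not given here.

```python
import numpy as np


def eigenpatterns(activity):
    """Eigen-decompose the correlation matrix of gridded seismicity.

    activity: array of shape (n_times, n_cells), one activity timeseries
    per grid cell (hypothetical layout). Every cell is assumed to have
    nonzero variance, otherwise the correlation is undefined.
    Returns eigenvalues in descending order and the matching
    eigenvectors as columns.
    """
    corr = np.corrcoef(activity, rowvar=False)  # (n_cells, n_cells)
    eigvals, eigvecs = np.linalg.eigh(corr)     # symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# Synthetic (made-up) event counts: 200 time bins on a 6-cell grid, with a
# shared component so that the cells are partly correlated.
rng = np.random.default_rng(1)
common = rng.poisson(2.0, size=(200, 1))
activity = rng.poisson(3.0, size=(200, 6)) + common

eigvals, eigvecs = eigenpatterns(activity)

# Fraction of the total correlation structure carried by each eigenpattern.
importance = eigvals / eigvals.sum()
print(importance)
```

Because the diagonal of a correlation matrix is all ones, the eigenvalues sum to the number of grid cells, so `importance` reads directly as the relative weight of each eigenpattern.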