
Clustering Analysis Methods for GNSS Observations: A Data-Driven Approach to Identifying California’s Major Faults
  • Robert A Granat,
  • Andrea Donnellan,
  • Michael B. Heflin,
  • Gregory Lyzenga,
  • Margaret T Glasscoe,
  • Jay Parker,
  • Marlon Pierce,
  • Jun Wang,
  • John B. Rundle,
  • Lisa Grant Ludwig
Robert A Granat
City College of New York
Andrea Donnellan
Jet Propulsion Laboratory, California Institute of Technology & University of Southern California

Corresponding Author: [email protected]

Michael B. Heflin
Jet Propulsion Laboratory, CALTECH
Gregory Lyzenga
Jet Propulsion Laboratory, California Institute of Technology
Margaret T Glasscoe
Jet Propulsion Laboratory
Jay Parker
Jet Propulsion Lab (NASA)
Marlon Pierce
Indiana University Bloomington
Jun Wang
Indiana University Bloomington
John B. Rundle
University of California - Davis
Lisa Grant Ludwig
University of California, Irvine


We present a data-driven approach to clustering, or grouping, Global Navigation Satellite System (GNSS) stations according to their observed velocities, displacements, or other selected characteristics. Clustering GNSS stations can reveal useful scientific information and is a necessary initial step in other analyses, such as detecting aseismic transient signals (Granat et al., 2013). The features used for clustering can be selected and may include some subset of displacement or velocity components, uncertainty estimates, station location, and other relevant information. Based on those selections, the clustering procedure autonomously groups the GNSS stations according to a selected clustering method. We have implemented this approach as a Python application, allowing us to draw upon the full range of open-source clustering methods available in Python’s scikit-learn package (Pedregosa et al., 2011). The application returns the stations labeled by group as a table and a color-coded KML file. It is designed to work with the GNSS information available from GeoGateway (Heflin et al., 2020; Donnellan et al., 2021) but is easily extensible. We focused on California and western Nevada. The results show partitions that follow faults or geologic boundaries, including partitions associated with recent large earthquakes and post-seismic motion. The San Andreas fault system is most prominent, reflecting Pacific-North American plate boundary motion. Deformation reflected as class boundaries is distributed north and south of the central California creeping section. For most models the southernmost San Andreas fault connects with the Eastern California Shear Zone (ECSZ) rather than continuing through the San Gorgonio Pass.
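The workflow described in the abstract (select features, then group stations with a scikit-learn clustering method, then report a label per station) could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' application: the station IDs and velocities are synthetic, the `cluster_stations` helper and `FEATURE_COLUMNS` mapping are hypothetical names, and k-means is only one of the scikit-learn methods that could be selected.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical station records: (id, lat, lon, east velocity, north velocity).
# Velocities are in mm/yr. All values are synthetic, for illustration only.
STATIONS = [
    ("STA1", 34.0, -118.5, -25.0, 24.0),
    ("STA2", 34.2, -118.9, -26.1, 25.2),
    ("STA3", 33.8, -118.2, -24.3, 23.5),
    ("STA4", 35.5, -116.0, -3.0, 2.5),
    ("STA5", 35.9, -115.6, -2.2, 1.8),
    ("STA6", 36.1, -115.9, -2.8, 2.1),
]

# Assumed mapping from feature name to column index in each record.
FEATURE_COLUMNS = {"lat": 1, "lon": 2, "ve": 3, "vn": 4}


def cluster_stations(stations, features=("ve", "vn"), n_clusters=2, seed=0):
    """Group stations by the selected features; returns {station_id: label}."""
    cols = [FEATURE_COLUMNS[name] for name in features]
    X = np.array([[row[c] for c in cols] for row in stations], dtype=float)
    # Standardize so features with different units contribute comparably.
    X = StandardScaler().fit_transform(X)
    # k-means is an assumed choice; any scikit-learn clusterer with
    # fit_predict could be substituted here.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    cluster_ids = model.fit_predict(X)
    return {row[0]: int(cid) for row, cid in zip(stations, cluster_ids)}


labels = cluster_stations(STATIONS)

# Emit the result as a simple tab-separated table of station and group.
for station_id, label in labels.items():
    print(f"{station_id}\t{label}")
```

Passing `features=("lat", "lon", "ve", "vn")` would add station location to the feature set. The numeric label assigned to each group is arbitrary; only the grouping itself is meaningful.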
Published Nov 2021 in Earth and Space Science, volume 8, issue 11. DOI: 10.1029/2021EA001680