Abstract
Formal data citation is a growing practice increasingly required by
scientific journals. Roughly a decade ago, the Federation of Earth
Science Information Partners (ESIP) began developing formal guidelines
for data citation, including acknowledgement of authors and archives and
careful use of persistent identifiers (PIDs). Many Earth science data
centers now provide formal citation text and PIDs for their data sets,
typically Digital Object Identifiers (DOIs). A central purpose of data
citation (amongst many) is to aid scientific reproducibility through
direct, unambiguous reference to the precise data used in a particular
study, i.e., to aid provenance tracking. How has that worked in
practice? ESIP is now revising and updating its
guidelines and seeks to ensure that data citation meets its stated
purpose. This presentation explores whether and how formal citation and
the use of PIDs for data sets have improved the tracking of data
provenance. For example, is there some commonality in the nature and
granularity of objects that are assigned PIDs? We review how the
guidelines are being revised to further enhance the transparency and
reusability of data.