What If We Reconsidered How We Ask Scientists to Share Their Data: When
FAIR Meets Crowd-Sourcing and Nudge Theory
Abstract
Journals, funding agencies, and researchers increasingly expect
manuscripts to include links to shared research data.
Effective data sharing requires that data be findable, accessible,
interoperable, and reusable (FAIR), and is thus predicated on
establishing a common understanding of how to communicate: data exchange
standards, common data formats, controlled vocabularies, and a communal
data repository. Yet when conducting research, we still communicate in
shorthand that works for everyone on the team who understands our
context, but whose meaning is lost when data is shared without that context.
“Water temperature” means only one thing to my research team, yet can
mean dozens of things outside of that context. Data sharing is thus an
exercise in sharing not just the data, which is typically readily
available, but also the context of that data, which requires additional
effort. This effort is one of the barriers to sharing data. We’ll
describe an alternative model for accepting data into a repository:
immediate ingestion of data regardless of its metadata quality, followed
by behavioural nudges and crowd-sourcing features that ensure this data
meets appropriate standards prior to publication. We’ll show a
work-in-progress prototype software tool that supports this alternative
model, capable of accepting a research data set and standardizing it to
use Climate and Forecast (CF) conventions and ISO 8601 dates.
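
To make the idea concrete, here is a minimal sketch of the kind of check such a tool might run after ingestion. It is not the prototype's actual code. The lookup table, the list of date layouts, and the function names `to_iso8601` and `review_column` are all assumptions made for illustration; the candidate strings are real CF standard names, but the mapping from informal labels to them is hypothetical.

```python
from datetime import datetime

# Hypothetical lookup from informal column labels to candidate CF standard
# names. The names are real CF standard names; the table is illustrative only.
CF_CANDIDATES = {
    "water temperature": ["sea_water_temperature", "sea_surface_temperature"],
    "air temp": ["air_temperature"],
}

# Date layouts the sketch tries, in order. This is an assumption about the
# incoming data; note that day/month order cannot be inferred from the value.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%Y%m%d"]


def to_iso8601(raw):
    """Return the date as an ISO 8601 string, or None if no layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def review_column(label):
    """Classify a column label as resolved, ambiguous, or unknown."""
    candidates = CF_CANDIDATES.get(label.strip().lower(), [])
    if len(candidates) == 1:
        return {"label": label, "status": "resolved",
                "standard_name": candidates[0]}
    if candidates:
        # More than one plausible meaning: a point where a nudge or a
        # crowd-sourced suggestion could ask for the missing context.
        return {"label": label, "status": "ambiguous",
                "candidates": candidates}
    return {"label": label, "status": "unknown"}
```

In this sketch nothing is rejected at ingestion. A label such as "Water Temperature" is simply flagged as ambiguous, and an unparseable date yields `None`, so the missing context can be requested later rather than demanded up front.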