Valerie C Hendrix

and 13 more

Diverse, complex data are a significant component of Earth Science’s “big data” challenge. Some Earth Science data, like remote sensing observations, are well understood, uniformly structured, and supported by well-developed standards that are adopted broadly within the scientific community. Unfortunately, for other types of Earth Science data, like ecological, geochemical, and hydrological observations, few standards exist and their adoption is limited. The synthesis challenge is compounded in interdisciplinary projects in which many disciplines, each with its own culture, must synthesize data to solve cutting-edge research questions. Data synthesis for research analysis is a common, resource-intensive bottleneck in data management workflows. We have faced this challenge in several U.S. Department of Energy research projects in which data synthesis is essential to addressing the science. These projects include AmeriFlux, Next Generation Ecosystem Experiment (NGEE) - Tropics, Watershed Function Science Focus Area, Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE), and a DOE Early Career project using data-driven approaches to predict water quality. In these projects, we have taken a range of approaches to support (meta)data synthesis. At one end of the spectrum, data providers apply well-defined standards or reporting formats before sharing their data; at the other, data users apply standards after data acquisition. As these projects have evolved, we have gained insights into the advantages and disadvantages of each approach, how project history and resources led to the choice of approach, and how the approaches enabled data harmonization. In this talk, we discuss the pros and cons of the various approaches and present flexible applications of standards to support diverse needs when dealing with complex data.

Emily Robles

and 6 more

Quality metadata and data are critical to advancing science and preserving data for long-term use. The Next Generation Ecosystem Experiments (NGEE) Tropics project, funded by the U.S. Department of Energy, generates and utilizes ecological, hydrological, and meteorological data from tropical forests for scientific analysis and model parameterization. The project’s data team manages an archive for users to internally curate and publish data with a digital object identifier (DOI). A key focus of our project is to ensure NGEE Tropics data can be interpreted and utilized by current and future research teams. Reaching this goal, however, requires that project members be educated about data curation, prioritize it, and participate in it. We have taken an interdisciplinary approach involving domain and data scientists to create a process that makes it easy for scientists to curate high-quality data packages for archival. First, the NGEE Tropics Archive and metadata reporting templates (FRAMES) were designed using user-experience research methods to incorporate user feedback through interviews and surveys. Upon submission of data packages, thorough checks are performed to ensure quality expectations are met. Each dataset is curated individually, and feedback is provided directly to scientists to identify the optimal data organization for their packages. The data team also provides training to project members using presentations, tutorials, and 1:1 training. As a result of our efforts, package and file-level metadata reporting to the NGEE Tropics archive fits within the existing workflow of scientists, establishing data curation as a core aspect of research. By educating the NGEE Tropics team through integration and communication, we have enabled the production of quality data packages that are findable, accessible, and usable by any member of the public.
This work will enhance the legacy of NGEE Tropics and provide a lasting resource for the tropical research community.