Roland Schweitzer

and 5 more

The Unified Access Framework (UAF) project of the NOAA Global Earth Observation - Integrated Data Environment (GEO-IDE) is an ongoing effort to provide access to NOAA-wide data in a way that is FAIR (findable, accessible, interoperable, reusable) and meets NOAA's Public Access to Research Results (PARR) requirements. The first priority of UAF is to copy success. We recognize that data following the Climate and Forecast (CF) netCDF convention is readily used by working scientists; that THREDDS Data Servers and ERDDAP servers are popular ways to serve such data; and that these servers can be federated and can be interrogated by software to determine whether the data follows the conventions. To build the collection, we construct a master “raw” catalog of candidate data sets from THREDDS servers around NOAA and other organizations. Custom software examines the raw catalog, eliminates large data collections that are not aggregated in time, and organizes the results into a “clean” catalog. ERDDAP then examines the clean catalog to provide ERDDAP GridDAP access and to verify that the data sources follow the CF convention. The gridded data sets are merged into a collection of TableDAP (netCDF Discrete Sampling Geometry) data sources. Currently the UAF ERDDAP server is home to 10,712 data sets. After the UAF ERDDAP server has examined the data collection, a Live Access Server (LAS) is configured to offer data analysis and visualization access to all the data sets. The final piece of the puzzle is to make the data FAIR and to achieve PARR compliance. This requires tools that have been adapted or developed for the purpose. We resurrected the ncISO tool, which can examine the contents of CF netCDF data sources, create ISO metadata, and score the data according to the Unidata Attribute Convention for Data Discovery (ACDD). We can help the centers hosting the data meet their PARR requirements by properly integrating the resulting metadata from ncISO into NOAA’s central data catalog.
We have recently updated the templates used to generate the metadata to ensure they meet the latest ISO and ACDD specifications. Work is underway at NOAA and Unidata to integrate the ncISO code back into the GitHub repository for the THREDDS Data Server, which will bring together two disparate ncISO implementations. UAF is a few people working a few hours a month to maintain a large and useful data collection, and in this talk we’ll tell you how we do it.
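As a rough illustration of the raw-to-clean catalog step, the Python sketch below walks a THREDDS catalog document and keeps only data sets that do not sit in a large directory of individual files. It is not the UAF software: the function name `clean_catalog`, the `MAX_FILES` threshold, and the rule "a container with many leaf data sets is an unaggregated collection" are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Namespace used by THREDDS InvCatalog 1.0 documents.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Hypothetical threshold: a container holding more leaf data sets than this
# is assumed to be a pile of per-time-step files rather than an aggregation.
MAX_FILES = 3


def clean_catalog(node, max_files=MAX_FILES):
    """Return the urlPath of every leaf data set that survives the filter."""
    kept = []
    children = node.findall("t:dataset", NS)
    # A leaf is a data set with an access path and no nested data sets.
    leaves = [d for d in children
              if d.get("urlPath") and not d.findall("t:dataset", NS)]
    # Everything else is treated as a container (its own urlPath is ignored).
    containers = [d for d in children if d not in leaves]
    if len(leaves) <= max_files:
        kept.extend(d.get("urlPath") for d in leaves)
    for container in containers:
        kept.extend(clean_catalog(container, max_files))
    return kept
```

Calling `clean_catalog(ET.fromstring(catalog_xml))` on a catalog document returns the surviving access paths. A real cleaner would also need to follow `catalogRef` links to remote catalogs, which this sketch leaves out.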

Roland Schweitzer

and 3 more

ERDDAP is a data server developed at NOAA Southwest Fisheries Science Center that allows users to easily access and download subsets of gridded and tabular data. At the Scientific Data Integration Group at NOAA’s Pacific Marine Environmental Laboratory, we have focused on ERDDAP as the primary mechanism by which we serve all of our tabular data. Besides offering data access in a variety of useful formats, ERDDAP normalizes the data structure so that all tabular data sets have a simple row-and-column structure. Additionally, time values are normalized to have consistent units across all data sets on the server. The simplified tabular nature of the discrete data in ERDDAP lends itself nicely to use with the Google Visualization API (Google Charts). Charts is an API that renders nearly all of its graphics in the browser client. The basis of all charts is the Google Charts DataTable, which is exactly analogous to a TableDAP table in ERDDAP. Charts are interactive, visually pleasing, free, and easy to use, so connecting ERDDAP data to Google Charts is a natural fit. Although ERDDAP has many output formats, including several different JSON renderings, none fits exactly into the specified Charts data types. Initially, as a result, it was necessary to write custom code (either in the client or on a server between the client and the ERDDAP server) to manipulate the ERDDAP data stream to match the Charts API requirements. However, because Charts has a well-documented specification for a JSON representation of the required data type and because ERDDAP is an open source project, it was possible to extend ERDDAP to support the Charts DataTable as a new data type. Using this new data type, it’s possible to instantiate a Charts DataTable object directly from the ERDDAP URL using a simple Ajax call in the client, which downloads the JSON DataTable representation.
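To make the mismatch between the two JSON shapes concrete, here is a minimal Python sketch of the kind of conversion the custom code had to perform: it turns an ERDDAP `.json` table (with `columnNames`, `columnTypes`, `columnUnits`, and `rows`) into the `cols`/`rows` literal that a Charts DataTable accepts. The function names and the type mapping are assumptions for illustration; this is not the code that was added to ERDDAP.

```python
from datetime import datetime

# ERDDAP column types assumed here to map onto the Charts "number" type.
NUMERIC_TYPES = {"byte", "short", "int", "long", "float", "double",
                 "ubyte", "ushort", "uint", "ulong"}


def to_charts_date(iso_text):
    """Convert '2007-10-04T12:00:00Z' to the Charts 'Date(...)' string.

    Charts uses zero-based months and builds the date in the browser's
    local time zone, so a real client may need to adjust for that.
    Assumes whole seconds; fractional seconds would need another format.
    """
    t = datetime.strptime(iso_text, "%Y-%m-%dT%H:%M:%SZ")
    return f"Date({t.year}, {t.month - 1}, {t.day}, {t.hour}, {t.minute}, {t.second})"


def erddap_to_datatable(erddap_json):
    """Build a Charts DataTable literal from a parsed ERDDAP .json response."""
    table = erddap_json["table"]
    names = table["columnNames"]
    types = table["columnTypes"]
    units = table.get("columnUnits") or [None] * len(names)

    cols, kinds = [], []
    for name, ctype, unit in zip(names, types, units):
        if unit == "UTC":               # assumed marker for ISO 8601 time columns
            kind = "datetime"
        elif ctype in NUMERIC_TYPES:
            kind = "number"
        else:
            kind = "string"
        kinds.append(kind)
        cols.append({"id": name, "label": name, "type": kind})

    rows = []
    for row in table["rows"]:
        cells = []
        for value, kind in zip(row, kinds):
            if kind == "datetime" and value is not None:
                value = to_charts_date(value)
            cells.append({"v": value})  # missing values stay null
        rows.append({"c": cells})
    return {"cols": cols, "rows": rows}
```

Serializing the result with `json.dumps` gives text that can be passed to the `google.visualization.DataTable` constructor in the browser.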
Because the newly added data type is now native to ERDDAP, it is possible to use the default constraint syntax and ERDDAP RESTful services (using the ERDDAP user interface if desired) to build the URL for the exact data subset that is to be visualized by the desired Google chart. In this presentation, we’ll explain what we’ve done to add the JSON data type to ERDDAP and provide many examples of client-side rendering of earth science and engineering data using Google Charts directly from ERDDAP URLs.
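As a sketch of how such a subset URL is put together, the Python function below assembles a TableDAP request from a data set ID, a file type, a variable list, and a set of constraints, percent-encoding the parts that need it. The server address, data set ID, and helper name are hypothetical, and the file type is left as a parameter because the abstract does not name the extension of the new Charts data type; the example uses ERDDAP's existing `.json` type.

```python
from urllib.parse import quote


def tabledap_url(base, dataset_id, file_type, variables, constraints=()):
    """Build an ERDDAP TableDAP URL.

    constraints is an iterable of (variable, operator, value) tuples,
    e.g. ("time", ">=", "2010-01-01T00:00:00Z").
    """
    # Comma-separated list of the columns to return.
    query = ",".join(quote(v, safe="") for v in variables)
    for name, op, value in constraints:
        # Percent-encode everything except '=' in the operator,
        # so that '>=' becomes '%3E='.
        query += ("&" + quote(name, safe="")
                  + quote(op, safe="=")
                  + quote(str(value), safe=""))
    return f"{base.rstrip('/')}/tabledap/{dataset_id}{file_type}?{query}"


# Hypothetical server and data set, shown only to illustrate the shape.
url = tabledap_url(
    "https://example.org/erddap",
    "hypothetical_buoy",
    ".json",
    ["time", "sea_surface_temperature"],
    [("time", ">=", "2010-01-01T00:00:00Z")],
)
```

Swapping the file type for the Charts-oriented one would leave the rest of the URL unchanged, which is what lets the same constraint syntax drive both downloads and charts.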