ARM Data Center: Data Discovery Updates, New Data Retrieval and Distribution Options, and Use of the Digital Object Identifier in ARM

 

Authors

Ranjeet Devarakonda — Oak Ridge National Laboratory
Richard T. Cederwall — Oak Ridge National Laboratory
Kyle K Dumas — Oak Ridge National Laboratory
Kyle K Dumas (Quicklooks) — Oak Ridge National Laboratory
Alka Singh — Oak Ridge National Laboratory
Anthony Clodfelter — Oak Ridge National Laboratory
Giri Prakash — Oak Ridge National Laboratory

Category

ARM infrastructure

Description

The ARM Data Center (ADC: http://www.archive.arm.gov/), located at Oak Ridge National Laboratory (ORNL), provides end-to-end capabilities for ARM’s multi-dimensional climate data, including storing, managing, and distributing the data. Each month, ADC receives tens of thousands of user download requests totaling 20 to 25 terabytes (TB). The popularity of the ARM data set results from many characteristics, foremost among them the careful consideration of community needs in terms of both data content and accessibility. Fundamental to this is adherence to data archive and distribution best practices: providing open, standardized, and self-describing data, which in turn enables specialized tools and web services. In this poster, we discuss how ADC currently catalogs, distributes, and visualizes data, and we describe some new and creative techniques that ADC is exploring, including OPeNDAP, THREDDS, Dropbox, and Globus Online, for cataloging and distributing such large volumes of multi-dimensional observations and model data. We also discuss the importance and advantages of storing TB-scale model data in an appropriate storage facility and in appropriate formats, and how ADC uses standard data access protocols and community-recognized formats to deal with continually growing data volumes.

The ARM Data Center has established a data citation strategy based on Digital Object Identifiers (DOIs) to facilitate citing continuous and diverse ARM data sets in scientific articles and other papers. Accurate documentation allows other researchers to reproduce results published by users of ARM data. The ARM Data Center continues to track citation requirements and improvements outside ARM to help make our citation strategy more efficient for users. DOI procedures within ARM are being extended to capture data levels, allowing more complete attribution to developers and PIs. In addition, a consistent strategy is being developed to incorporate instrument handbooks and other key documentation needed to understand the ARM data being used. Guidance on the use of DOIs is incorporated in the new ARM website design, and additional information is being developed to assist users. This poster gives an update on the use of DOIs for ARM data, where key information can be found within ARM, DOI applications outside ARM (such as DataCite), and planned improvements. The poster is also an opportunity for the ARM Data Center to obtain user feedback.
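
As a rough illustration of how a DOI-based data citation might be assembled, the Python sketch below builds a resolver link and a citation string from a small metadata record. The record fields, the citation layout, and the DOI value are hypothetical placeholders chosen for this example; they are not ARM's or DataCite's actual schema or recommended citation format. The only external fact assumed is that a DOI can be resolved by appending it to https://doi.org/.

```python
from dataclasses import dataclass
from typing import Tuple
from urllib.parse import quote


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata fields for one data set (illustration only)."""
    creators: Tuple[str, ...]
    year: int
    title: str
    publisher: str
    doi: str


def doi_url(doi: str) -> str:
    """Return a resolver URL for a DOI using the public doi.org proxy."""
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(doi.strip(), safe="/")


def format_citation(rec: DatasetRecord) -> str:
    """Assemble a simple citation string (layout is an assumption)."""
    authors = ", ".join(rec.creators)
    return f"{authors} ({rec.year}): {rec.title}. {rec.publisher}. {doi_url(rec.doi)}"


# Placeholder record; the DOI below is not a real identifier.
example = DatasetRecord(
    creators=("Doe, J.", "Roe, R."),
    year=2016,
    title="Example data set",
    publisher="Example Data Center",
    doi="10.1234/example-dataset",
)

if __name__ == "__main__":
    print(format_citation(example))
```

Keeping the DOI as a separate field, rather than embedding a full URL in the record, means the same identifier can be reused in whatever citation layout a journal requires.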