Paleoclimate Data Standards

From Linked Earth Wiki
Revision as of 05:32, 22 February 2017 by Jeg (Talk | contribs)

Background

Why do we need standards?

This is a bit like asking why we need water. Modern life would simply be unlivable without standards. Imagine needing a separate browser for each web page you visit, or a separate power-transmission system for every appliance you use! You only have to travel to a country that uses a different electric plug to appreciate this. In science, the ultimate objective of a standard is to make data understandable by others (including machines), and the derived analyses reproducible. Thus, a key objective of LinkedEarth is to promote the development of a community standard for paleoclimate data.

What is a standard?

EarthCube defines a standard as follows:

a public specification documenting some practice or technology that is adopted and used by a community. [..] There is a continuum starting with any documented practice in some community.  If lots of people use a particular documented practice it could be adopted as a best practice. If almost everyone uses some documented practice, then it is a de facto standard.

Notice the emphasis on community and on practice. If only one person uses a technical specification, it's not a standard. If it's voted on but not applied in practice, it's worthless as well. Thus, the objective of this EarthCube activity is to propose a standard with broad community appeal and adoption.

Prior Work

Despite some ad-hoc gatherings among communities of interest over many years, until recently there had never been a concerted effort to produce a standard applicable to all paleoclimate observations. Given the increased importance of synthesis work (e.g. PAGES2k, Shakun et al 2012, Marcott et al 2013, MARGO, others), it is increasingly important that a common solution be found.

The Linked Paleo Data (LiPD) format embodies one aspect of such a solution: it offers a container that can wrap tightly around a wide variety of datasets, and thus provides a stepping stone for this effort. However, LiPD's flexibility means that all manner of information can be encapsulated in such a file, regardless of whether this aligns with community best practices. It is thus necessary for the community to decide on such practices. In other words, while LiPD provides a field-tested answer to the question "how should paleoclimate data be stored?", it says nothing about what should be stored: that decision is up to the community.

The 2016 workshop on paleoclimate data standards serves as a stepping stone to initiate a broader process of community engagement and feedback elicitation, with the goal of generating such a community-vetted standard. The workshop identified a need to delineate a set of essential, recommended and desired properties for each dataset. Three additional themes emerged:

  • Cross-Archive Standards
  • Archive-specific standards
  • Legacy vs Modern datasets


A consensus emerged that these levels (essential, recommended and desired) are archive-specific, as what is needed to intelligently re-use an annually resolved marine record could be quite different from what is needed to intelligently re-use an ice core record, for instance.
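To make the idea of archive-specific levels concrete, here is a minimal sketch in Python of how the three levels could be combined with cross-archive and archive-specific requirements. Every property name and archive entry below is a hypothetical placeholder chosen for illustration; none of them reflects a decision by the workshop, the working groups, or the LiPD specification.

```python
# Illustrative sketch only: all property names are hypothetical placeholders,
# not properties agreed upon by the community.
LEVELS = ("essential", "recommended", "desired")

# Properties assumed to apply to every dataset, whatever the archive.
CROSS_ARCHIVE = {
    "essential": {"archive_type", "location", "units"},
    "recommended": {"publication"},
    "desired": {"raw_measurements"},
}

# Extra properties that a given archive working group might require.
ARCHIVE_SPECIFIC = {
    "coral": {
        "essential": {"species"},
        "recommended": {"sampling_resolution"},
    },
    "ice core": {
        "essential": {"depth_age_model"},
        "desired": {"drilling_method"},
    },
}


def requirements_for(archive_type):
    """Merge cross-archive and archive-specific properties for each level."""
    specific = ARCHIVE_SPECIFIC.get(archive_type, {})
    return {
        level: CROSS_ARCHIVE.get(level, set()) | specific.get(level, set())
        for level in LEVELS
    }


def missing_properties(metadata):
    """Return, for each level, the sorted property names absent from metadata."""
    required = requirements_for(metadata.get("archive_type"))
    present = set(metadata)
    return {level: sorted(required[level] - present) for level in LEVELS}


# A made-up coral record that satisfies the essential level only.
record = {
    "archive_type": "coral",
    "location": (0.0, 0.0),
    "units": "permil",
    "species": "Porites",
}
report = missing_properties(record)
```

In this sketch, a record with no missing essential properties would be usable as is, while the recommended and desired lists tell the data contributor what else would make the record easier to re-use.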

Process for achieving a paleoclimate data standard

Attendees of the 2016 PDS workshop proposed that archive-centric working groups (WGs; self-assembled coalitions of knowledgeable experts) would be best positioned to elaborate and discuss the components of a data standard for their specific sub-field of paleoclimatology. It is also critical to ensure interoperability between standards to enable longitudinal (multiproxy) investigations.

This process contributes to the data stewardship initiative of our PAGES/Future Earth partners. Therefore, we are working together with PAGES to reach out to the broadest cross-section of paleoscientists and invite them to contribute to the process. The end goal is a standard to be precisely documented and adopted by LinkedEarth and PAGES. The standard will be implemented in all LinkedEarth activities and proposed for adoption to EarthCube, the Research Data Alliance, the Federation of Earth Science Information Partners, NOAA WDS-Paleo and Pangaea.

Standard Publication

Once the community has spoken on these matters, the decisions will be summarized in a publication.

A formal standard is a specification of some practice that is adopted by a recognized standards body. The set of formal standards and set of de facto standards intersect, but are not the same; some formal standards are not very widely used. Nonetheless, because of the community participation and rigor required to formalize the standard we recognize that they merit careful evaluation. [1]

In the internet age, a standard can be a web-based document that details all the specifications pertaining to a technical matter. However, to encourage participation and promote transparency, the LinkedEarth team decided that the standard should be published in a crowd-sourced peer-reviewed publication. Anyone contributing to the discussion on developing standards for paleoclimatology (whether during our workshop on paleoclimate data standards, on the wiki, or by teleconference) will be included in the list of authors.

The scholarly product will be a peer-reviewed publication presenting the standard and detailing the decisions that led to it. Pursuant to PAGES policies, authorship will be extremely inclusive and acknowledge all scientific input into the process.