The preliminary timeline of WG activities is defined [http://linked.earth/projects/data-standards/ here] but will evolve as work begins.
== Getting Started with Standards ==
Creating a standard for an entire community can seem like a daunting task. The most obvious questions are "What does a standard actually look like?" and "Where do I start?"
=== Where do I start? ===
The ultimate goal of LinkedEarth is to provide scientists with tools to do better science: allowing them to upload the metadata needed to make their science reproducible, enabling complex queries that easily retrieve datasets, and providing packages in R and Python for analyzing these datasets using the standards developed by the community.
==== Uploading the needed metadata ====
The strength and appeal of paleoclimatology reside in its diversity. The community relies on the work of geochemists, geologists, computer modelers, statisticians, and others to understand past climates. This diversity is reflected in the datasets the community generates, so a one-size-fits-all template for all paleoclimate datasets isn't useful. The LinkedEarth wiki allows each community to store the metadata it needs to represent its data. For instance, the cleaning methodology may be important to the foraminiferal Mg/Ca community but completely irrelevant to the tree-ring community. Therefore, one possible way to create a standard is to ask oneself: "What pieces of metadata would I require to reproduce this particular dataset, and therefore which ones should I provide for my own datasets?"

Some of these metadata will be standard across all archives (e.g., geographic coordinates, publication information). Others will be archive-specific (or even observation-specific). See [[:Category:Marine_Sediment_Working_Group | here]] for the beginning of a discussion on foraminiferal Mg/Ca.
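
As a rough illustration of the split between common and archive-specific metadata, the Python sketch below checks a record against two lists of expected fields. All field names and archive labels here are hypothetical placeholders chosen for the example; the actual lists are for each working group to decide.

```python
# Hypothetical field names: placeholders, not an agreed standard.
COMMON_FIELDS = {"latitude", "longitude", "publication"}

# Hypothetical archive-specific additions, keyed by an illustrative archive label.
ARCHIVE_SPECIFIC_FIELDS = {
    "foraminiferal_mg_ca": {"cleaning_methodology"},
    "tree_ring": set(),
}


def missing_metadata(archive_type, record):
    """Return the sorted names of expected fields that are absent from a record."""
    expected = COMMON_FIELDS | ARCHIVE_SPECIFIC_FIELDS.get(archive_type, set())
    return sorted(expected - set(record))


record = {"latitude": 6.3, "longitude": 125.8, "publication": "placeholder citation"}
print(missing_metadata("tree_ring", record))
print(missing_metadata("foraminiferal_mg_ca", record))
```

The same record is complete for one archive type and incomplete for another, which is the point of letting each community define its own requirements on top of a shared core.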
==== Querying the datasets ====
Another way to think about the essential/recommended/optional metadata is to think about the kind of queries one would want to enable for their research. For instance, [http://linked.earth/wp-content/uploads/2016/01/Khider_AGUFall16.pdf Testing the Millennial-Scale Solar-Climate Connection in the Indo-Pacific Warm Pool] required the following query and associated metadata:

Finally, this problem can be approached from a data analysis point of view: "Given my interest, what information would I need to perform my analysis?"

For instance, if all that is known about a dataset is the geographical coordinates from which the archive was taken, then the only possible analysis is to create a location map of said archive (with or without its nearest neighbor contained in the database). On the other hand, if an age and an inferred variable are contained within the PaleoDataTable, the resulting time series can be used for correlation analysis and spectral analysis. However, without the raw data, it would be impossible to recalibrate the record using updated techniques, which further limits its usefulness.

For the Holocene study, the dataset had to contain the raw radiocarbon measurements for use with Bchron.
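
The reasoning above, that the contents of a dataset determine which analyses are possible, can be sketched in a few lines of Python. This is only an illustration: the dictionary keys are hypothetical and do not correspond to an agreed schema or to any existing LinkedEarth package.

```python
def possible_analyses(dataset):
    """List the analyses a dataset supports, given which pieces it contains.

    `dataset` is a plain dict with hypothetical keys:
    "coordinates", "age", "inferred_variable", "raw_data".
    """
    analyses = []
    if dataset.get("coordinates") is not None:
        analyses.append("location map")
    has_series = (
        dataset.get("age") is not None
        and dataset.get("inferred_variable") is not None
    )
    if has_series:
        analyses += ["correlation analysis", "spectral analysis"]
    if dataset.get("raw_data") is not None:
        analyses.append("recalibration")
    return analyses


# A dataset with only coordinates supports nothing beyond a map.
print(possible_analyses({"coordinates": (0.0, 120.0)}))
```

Working backward from the analyses a community wants to run to the fields this kind of check would need is one way to decide which metadata should be essential.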
== Prior discussions on Standards Developments in Paleoclimatology ==
We welcome summaries of any prior discussions on data standards you may have had at meetings or workshops. To contribute, either link to a document that you have uploaded to the wiki or create a new page and list it below. Sign and date your entry using the following notation: <pre>~~~~</pre>
- [[media: Reporting_Standards_for_Paleoceanographic_PMIP3_Dec2013.docx | PMIP3 Workshop in Corvallis]] (December 2013) [[User:Khider|Deborah Khider]] ([[User talk:Khider|talk]]) 17:00, 30 September 2016 (PDT)
In the LinkedEarth context, a working group (WG) is a self-organized coalition of knowledgeable experts who elaborate and discuss the components of a data standard for their specific sub-field.
Working groups can be created by any member of the LinkedEarth community to discuss topics of interest.
Each WG page needs the following elements:
- A list of the group members
- Polls on specific questions
- A log of decisions made (e.g., a table with dates and content of discussions)
On your profile page, you have the option of joining working groups simply by inputting their names in the "Working Groups" tab.
The only WGs to pop up are those currently in existence. If the one you want doesn't exist yet, please create it.
Note that joining a working group adds all pages belonging to that working group category to your watchlist.
Joining a working group doesn't commit you to participating in every discussion on the subject. Rather, it lends support to the need for a standard and shows confidence in the community.
== How to create a WG? ==
To create a WG, create a new page and tag it as a subcategory of the WG category. The list of working groups will be shown automatically at the bottom of this page (as subcategories of the Working_Groups category).
When you want to categorize any wiki page as belonging to that working group, add the working group category to the wiki page.
== Working Group Charter ==
=== Membership ===
Membership of each WG is open to all wiki users, with some monitoring from WG coordinators. The coordinators are community members of recognized expertise who have volunteered to serve in this position. Their role is to organize discussions and follow the progress of other WGs so that the standards stay as uniform as possible.
=== Aims ===
The primary goal of WGs is to standardize how paleoclimate data are described and shared.
The process is defined here and the specific charge emanating from the Paleoclimate Data Standards workshop is outlined on this page. Each WG should take up these questions as relevant to its archive of choice; to ensure that common solutions are designed, WG coordinators will regularly check in with each other and with LinkedEarth leadership.
=== Decision-process ===
Each WG operates by consensus. Wiki polls track the answers to specific questions.
=== Timeline ===
The preliminary timeline of WG activities is defined here but will evolve as work begins.
== Cross-Archive Metadata ==
This section is dedicated to the discussion of metadata that apply to all paleoclimate archives. For discussion of a specific archive, see the corresponding archive working group.
=== How this section is organized ===
The section is structured around polls to gather community votes on whether a particular metadata property should be considered essential, required, or desired metadata (for a definition of these terms, see this page).
The polling is divided between two "types" of datasets: New Datasets and Legacy Datasets.
The subsections are organized around the categories available on the wiki.
Click on the links to see the current definition of each term. If you disagree with a definition, please use the Discussion page for that term.
If you think a metadata property is missing, add a poll directly in the appropriate section (don't forget to add the poll for both new and legacy datasets!).
If the page freezes and you're unable to vote, refresh the page!
Just a few questions to get you started. These concern both legacy and new datasets:
In answering this question, remember that all the datasets on the wiki are public. The LinkedEarth team is considering adding a feature that would let a dataset be uploaded and kept private until publication, for use with code and software, but this feature is not yet available.
Should the LinkedEarth wiki contain datasets:
There were 12 votes since the poll was created on 19:18, 6 October 2016.
What type of datasets should be considered "legacy"?
There were 6 votes since the poll was created on 21:26, 7 November 2016.
This poll was also posted on Twitter on November 7, 2016.
Comments:
Simon Goring: "I would think that legacy data would be any data contributed that doesn't meet up with "new" standards." Twitter link
Kaustubh Thirumalai: "In the current stage of LinkedEarth - pre-digitization IMO. e.g. "Legacy Seismic" = data on tapes" Twitter link
PaleoData refers to the "y-axis" of the dataset and contains the information about paleoenvironment. Before answering the poll below, head to the Discussion page for an example of a dataset not containing y-axis information.
For new records, should the presence of PaleoData information be:
There were 5 votes since the poll was created on 19:02, 6 October 2016.
=== Depth vs age ===
The information obtained from the archive is taken at a certain "depth".
For new datasets, should PaleoData depth be:
There were 5 votes since the poll was created on 20:30, 6 October 2016.
For new datasets, should the PaleoData contain information about how this "depth" was transformed to age and, if possible, give an estimate of age at each horizon in the archive?
There were 5 votes since the poll was created on 19:41, 6 October 2016.
ChronData refers to the "x-axis" of the dataset and contains the information about the age model. Before answering the polls below, head to the Discussion page for an example of a dataset not containing x-axis information.
For new datasets, should the presence of ChronData information be (assuming there is a chronology, maybe in addition to depth, in the PaleoData):
There were 6 votes since the poll was created on 20:49, 6 October 2016.
For new datasets, should the presence of ChronData information be (assuming no chronology in the PaleoData):
There were 6 votes since the poll was created on 20:49, 6 October 2016.
PaleoData refers to the "y-axis" of the dataset and contains the information about paleoenvironment. Before answering the poll below, head to the Discussion page for an example of a dataset not containing y-axis information.
For legacy datasets, should the presence of PaleoData information be:
There were 4 votes since the poll was created on 20:34, 6 October 2016.
=== Depth vs age ===
The information obtained from the archive is taken at a certain "depth".
For legacy datasets, should PaleoData depth be:
There were 4 votes since the poll was created on 20:34, 6 October 2016.
For legacy datasets, should the PaleoData contain information about how this "depth" was transformed to age and, if possible, give an estimate of age at each horizon in the archive?
There were 4 votes since the poll was created on 20:34, 6 October 2016.
ChronData refers to the "x-axis" of the dataset and contains the information about the age model. Before answering the polls below, head to the Discussion page for an example of a dataset not containing x-axis information.
For legacy datasets, should the presence of ChronData information be (assuming there is a chronology, maybe in addition to depth, in the PaleoData):
There were 3 votes since the poll was created on 20:37, 6 October 2016.
For legacy datasets, should the presence of ChronData information be (assuming no chronology in the PaleoData):
There were 3 votes since the poll was created on 20:38, 6 October 2016.