Difference between revisions of "Creating a LiPD file"
Revision as of 21:17, 11 April 2017
The most straightforward way to upload a dataset onto the wiki is to first create a LiPD file and upload it directly.
What is LiPD?
LiPD (Linked Paleo Data) is a convenient format for storing and exchanging paleoclimate data, and it provides the backbone of the LinkedEarth edifice. LiPD is closely aligned with the LinkedEarth Ontology; changes in one are mirrored in the other.
How to read a LiPD file?
LiPD was designed to capture much richer sets of (meta)data than ASCII or Excel files, and to have a fixed backbone around which scientific codes can be built. There is a price to pay for this power: LiPD is undoubtedly more difficult to interact with than a plain text file. Although it is possible to unzip a LiPD file and navigate through the native JSON-LD and csv files, this is not the best way to harness the power of LiPD files.
The easiest way to interact with a LiPD file is by using this very wiki, which allows you to navigate the hierarchical structure of the file easily.
In addition, we have developed several utilities to read and write LiPD files in Matlab, Python, and R.
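For readers who nevertheless want to peek inside an archive, the unzip-and-navigate route mentioned above can be sketched with the Python standard library alone. This is a minimal illustration, not the LiPD Utilities: the helper names, the `.jsonld` and `.csv` extensions used to pick out members, and the contents of the demo archive are all assumptions made for the example.

```python
import io
import json
import zipfile


def summarize_lipd(archive):
    """List the JSON-LD and csv members of a zip archive (path or file object)."""
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        # Assumption: metadata members end in .jsonld, data tables in .csv.
        jsonld = [n for n in names if n.endswith(".jsonld")]
        csvs = [n for n in names if n.endswith(".csv")]
        # Parse the first JSON-LD member, if any, to expose its top-level keys.
        metadata = json.loads(zf.read(jsonld[0]).decode("utf-8")) if jsonld else {}
    return {"jsonld": jsonld, "csv": csvs, "top_level_keys": sorted(metadata)}


def make_demo_archive():
    """Build a tiny in-memory stand-in archive; file names and fields are invented."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo/metadata.jsonld", json.dumps({"name": "demo", "kind": "coral"}))
        zf.writestr("demo/table0.csv", "1950,0.1\n1951,0.2\n")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(summarize_lipd(make_demo_archive()))
```

To try it on a real file, pass its path to `summarize_lipd` in place of the demo archive. The wiki and the dedicated utilities remain the more convenient route for anything beyond a quick look.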
What can I do with a LiPD file?
LiPD was designed to facilitate coding around paleoclimate data. We have already developed software in R and Python to analyze and visualize paleoclimate data:
In addition, CSciBox (an integrated system for age-model reconstruction) makes use of LiPD.
How do I get my data into LiPD?
As of April 2017, the most efficient way to get your paleoclimate dataset into LiPD format is to fill out our Excel Template [link] and use the Python LiPD Utilities to convert the template into a LiPD file. Make sure you are using the latest version of the template for compatibility.
By the end of 2017, a web-based interface should be able to automate a lot of the manual steps.
General Guidelines
What goes into a LiPD file?
This is a trickier question than it first appears. Consider two extremes: (1) every little data table could have its own LiPD file; (2) we could try to squeeze all the paleo data generated thus far into one giant LiPD file. Where is the happy medium? There are two ways to think about this:
Study Level
All data and metadata that are part of the same study should be placed in the same LiPD file. There are exceptions to this rule of thumb. For instance, if the study involves two physical samples in drastically different locations (i.e., different regimes), then each physical sample and its associated data and metadata should be placed in a separate LiPD file. In other words, if the data from each physical sample can be reused on their own in another study, then each sample should be placed in its own LiPD file.
Signal Level
All the paleo data recording the same environmental signal (i.e., having the same Category:Interpretation_©) should be placed in the same LiPD file. Again, there are exceptions, such as studies done at the same site by different groups at very different points in time. A follow-up study in which an investigator goes back to the same site to expand the dataset (e.g., a longer core or higher-resolution sampling) probably warrants a new LiPD file, unless the results lead to no new science, in which case it qualifies more as a "replication" study.
Examples:
All data and metadata should be in the same file for the following studies:
- Lake cores from the same lake
- Speleothems from the same cave
- Ice cores from the same hole
- Marine sediments from the same hole (IODP) or the same location (multi-core, piston core, or gravity core)
- Corals from the same head
- Trees from the same geographical region
- Lake cores from different lakes but with the same climate interpretation. For instance, a regional composite.
- Speleothems from different caves with the same climatology
Data and metadata should be in different files for the following studies:
- Speleothems from different caves in different monsoon regimes
- Lake cores from different lakes with different catchment basins
- Marine sediments with different oceanographic regimes
- Corals from different islands
On the whole: there are no hard and fast rules, and feedback is welcome.
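The signal-level rule of thumb above can be sketched as a simple grouping step. This is only an illustration under stated assumptions: the `sample` and `signal` fields, the function name, and the example records are hypothetical, and deciding which samples actually share a signal remains a scientific judgment that no code can make for you.

```python
from collections import defaultdict


def group_into_files(records):
    """Group the samples of one study by the environmental signal they record.

    Each key of the result stands for one prospective LiPD file.
    """
    files = defaultdict(list)
    for rec in records:
        files[rec["signal"]].append(rec["sample"])
    return dict(files)


# Hypothetical study: three speleothems, two monsoon regimes.
study = [
    {"sample": "cave A stalagmite", "signal": "regime 1"},
    {"sample": "cave B stalagmite", "signal": "regime 1"},
    {"sample": "cave C stalagmite", "signal": "regime 2"},
]

if __name__ == "__main__":
    for signal, samples in group_into_files(study).items():
        print(signal, "->", samples)
```

Here the two speleothems sharing a regime end up in one file and the third gets its own, mirroring the speleothem examples in the lists above.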
What constitutes a measurement table?
Simply put, one table per physical sample. So if a study uses two speleothems, the measurements for each sample should be reported in its own table.
A good rule of thumb is to ask: how are the data going to be reused? For instance, if the radiocarbon chronologies for different cores are meant to be independent of each other, then each physical sample should get its own measurement table. On the other hand, if a composite depth is used, then the measurements for all the physical samples can be placed in the same table.
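As a rough sketch of this rule, the function below splits measurement rows into one table per physical sample, or keeps them together when a composite depth is used. The row layout, the `sample` field, the `composite_depth` flag, and the `"composite"` table name are assumptions made for illustration; they are not part of the LiPD specification.

```python
def build_measurement_tables(rows, composite_depth=False):
    """Return a dict mapping a table name to the list of rows it holds."""
    if composite_depth:
        # All samples share one depth scale, so a single table is enough.
        return {"composite": list(rows)}
    tables = {}
    for row in rows:
        # Otherwise each physical sample gets its own table.
        tables.setdefault(row["sample"], []).append(row)
    return tables


# Hypothetical measurements from two cores.
rows = [
    {"sample": "core 1", "depth_cm": 10, "value": 0.3},
    {"sample": "core 2", "depth_cm": 10, "value": 0.5},
    {"sample": "core 1", "depth_cm": 20, "value": 0.4},
]

if __name__ == "__main__":
    print(sorted(build_measurement_tables(rows)))
    print(sorted(build_measurement_tables(rows, composite_depth=True)))
```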