Difference between revisions of "Best Practices"
m (→Uploading a dataset for the first time on the wiki: update draft) |
m (→Versioning system: fix links) |
Revision as of 19:53, 19 April 2017
By design, the LinkedEarth wiki is a collaborative platform for editing paleoclimate datasets and contributing knowledge about the field. As such, anyone within the LinkedEarth community can edit datasets and most of the pages on this wiki (with the exception of pages marked with a copyright sign; see this page for an explanation). This page is meant as a best-practice guide for creating new pages and modifying existing ones. Specifically, we propose guidelines for:
- Editing existing datasets by third-party contributors
- Naming pages with a unique identifier
- Versioning datasets following changes to model outputs (e.g., inferring new temperatures from existing raw measurements) and changes to the raw measurements.
We expect this guide to be updated often as new datasets are added and needs arise, so please check for updates regularly.
Datasets
This section provides guidelines on creating new datasets and editing existing wiki pages, including datasets used in compilations.
Question | Link to Answer |
---|---|
What constitutes a dataset? | See this page. |
What constitutes a data table? | See this page. |
Updating datasets following a compilation | |
Updating datasets following the creation of a new model output | |
Updating datasets following the creation of new raw measurements | |
New vs legacy datasets
New datasets are datasets that have recently been published and are often contributed by the original contributor of the study or by someone closely associated with the creation of the datasets. This definition also includes older datasets that the PI may have placed on other public databases or has not yet uploaded anywhere. In this instance, the contributor and the LinkedEarth member uploading the dataset may be the same person. Therefore, most of the metadata fields can be filled in by the person who was involved in the study, since they are likely to have the information readily available.
Legacy datasets are datasets that are publicly available (i.e., either on another database or published under U.S. funding) and are contributed by a LinkedEarth member not originally involved in the creation of the dataset. For datasets that are not publicly available (i.e., emailed directly to the LinkedEarth member by the original contributors), we recommend informing the contributors of your intent to upload their dataset on the LinkedEarth wiki.
The guidelines suggested below apply to both new and legacy datasets.
Versioning system
One of the properties of a dataset is the dataset version. In LinkedEarth, the dataset version follows the x.y.z notation where:
- x refers to changes in metadata and data following a publication. Examples of such changes include the creation of a new age model as part of a compilation or comparison or changes in the way a measured variable is calibrated to obtain an inferred variable (i.e. applying a different calibration model).
- y refers to changes to the data following a publication. Examples include adding data further back in time without changing the model underlying the interpretation.
- z refers to changes not associated with a publication, such as typo corrections and the addition of metadata taken either from the publication or from the original contributor of the data (e.g., information from a laboratory notebook).
After the initial upload, set the dataset version to '0.0.0'.
Note: The dataset version is different from the compilation version. The versioning system of each compilation is left at the discretion of the group who created the compilation but should be explained on the compilation page.
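The x.y.z rules above can be sketched in a few lines of Python. This is only an illustration, not part of the LinkedEarth tooling: the function names are hypothetical, and resetting the lower-order counters to 0 when a higher-order counter is incremented is an assumption borrowed from common versioning practice, since this guide does not say whether they reset.

```python
import re

# A dataset version is three non-negative integers separated by dots.
VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")

# Version to set after the initial upload, as stated in this guide.
INITIAL_VERSION = "0.0.0"


def parse_version(version):
    """Split an 'x.y.z' string into a tuple of three integers."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not an x.y.z version: {version!r}")
    return tuple(int(part) for part in match.groups())


def bump_version(version, level):
    """Return the next version for a change at level 'x', 'y' or 'z'.

    ASSUMPTION: lower-order counters reset to 0 when a higher-order
    counter is incremented; the guide itself does not specify this.
    """
    x, y, z = parse_version(version)
    if level == "x":  # metadata and data changed following a publication
        return f"{x + 1}.0.0"
    if level == "y":  # data changed following a publication
        return f"{x}.{y + 1}.0"
    if level == "z":  # change not associated with a publication
        return f"{x}.{y}.{z + 1}"
    raise ValueError(f"level must be 'x', 'y' or 'z', got {level!r}")


print(bump_version(INITIAL_VERSION, "z"))  # typo fixed: 0.0.1
print(bump_version("0.0.1", "y"))          # data extended: 0.1.0
print(bump_version("0.1.3", "x"))          # new calibration model: 1.0.0
```

If your group prefers not to reset the lower-order counters, only the three return statements need to change.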
Uploading a dataset for the first time on the wiki
We strongly recommend first creating a LiPD file rather than entering all the data and metadata from scratch on the wiki. As of April 2017, the most expeditious way to convert your data into the LiPD format is to use our Excel Template (File:LiPDv1.2 template.xlsx) and the Python LiPD Utilities. This guide will assist you in entering the necessary data and metadata information.
Once your dataset is in LiPD format, you can upload it on the wiki. This will automatically create most of the pages. Check that all the information is correct and once satisfied, update the dataset version to '0.0.0'.
If you decide to enter a dataset manually (not recommended):
- Upload your data in csv format using the 'Upload File' link in the sidebar. Make sure you name the files appropriately by referring to the nomenclature section on this page. The wiki will suggest names for you to use.
- Create a new page using the name SiteName.FirstAuthor.PubYear (see the nomenclature section on this page) and set the Category of the new page to Category:Dataset (L). Note: To be able to create a page, you need to enter some text in the WikiText box. You'll be able to delete this extra text from the page after you create it by clicking on edit at the top of the page.
- The wiki will automatically suggest standard properties. Answer as many as possible. Note: If the answer to a Property results in the creation of a new class (i.e., the box doesn't specify text or number), then you'll be essentially creating a new wiki page. Follow our nomenclature. If you make a typo, just fixing the typo in the link will not automatically redirect the page. The best approach is to rename the landing page.
Changes to a dataset already on the wiki
For existing datasets, we recommend updating the data and metadata directly on the wiki rather than uploading a new LiPD file.
All changes to a dataset after the initial upload require a change in the version of the file as outlined below. If you are planning to make a series of updates over the course of several days as part of the same work, only update the dataset version once you're through with all the changes.
Original contributors
You can update your dataset at any time on the wiki. Just follow the versioning and nomenclature rules.
- To update data:
If you spot a mistake in the csv file whose correction does not add a variable (and therefore a column) to the file, follow these steps:
- Go to this page and search for the name of the csv file you need to update.
- Download the contributed csv file onto your computer by right-clicking on the name.
- Make the necessary corrections to the file and save it using the same file name.
- To re-upload to the wiki, go back to the file page from which you originally downloaded the file.
- Click on Upload a new version of this file at the bottom of the page.
Note: Only the original contributor of the data and the person who uploaded the dataset can overwrite the original csv file.
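Because the steps above apply only when the correction does not add a column, it can help to compare the column counts of the downloaded and corrected files before re-uploading. The sketch below uses only the Python standard library; the function names and the sample values are made up for illustration and do not come from any LinkedEarth dataset.

```python
import csv
import io


def column_count(csv_text):
    """Return the number of columns, checking that every row agrees."""
    rows = csv.reader(io.StringIO(csv_text))
    counts = {len(row) for row in rows if row}  # skip blank lines
    if len(counts) != 1:
        raise ValueError("rows do not all have the same number of columns")
    return counts.pop()


def safe_to_overwrite(original_text, corrected_text):
    """True when the corrected file keeps the original number of columns."""
    return column_count(original_text) == column_count(corrected_text)


# Hypothetical file contents (made-up numbers, three columns each).
original = "1.0,100,-1.5\n2.0,200,-17\n"
corrected = "1.0,100,-1.5\n2.0,200,-1.7\n"      # value fixed, same columns
with_new_variable = "1.0,100,-1.5,25.1\n2.0,200,-1.7,24.8\n"

print(safe_to_overwrite(original, corrected))          # True
print(safe_to_overwrite(original, with_new_variable))  # False
```

In practice you would read the two files from disk with `open(path, newline="")` and pass their contents to `safe_to_overwrite`.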
Third Party contributors
Anyone with basic editor privileges can edit wiki pages. Below are suggested etiquette rules for editing datasets that you did not originally contribute. If you're concerned about your own dataset, please remember that you can add these pages to your watchlist and receive an email every time an update is made.
- Typo: The power of the wiki is that minor mistakes can be fixed on the fly. If you see a typo, correct it and update the dataset version using these rules. Examples of typos include misspelling a standardized term, such as entering "MgCa" instead of "Mg/Ca" or "T" instead of "Temperature". See the Proxy Archive Ontology, the Proxy Observation Ontology, the Proxy Sensor Ontology, the Inferred Variable Ontology, and the Instrument Ontology for details.
- Changes to the originally-contributed data (i.e., the data stored in the .csv files).
By default, only the original contributor and the person who uploaded the data on the wiki can overwrite an existing csv file. If you spot a mistake, first contact them. If you do not get an answer after 14 business days, contact the LinkedEarth team.
- Changes to already existing metadata:
Please contact the person who made the contribution on the wiki. If you do not get an answer after 14 business days, contact the LinkedEarth team.
- Adding new metadata, including if used in a compilation:
If you wish to add new metadata to an existing dataset, do so directly on the wiki.
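The 14-business-day waiting period mentioned above can be worked out with a short Python helper. This is a rough sketch: the function name is hypothetical, and it assumes that business days are Monday through Friday and ignores public holidays, which this guide does not address.

```python
from datetime import date, timedelta


def add_business_days(start, days):
    """Return the date that falls `days` business days after `start`.

    ASSUMPTION: business days are Monday to Friday; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current


# Example: the contributor was contacted on Wednesday, 19 April 2017.
contacted_on = date(2017, 4, 19)
print(add_business_days(contacted_on, 14))  # 2017-05-09
```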
Compilation
Nomenclature
Dataset Name
The naming convention for datasets is SiteName.FirstAuthor.PubYear. For instance, for this dataset, the SiteName is MD98-2181 (the name of the marine sediment core), the first author is Khider, and the dataset was first published in 2014, which gives MD982181.Khider.2014, as shown in Figure 1.
If uploading from a LiPD file, the dataset name should be automatically filled out for you.
Following this convention from the start is important since many of the pages are named after the DatasetName.
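As an illustration of the SiteName.FirstAuthor.PubYear convention, here is a minimal Python sketch. The function name is hypothetical. Stripping non-alphanumeric characters from the site name is an assumption inferred from the single example above (MD98-2181 becomes MD982181); it is not a rule stated in this guide.

```python
import re


def build_dataset_name(site_name, first_author, pub_year):
    """Assemble a dataset name of the form SiteName.FirstAuthor.PubYear.

    ASSUMPTION: characters other than letters and digits are removed
    from the site name, as in the MD98-2181 -> MD982181 example.
    """
    site = re.sub(r"[^A-Za-z0-9]", "", site_name)
    author = first_author.strip()
    if not site or not author:
        raise ValueError("site name and first author must not be empty")
    return f"{site}.{author}.{int(pub_year)}"


print(build_dataset_name("MD98-2181", "Khider", 2014))  # MD982181.Khider.2014
```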
Publication
Publications are identified on the wiki using their DOI. For instance, the Anand et al. (2003) publication corresponds to the page Publication.10.1029/2002PA000846.
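The mapping from a DOI to a publication page name can be sketched as follows. The function name and the list of prefixes to strip are assumptions made for illustration, and the check is deliberately loose: it only confirms that the identifier starts with "10." and contains a slash.

```python
# Common ways a DOI is written as a link or label (assumed list).
DOI_PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def publication_page_name(doi):
    """Return the wiki page name 'Publication.<DOI>' for a DOI string."""
    cleaned = doi.strip()
    for prefix in DOI_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Loose sanity check only; this is not a full DOI validator.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    return f"Publication.{cleaned}"


# Anand et al. (2003), the example used in this guide.
print(publication_page_name("10.1029/2002PA000846"))
# Publication.10.1029/2002PA000846
```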