Difference between revisions of "Dataset Tutorial"

From Linked Earth Wiki
(Move the creating a personal page to the getting started guide)
(Update information with new ontology)
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
This tutorial is meant to get you started with uploading and editing a dataset onto the LinkedEarth platform. The LinkedEarth Wiki is based on Semantic [https://www.mediawiki.org/wiki/MediaWiki MediaWiki] and therefore uses the MediaWiki markup language. If you are new to wiki formatting, take a few minutes to learn how to edit pages, create new pages, and use links on the [https://www.mediawiki.org/wiki/Help:Formatting Help Page].
+
This tutorial is meant to get you started with uploading and editing a dataset onto the LinkedEarth platform.  
  
 
After completing this tutorial, you will be able to:  
 
After completing this tutorial, you will be able to:  
*LiPD and LinkedEarth: Upload LiPD datasets and enter basic metadata for the record.
+
* Upload LiPD datasets and enter basic metadata for the record.
 
*Annotate a dataset: Create and reuse properties for annotation.  
 
*Annotate a dataset: Create and reuse properties for annotation.  
*Advanced Functionalities: Create and edit wiki pages, special pages, and uploading files.
+
 
 +
When annotating a dataset, the LinkedEarth wiki follows some '''nomenclature rules''' that you can find on [[Best Practices | this page]].
  
 
= LiPD and LinkedEarth =
 
= LiPD and LinkedEarth =
  
The [http://linked.earth/projects/ontology/ LinkedEarth Ontology] represents the backbone of the wiki. The ontology allows us to not only define terms commonly used to describe a paleoclimate dataset (e.g., variable, uncertainty, calibration) but also specify the relationship among these terms (e.g., a variable has uncertainty). As such, it allows us to make inferences, support complex queries, as well as perform quality control on the data.  
+
The [[LinkedEarth Ontology]] represents the backbone of the wiki. The ontology allows us to not only define terms commonly used to describe a paleoclimate dataset (e.g., variable, uncertainty, calibration) but also specify the relationship among these terms (e.g., a variable has uncertainty). As such, it allows us to make inferences, support complex queries, as well as perform quality control on the data. '''Remember that no formal knowledge about ontologies is required to use and contribute to the wiki!'''
  
The LinkedEarth ontology was developed from the [http://linked.earth/projects/lipd/ LiPD] format championed by Nick McKay and Julien Emile-Geay. Therefore, the wiki platform is currently optimized to accept data already in the LiPD format. There are several ways to convert your dataset to LiPD. Go to the [http://lipd.net LiPD webpage] for more information.  
+
The LinkedEarth ontology was developed from the [[Linked Paleo Data]] (LiPD) format championed by Nick McKay and Julien Emile-Geay. Therefore, the wiki platform is currently optimized to accept data already in this format. To convert your dataset into a LiPD file, follow [[Creating a LiPD file | this guide]].
  
 
== Uploading a LiPD file ==
 
== Uploading a LiPD file ==
You need to be logged in to upload a LiPD file, using a [[Special:WTLiPD | special page]] dedicated to the management of datasets already in the LiPD format.
+
You need to be '''logged in''' to upload a LiPD file, using this [[Special:WTLiPD | special page]] dedicated to the management of datasets already in the LiPD format. This page can be accessed directly from the left sidebar: [[Special:WTLiPD | Managing LiPD files]].
 +
 
 
Select the browse button and choose the .lpd file you want to upload as shown in Figure 1:
 
Select the browse button and choose the .lpd file you want to upload as shown in Figure 1:
  
 
[[File:TutorialFig2.png|thumb|none|1000px|Figure1: Manage LiPD Data page]]
 
[[File:TutorialFig2.png|thumb|none|1000px|Figure1: Manage LiPD Data page]]
  
Dataset pages will be automatically created from the content of the LiPD file and your dataset will appear at the top of the "Current LiPD Dataset list" on the [[Main Page]]. By clicking on the dataset link, you will be able to see the automatically extracted data and metadata from the LiPD file, as shown in Figure 2. New "crowd" properties will be automatically created from the LiPD file if these properties are not in the current core ontology.  
+
Dataset pages will be automatically created from the content of the LiPD file and your dataset will appear at the top of the "Current LiPD Dataset list" on the [[Main Page]]. By clicking on the dataset link, you will be able to see the automatically extracted data and metadata from the LiPD file, as shown in Figure 2. New "crowd" properties will be automatically created from the LiPD file if these properties are not in the current ontology.  
  
 
[[File:TutorialFigure3.png|thumb|none|1000px|Figure2: Wiki page created automatically from the metadata of the LiPD dataset]]
 
[[File:TutorialFigure3.png|thumb|none|1000px|Figure2: Wiki page created automatically from the metadata of the LiPD dataset]]
  
 
Congratulations! Your LiPD file has been successfully added to the Linked Earth wiki.
 
Congratulations! Your LiPD file has been successfully added to the Linked Earth wiki.
 
'''Exercise: upload your own LiPD file through the "manage dataset" page in the wiki.'''
 
  
 
== Annotating a LiPD file ==
 
== Annotating a LiPD file ==
Once a LiPD file is uploaded, the metadata about that file is shown in a table with two columns. The column on the left of each table contains the properties describing the LiPD file, while the column on the right states the value associated with each property. Figure 3 shows an example, where the "archive type" of CAN9Neukom2014 is "Tree".
+
Once a LiPD file is uploaded, the metadata about that file is shown on the page in two-column tables. The column on the left of each table contains the properties describing the LiPD file, while the column on the right states the value associated with each property. Figure 3 shows an example, where the "archive type" of CAN9Neukom2014 is "Tree". You can also click on a property to access its definition in the [[LinkedEarth Ontology]]. Properties followed by (L) are part of the [[Linked Paleo Data | LiPD architecture]] and, therefore, cannot be directly edited by basic editors of the LinkedEarth community.
  
 
Please check that the metadata for your record is correct and edit it if appropriate. For instance, to add an investigator to the CAN9Neukom2014 dataset in Figure 2, click on its corresponding row as indicated in Figure 3. Then type the value of the property. In this example, we added "Daniel" as the investigator.
 
Please check that the metadata for your record is correct and edit it if appropriate. For instance, to add an investigator to the CAN9Neukom2014 dataset in Figure 2, click on its corresponding row as indicated in Figure 3. Then type the value of the property. In this example, we added "Daniel" as the investigator.
Line 39: Line 39:
 
Each property/property value added to the page is tracked on the page history. The page history is accessible through the "View history" button at the top of any wiki page, allowing to monitor the edits done by other wiki users. Figure 5 illustrates the edits for the CAN9Neukom 2014 dataset: the latest change added a propertyValue adding "Daniel" as an investigator.  
 
Each property/property value added to the page is tracked on the page history. The page history is accessible through the "View history" button at the top of any wiki page, allowing to monitor the edits done by other wiki users. Figure 5 illustrates the edits for the CAN9Neukom 2014 dataset: the latest change added a propertyValue adding "Daniel" as an investigator.  
  
LinkedEarth members who are at the dataset contributors level can only edit the properties associated with the dataset they contributed. Starting at the Basic Editor level, users can edit datasets contributed by other users. If you would like to become a basic editor, please [mailto:linkedearth@gmail.com email the Editorial Board].
+
Anyone with basic editor privileges and logged into the wiki can edit property values. Don't worry, all changes are revertible and you will get an automatic email if anyone changes a page in your watchlist.  
  
 
[[File:TutorialFig6.png|thumb|none|800px|Figure 5: History of the edits done to the CAN9Neukon2014 dataset]]
 
[[File:TutorialFig6.png|thumb|none|800px|Figure 5: History of the edits done to the CAN9Neukon2014 dataset]]
  
If there is a disagreement between two researchers, a discussion may be started on the "Discussion" page, as depicted on Figure 6. Contributors and Editors may edit any Discussion Pages on the wiki. To learn more about how to contribute and edit discussion pages, follow this [[Tutorial | Discussion Page Tutorial]].
+
If there is a disagreement between two researchers, a discussion may be started on the "Discussion" page, as depicted on Figure 6. To learn more about how to contribute and edit discussion pages, follow this [[Discussion Page Tutorial | tutorial]].
  
 
[[File:TutorialFig7.png|thumb|none|600px|Figure 6: Creating a discussion page on CAN9Neukon2014 dataset]]
 
[[File:TutorialFig7.png|thumb|none|600px|Figure 6: Creating a discussion page on CAN9Neukon2014 dataset]]
  
Contributions are tracked automatically on the wiki and displayed in the "Credit" section, which can be found at the bottom of each wiki page.
+
Contributions are tracked automatically on the wiki and displayed in the "Credit" section, which can be found at the bottom of each wiki page.
  
 
[[File:TutorialFig8.png|thumb|none|600px|Figure 7: credits of the CAN9Neukon2014 dataset]]
 
[[File:TutorialFig8.png|thumb|none|600px|Figure 7: credits of the CAN9Neukon2014 dataset]]
  
==Dataset versioning==
+
== Concept annotation in a LiPD file ==
In LiPD all the uploaded datasets should follow a x.y.z notation, where "x" refers to important changes in the dataset's metadata (e.g., the creation of a new age model using a different code), "y" refers to changes to the data following a publication (e.g., adding data further back in time without changing the model underlying the interpretation) and z refers to minor changes not associated with a publication (e.g., typos). For example, the first official release of a dataset would be 1.0.0. If I fix a small typo, I would create version 1.0.1.
+
  
'''Exercise: Annotate the version of the dataset in the recently created page, following the x.y.z notation and using the property "datasetVersion".'''
 
 
== Concept annotation in a LiPD file ==
 
 
As shown in Figures 3 and 4, some of the annotated values like "Tree" already have links to other pages.  These pages can be further populated and edited by domain experts. To do so click on the Edit tab at the top of the page as shown on Figure 8 for the "Tree" archive. The article was created as a stub, awaiting domain experts' contribution, further gathering field knowledge.  
 
As shown in Figures 3 and 4, some of the annotated values like "Tree" already have links to other pages.  These pages can be further populated and edited by domain experts. To do so click on the Edit tab at the top of the page as shown on Figure 8 for the "Tree" archive. The article was created as a stub, awaiting domain experts' contribution, further gathering field knowledge.  
  
To learn more about Wiki editing, visit the [Quick Guide to Editing Wiki Pages].
+
To learn more about Wiki editing, visit the [[Quick Guide to Editing Wiki Pages]].
  
 
By linking to other existing pages we can connect different LiPD datasets (e.g., if two different datasets are of the same archive type, they will link to the same page) and support queries. For instance, one can look for all the datasets on the LinkedEarth wiki using "Tree" as archives.   
 
By linking to other existing pages we can connect different LiPD datasets (e.g., if two different datasets are of the same archive type, they will link to the same page) and support queries. For instance, one can look for all the datasets on the LinkedEarth wiki using "Tree" as archives.   
  
Red links means that the page does not yet exist. Editors and contributors can create the new page and edit its content. An example can be seen on Figure 9.
+
Red links means that the page does not yet exist. Editors can create the new page and edit its content. An example can be seen on Figure 9.
  
 
[[File:TutorialFig9.png|thumb|none|700px|Figure 8: Tree concept definition on the Linked Earth wiki]]
 
[[File:TutorialFig9.png|thumb|none|700px|Figure 8: Tree concept definition on the Linked Earth wiki]]
  
 
[[File:TutorialFig10.png|thumb|none|700px|Figure 9: Creating a page for an unexisting concept (highlighted in red)]]
 
[[File:TutorialFig10.png|thumb|none|700px|Figure 9: Creating a page for an unexisting concept (highlighted in red)]]
 
'''Exercise: edit the page "Tree ring width", which does not have a definition at the moment, and add a test definition. Use this link as the "Archive Type" value on your uploaded LiPD dataset .'''
 
  
 
=Annotate a dataset=
 
=Annotate a dataset=
Line 81: Line 75:
  
 
Before adding a new property name, it is important to note that a similar property may already exist in the ontology to describe the same metadata. '''The properties are case-sensitive.''' For instance, imagine that we want to add a "description" to the dataset. If we start typing the property, we see that it already exists, and we can select it for our purposes. Selecting existing properties is important, as it helps to structure and control the content uploaded to the wiki.
 
Before adding a new property name, it is important to note that a similar property may already exist in the ontology to describe the same metadata. '''The properties are case-sensitive.''' For instance, imagine that we want to add a "description" to the dataset. If we start typing the property, we see that it already exists, and we can select it for our purposes. Selecting existing properties is important, as it helps to structure and control the content uploaded to the wiki.
 
'''Exercise: create a "title" property and use it to annotate your uploaded dataset. Check if the property already exists. If it doesn't, create it.'''
 
  
 
== Adding location to a dataset ==
 
== Adding location to a dataset ==
Line 94: Line 86:
 
[[File:TutorialFig24.png|thumb|none|700px|Figure 11: Steps for adding location to your dataset.]]
 
[[File:TutorialFig24.png|thumb|none|700px|Figure 11: Steps for adding location to your dataset.]]
  
== Adding and extending concepts ==
+
== See Also ==
Sometimes one may want to extend some of the concepts that already exist in the wiki. For example, imagine that I have measured a variable of a table (d18Og.rub-w) with a specific stable isotope ratio mass spectrometer housed in my lab. If I state that the variable d18Og.rub-w was measured by a "stable isotope mass spectrometer instrument" (under the "instrument" category) I would be losing information: are all stable isotope mass spectrometer instruments the same? For instance, do they have the same uncertainty? Are the runs parameterized in the same way? The answer is probably no. Therefore, we need to state which stable isotope mass spectrometer was used. In this case, the one in my lab.
+
 
+
Hence, we need to create the concept "StableIsotopeRatioMassSpectrometerInMyLab", referring to a specific "Instrument" with its own property values. For instance, two instruments of the same brand and model could have different reported uncertainties. Following the example shown in the previous section, we would need to create the "StableIsotopeRatioMassSpectrometerInMyLab" page and annotate it with the category "StableIsotopeRatioMassSpectrometer". This category would become a new category in the wiki, referring to stable isotope ratio mass spectrometers in the more general sense.
+
 
+
'''Exercise: Use an instrument of your lab to describe a variable from a dataset. If the category of your instrument does not exist, create a new instrument category (e.g., stable isotope mass spectrometer). '''
+
 
+
=Advanced wiki functionality=
+
 
+
 
+
== Creating a new wiki page ==
+
 
+
Just go to [http://wiki.linked.earth/New_Page http://wiki.linked.earth/New_Page]
+
 
+
Replace "New Page" above with the name that you want for the page.
+
 
+
Then, either select a category for this page (Figure 16):
+
 
+
[[File:TutorialFig16.png|thumb|none|300px|Figure 16: Selecting a category for a new page]]
+
 
+
or, just click on the "Create" link to create a page without any category (Figure 17)
+
 
+
[[File:TutorialFig17.png|thumb|none|200px|Figure 17: create page button, located on the bottom right of the page]]
+
 
+
Alternatively, you can search for the page you are looking for (remember that the wiki is case sensitive). If the page does not exist, you will be prompted to create it.
+
 
+
== Deleting an existing wiki page ==
+
 
+
Go to [http://wiki.linked.earth/Name_of_Page http://wiki.linked.Earth/Name_of_Page]
+
Replace "Name of Page" above with the name of the page to delete, as shown in Figure 18.
+
 
+
[[File:TutorialFig18.png|thumb|none|300px|Figure 18: Wiki page option menu]]
+
 
+
Then click on the "Delete" link, and delete the page. Figure 19 shows an example:
+
 
+
[[File:TutorialFig19.png|thumb|none|500px|Figure 19: Deleting menu]]
+
 
+
== Searching existing wiki pages==
+
Before creating any page, it is recommended to search if they already exist. Searching a page can be done by entering the terms on the search bar located on the top right of any page (see Figure 20).
+
 
+
[[File:TutorialFig20.png|thumb|none|500px|Figure 20: Search bar for finding existing wiki pages]]
+
 
+
Any page containing the word introduced in the search bar will be returned.
+
 
+
== Renaming wiki pages ==
+
Wiki pages may need to be renamed. For example, due to typos in the page name or community agreement to rename a term in the LinkedEarth ontology. In order to rename an existing page without losing any of its contents, you should click on the "Move" button under the "More" menu on the top of the page. An example is shown in Figure 21.
+
 
+
[[File:TutorialFig21.png|thumb|none|500px|Figure 21: Moving an existing wiki page]]
+
 
+
After you click on the Move button, the page shown in Figure 22 will ask for the new page name, as well as a reason for the change.
+
 
+
[[File:TutorialFig22.png|thumb|none|700px|Figure 22: Moving a wiki page form.]]
+
 
+
After hitting the "Move page" button, your page will be renamed.
+
 
+
== Uploading images and files ==
+
If you want to add an image or document to a wiki page ('''not a LiPD file; use the special page described in this tutorial to upload your LiPD file'''), click on the "upload files" button that can be seen on the left of any wiki page (see Figure 23).
+
 
+
[[File:TutorialFig23.png|thumb|none|700px|Figure 23: uploading an image or document to the Linked Earth wiki.]]
+
  
After you are done selecting the document you want to upload, just hit the "Upload file" button at the bottom of the page. Now you may reference to this file using brackets (e.g., [ [ File:File.png ] ]). You can see the different options for showing files in this page: [https://www.mediawiki.org/wiki/Help:Images https://www.mediawiki.org/wiki/Help:Images]
+
* [[Quick Guide to Editing Wiki Pages]]
 +
* [[Discussion Page Tutorial]]
 +
* [[Best Practices]]
 +
* [[Creating a LiPD file]]

Latest revision as of 22:32, 3 May 2017

This tutorial is meant to get you started with uploading and editing a dataset onto the LinkedEarth platform.

After completing this tutorial, you will be able to:

  • Upload LiPD datasets and enter basic metadata for the record.
  • Annotate a dataset: Create and reuse properties for annotation.

When annotating a dataset, the LinkedEarth wiki follows some nomenclature rules that you can find on this page.

LiPD and LinkedEarth

The LinkedEarth Ontology represents the backbone of the wiki. The ontology allows us to not only define terms commonly used to describe a paleoclimate dataset (e.g., variable, uncertainty, calibration) but also specify the relationship among these terms (e.g., a variable has uncertainty). As such, it allows us to make inferences, support complex queries, as well as perform quality control on the data. Remember that no formal knowledge about ontologies is required to use and contribute to the wiki!

The LinkedEarth ontology was developed from the Linked Paleo Data (LiPD) format championed by Nick McKay and Julien Emile-Geay. Therefore, the wiki platform is currently optimized to accept data already in this format. To convert your dataset into a LiPD file, follow this guide.

Uploading a LiPD file

You need to be logged in to upload a LiPD file, using this special page dedicated to the management of datasets already in the LiPD format. This page can be accessed directly from the left sidebar: Managing LiPD files.

Select the browse button and choose the .lpd file you want to upload as shown in Figure 1:

Figure 1: Manage LiPD Data page
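
Before uploading, it can help to check that the file you are about to select is a readable archive. The sketch below is only an illustration and is not part of the wiki: it assumes that a .lpd file is a ZIP archive, and the member names in the demo archive are made up. The function names are hypothetical.

```python
import io
import zipfile


def list_lipd_contents(source):
    """Return the sorted member names of a .lpd archive.

    `source` may be a file path or a binary file-like object.
    Assumption: a .lpd file is a ZIP archive; a ValueError is raised otherwise.
    """
    if not zipfile.is_zipfile(source):
        raise ValueError("not a ZIP archive; is this really a .lpd file?")
    with zipfile.ZipFile(source) as archive:
        return sorted(archive.namelist())


def build_demo_archive():
    """Build a tiny in-memory archive with made-up member names for illustration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/data/metadata.jsonld", "{}")
        archive.writestr("demo/data/measurements.csv", "year,value\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Replace the demo archive with the path to your own file, e.g. "MyRecord.lpd".
    for member in list_lipd_contents(build_demo_archive()):
        print(member)
```

If the function raises an error for your file, the upload is unlikely to succeed either, so it is worth regenerating the LiPD file first.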

Dataset pages will be automatically created from the content of the LiPD file and your dataset will appear at the top of the "Current LiPD Dataset list" on the Main Page. By clicking on the dataset link, you will be able to see the automatically extracted data and metadata from the LiPD file, as shown in Figure 2. New "crowd" properties will be automatically created from the LiPD file if these properties are not in the current ontology.

Figure 2: Wiki page created automatically from the metadata of the LiPD dataset

Congratulations! Your LiPD file has been successfully added to the Linked Earth wiki.

Annotating a LiPD file

Once a LiPD file is uploaded, the metadata about that file is shown on the page in two-column tables. The column on the left of each table contains the properties describing the LiPD file, while the column on the right states the value associated with each property. Figure 3 shows an example, where the "archive type" of CAN9Neukom2014 is "Tree". You can also click on a property to access its definition in the LinkedEarth Ontology. Properties followed by (L) are part of the LiPD architecture and, therefore, cannot be directly edited by basic editors of the LinkedEarth community.

Please check that the metadata for your record is correct and edit it if appropriate. For instance, to add an investigator to the CAN9Neukom2014 dataset in Figure 2, click on its corresponding row as indicated in Figure 3. Then type the value of the property. In this example, we added "Daniel" as the investigator.

Figure 3: Editing a property value for a LiPD file

All annotated values can be edited or removed. To remove "Daniel" from the investigators of the example dataset, click on the row and then on the red cross button on the left, as shown in Figure 4:

Figure 4: Deleting a property from a dataset

Each property and property value added to the page is tracked in the page history. The page history is accessible through the "View history" button at the top of any wiki page, allowing you to monitor the edits made by other wiki users. Figure 5 illustrates the edits for the CAN9Neukom2014 dataset: the latest change added a property value, "Daniel", as an investigator.

Anyone with basic editor privileges and logged into the wiki can edit property values. Don't worry, all changes are revertible and you will get an automatic email if anyone changes a page in your watchlist.

Figure 5: History of the edits made to the CAN9Neukom2014 dataset

If two researchers disagree, a discussion may be started on the "Discussion" page, as depicted in Figure 6. To learn more about how to contribute to and edit discussion pages, follow this tutorial.

Figure 6: Creating a discussion page on the CAN9Neukom2014 dataset

Contributions are tracked automatically on the wiki and displayed in the "Credit" section, which can be found at the bottom of each wiki page.

Figure 7: Credits of the CAN9Neukom2014 dataset

Concept annotation in a LiPD file

As shown in Figures 3 and 4, some of the annotated values, like "Tree", already link to other pages. These pages can be further populated and edited by domain experts. To do so, click on the Edit tab at the top of the page, as shown in Figure 8 for the "Tree" archive. The article was created as a stub, awaiting contributions from domain experts to gather knowledge from the field.

To learn more about Wiki editing, visit the Quick Guide to Editing Wiki Pages.

By linking to other existing pages we can connect different LiPD datasets (e.g., if two different datasets are of the same archive type, they will link to the same page) and support queries. For instance, one can look for all the datasets on the LinkedEarth wiki using "Tree" as archives.

A red link means that the page does not yet exist. Editors can create the new page and edit its content. An example can be seen in Figure 9.

Figure 8: Tree concept definition on the Linked Earth wiki
Figure 9: Creating a page for a nonexistent concept (highlighted in red)

Annotate a dataset

Property annotation

Until now we have covered how to annotate property values and associate them to concepts and existing pages in the wiki. In this next step we will see how to create new properties to describe a dataset, i.e., adding new annotations to our dataset outside of the standard properties shown in Figure 2.

The "Properties" box, placed under the "Standard Properties" table, allows users to edit and create new properties and values. An example is shown in Figure 10. Clicking on the "plus" sign in the title adds a new row to the table. The row has two fields: one for the name of the property we want to use to describe the dataset (e.g., title, description, name) and one for the property value.

Figure 10: adding a property for the LiPD file in Figure 3

Before adding a new property name, it is important to note that a similar property may already exist in the ontology to describe the same metadata. The properties are case-sensitive. For instance, imagine that we want to add a "description" to the dataset. If we start typing the property, we see that it already exists, and we can select it for our purposes. Selecting existing properties is important, as it helps to structure and control the content uploaded to the wiki.
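
Because property names are case-sensitive, "Description" and "description" would be treated as two different properties. The sketch below illustrates the kind of check worth doing by hand before creating a new property: compare the name you have in mind against the existing ones while ignoring case. The function name and the list of property names are hypothetical; the wiki's own autocomplete remains the authoritative source.

```python
def find_existing_property(name, known_properties):
    """Return the known property names that equal `name` when case is ignored."""
    wanted = name.strip().casefold()
    return [prop for prop in known_properties if prop.casefold() == wanted]


if __name__ == "__main__":
    # Hypothetical property names, used only for illustration.
    known = ["description", "archiveType", "datasetVersion"]
    print(find_existing_property("Description", known))
    print(find_existing_property("title", known))
```

A non-empty result means a property with that spelling already exists and should be reused as written; an empty result suggests the property may be new.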

Adding location to a dataset

The LinkedEarth wiki automatically adds a new dataset to an existing query page, such as the one on our Main Page, provided that the dataset contains a set of coordinates. To do so, first link the location where the data were collected, as illustrated in Figure 11. Any location is valid, from a single point (in an xyz coordinate system) to a polyline (e.g., a river) or even a polygon (e.g., a mountain, a city, or a country). In the example, we are linking the dataset to the location "Central Andes composite 9", where the data were collected.

Once the location page has been created, we can annotate its name and its associated geometry with the property "hasGeometry". This property takes into account the fact that a location may change over time (e.g., a river could change its course), and hence the geometry would change without affecting the location itself.

Finally, we add the coordinates to the Geometry page (creating it if it doesn't already exist). For this we use the AsWKT property, which tells the system that the coordinates are in the Well-Known Text (WKT) format. Since in this case we are representing a point, we also add the type "Point" as a property of the geometry.
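
As an illustration of what a Well-Known Text value looks like, the sketch below formats a point and a polyline as WKT strings. It relies on the general WKT convention that x (longitude) comes before y (latitude); the coordinates are made up, the function names are hypothetical, and the exact spacing accepted by the wiki is an assumption.

```python
def point_to_wkt(longitude, latitude):
    """Format a 2D point as WKT, with x (longitude) before y (latitude)."""
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    return f"POINT({float(longitude)!r} {float(latitude)!r})"


def linestring_to_wkt(points):
    """Format a sequence of (longitude, latitude) pairs as a WKT polyline."""
    points = list(points)
    if len(points) < 2:
        raise ValueError("a polyline needs at least two points")
    coords = ", ".join(f"{float(x)!r} {float(y)!r}" for x, y in points)
    return f"LINESTRING({coords})"


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    print(point_to_wkt(-68.5, -16.0))
    print(linestring_to_wkt([(0, 0), (1, 2)]))
```

A common mistake is to enter latitude first; in WKT the first number of each pair is the x value, which corresponds to longitude.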


Figure 11: Steps for adding location to your dataset.

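See Also

  • Quick Guide to Editing Wiki Pages
  • Discussion Page Tutorial
  • Best Practices
  • Creating a LiPD file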