Difference between revisions of "LinkedEarth Ontology"

(How useful is the ontology?: make the text more symmetrical and easier to follow)
(Ontology definition)

Revision as of 19:58, 5 April 2017

At its most fundamental level, the LinkedEarth Ontology allows us not only to define terms commonly used to describe a paleoclimate dataset (e.g., variable, uncertainty, calibration) but also to specify the relationships among those terms (e.g., a variable has uncertainty). As such, it allows us to make inferences, support complex queries, and perform quality control on the data.

The triple consists of a subject (the dataset), a property (hasName), and an object (the name of the dataset, WesternPacific_Khider_2014).

When representing the knowledge of a domain like paleoclimatology, we usually distinguish between the things that we want to describe (i.e., concepts like a dataset or a variable) and the relationships used to describe those concepts (e.g., the name of the dataset or the value of the variable). As shown in the figure, we can use a graph-based representation to encode the information in a set of triples.

Each triple has a subject (i.e., what we want to describe), a property (the element describing the subject), and an object (i.e., the values used to describe the subject).
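
To make the triple structure concrete, here is a minimal Python sketch. It is only an illustration and not how the wiki stores its data: the `Triple` type, the `objects_of` helper, and the subject identifier `dataset1` are hypothetical names introduced here. Only the property `hasName` and the dataset name `WesternPacific_Khider_2014` come from the figure.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement in the graph: subject, property, object."""
    subject: str
    prop: str
    obj: str


# "dataset1" is a made-up identifier for the dataset described in the figure.
triples = [
    Triple("dataset1", "hasName", "WesternPacific_Khider_2014"),
]


def objects_of(graph, subject, prop):
    """Return every object linked to `subject` through `prop`."""
    return [t.obj for t in graph if t.subject == subject and t.prop == prop]


names = objects_of(triples, "dataset1", "hasName")
print(names)
```

Adding more triples with the same subject (for example, linking the dataset to a data table) extends the graph without changing the lookup code.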

Different concepts may be linked to each other using properties. For example, a dataset contains a data table, which contains several variables. The properties and concepts of a domain are often defined in ontologies. An ontology is defined as a "formal specification of a shared conceptualization" (http://iaoa.org/isc2012/docs/Guarino2009_What_is_an_Ontology.pdf); it represents consensual knowledge that helps a community describe the concepts of the domain using a common representation. A key feature of ontologies is that they are machine readable, i.e., they allow machines to understand the domain in the way the creators of the ontology have defined it. Thanks to the ontology, machines can navigate through data and discover data that would otherwise be hidden from them. This enables batch processing of data that would otherwise require a large number of person-hours.


How useful is the ontology?

One of the most practical benefits of the ontology is that it allows users to query the data. For instance, the following query searches the entire database for coral d18O records that have been interpreted to represent temperature (results limited to 10).

{{ #ask: 
[[Category:Dataset_©]] 
[[IncludesPaleoData_©.FoundInMeasurementTable_©.IncludesVariable_©.MeasuredOn_©::<q>[[Category:Coral]]</q>]]
[[IncludesPaleoData_©.FoundInMeasurementTable_©.IncludesVariable_©.InterpretedAs_©.Name_©::T]]
 | ?IncludesPaleoData_©=PaleoData
 | format=broadtable
 | limit=10
}}


A quick look at the code above reveals the hierarchy of a dataset on the wiki (and in the corresponding LiPD file). In essence, the query asks the database to find the datasets (Category:Dataset_©),

  1. which include (Property:IncludesPaleoData_©) PaleoData (Category:PaleoData_©),
  2. that are found in (Property:FoundInMeasurementTable_©) a table (Category:MeasurementTable_©),
  3. that include (Property:IncludesVariable_©) variables (Category:Variable_©).

These variables need to fit two criteria:

  1. to be measured on (Property:MeasuredOn_©) an archive of type coral (Category:Coral)
  2. and for the variations to have been interpreted as (Property:InterpretedAs_©) temperature (or T)
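
The property chain walked through above can be sketched in Python on a toy graph. This is an illustrative sketch under stated assumptions, not the wiki's query engine: the identifiers (`ds1`, `pd1`, `tbl1`, and so on), the `type` property used to stand in for category membership, the `Wood` category, and the `follow` helper are all hypothetical, and the `_©` suffix on property names is dropped for readability. Like the two condition lines of the query, the sketch checks each chain separately from the dataset, so it does not verify that both criteria are met by the same variable.

```python
from collections import defaultdict

# Toy triples (subject, property, object); all identifiers are made up.
triples = [
    ("ds1", "type", "Dataset"),
    ("ds1", "IncludesPaleoData", "pd1"),
    ("pd1", "FoundInMeasurementTable", "tbl1"),
    ("tbl1", "IncludesVariable", "var1"),
    ("var1", "MeasuredOn", "archive1"),
    ("archive1", "type", "Coral"),
    ("var1", "InterpretedAs", "interp1"),
    ("interp1", "Name", "T"),
    ("ds2", "type", "Dataset"),
    ("ds2", "IncludesPaleoData", "pd2"),
    ("pd2", "FoundInMeasurementTable", "tbl2"),
    ("tbl2", "IncludesVariable", "var2"),
    ("var2", "MeasuredOn", "archive2"),
    ("archive2", "type", "Wood"),
    ("var2", "InterpretedAs", "interp2"),
    ("interp2", "Name", "T"),
]

# Index objects by (subject, property) for quick lookups.
index = defaultdict(set)
for s, p, o in triples:
    index[(s, p)].add(o)


def follow(start, chain):
    """Follow a chain of properties from `start`; return all reachable objects."""
    frontier = {start}
    for prop in chain:
        frontier = {o for node in frontier for o in index.get((node, prop), set())}
    return frontier


# Shared prefix: dataset -> PaleoData -> measurement table -> variable.
TO_VARIABLE = ["IncludesPaleoData", "FoundInMeasurementTable", "IncludesVariable"]

datasets = sorted(s for s, p, o in triples if p == "type" and o == "Dataset")
matches = [
    ds for ds in datasets
    if "Coral" in follow(ds, TO_VARIABLE + ["MeasuredOn", "type"])
    and "T" in follow(ds, TO_VARIABLE + ["InterpretedAs", "Name"])
]
print(matches)
```

In this toy graph only `ds1` satisfies both conditions, because the variable in `ds2` is measured on an archive that is not a coral.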

The sections below describe the difference between properties and categories on the wiki. Relating back to the triples defined above, Special:Categories on the wiki represent concepts, and Special:Properties are the properties relating the various categories.

Remember that no formal knowledge about ontologies is required to use and contribute to the wiki.

The LinkedEarth Ontology

See Also