Category:Historical Documents Working Group
Overview
In the Linked Earth context, a working group (WG) is a self-organized coalition of knowledgeable experts whose activities are governed herewith. This page hosts the discussion of data and metadata standards for historical documents and aims to formulate a set of recommendations for such a standard.
Members of 'Historical Documents Working Group'
This working group contains only the following member.
Sources
Data are usually compiled from several historical sources. The LiPD data structure supports multiple Publication entries, which are normally used to reference the publication describing the data. This data cluster can therefore also hold historical sources, in addition to the current publication describing the data. It is related to known standards such as Dublin Core or BibTeX.
- Source-ID for later reference
- Source Type (string), e.g. newspaper, book, ...
- Source Author
- Source Title
- Publication Year
- Publication Date
- Journal Title
- Publisher
- URL to Source (e.g. to a PDF)
- DOI of Source (almost never exists for historical documents unless they were published as a data compilation)
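The source fields listed above could be held in a small record type before they are written into a LiPD Publication entry. The following is a minimal sketch only: the class name `HistoricalSource`, the attribute names, and the helper `to_pub_entry` are assumptions made for illustration and are not part of the LiPD specification, and the example values are placeholders.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class HistoricalSource:
    """One historical source; attribute names are illustrative, not LiPD keys."""
    source_id: str
    source_type: str                  # e.g. "newspaper", "book"
    author: str
    title: str
    pub_year: Optional[int] = None
    pub_date: Optional[str] = None    # ISO 8601 string when the full date is known
    journal: Optional[str] = None
    publisher: Optional[str] = None
    url: Optional[str] = None
    doi: Optional[str] = None         # rarely available for historical documents


def to_pub_entry(src: HistoricalSource) -> dict:
    """Drop empty fields so the entry only carries what is actually known."""
    return {key: value for key, value in asdict(src).items() if value is not None}


# Placeholder values, not a real source.
example = HistoricalSource(
    source_id="src-001",
    source_type="newspaper",
    author="Unknown",
    title="Example chronicle",
    pub_year=1784,
)
entry = to_pub_entry(example)
```

Leaving out unknown fields instead of storing empty strings keeps the rare case of an existing DOI easy to detect.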
Scans, Pages
Each source can optionally have a set of images, namely the scans of its pages. This could perhaps be dropped from the LiPD format, with only references to externally stored images added to the quotes.
Quotes
Several quotes can be extracted from each source by transcribing them. The related data would most probably go into "measurementTables".
- Quote-ID: For later reference
- Reference to source: maybe short form like "Author()Year" is adequate
- Page: String with the page number(s) from which the quote is extracted
- Scan: Optional link(s) to externally stored image(s) of the page(s)
- Language: The language of the quote (or its translation)
- Protolanguage: The language in which the quote was originally written
- Quote: The quote itself (UTF-8)
- License: License of the quote (e.g. Creative Commons)
- DOI: DOI under which the quote is published
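If quotes are stored in a measurement table, each field above would become one column. The sketch below shows one possible way to turn a list of quote records into a column-wise layout. The class `Quote`, its attribute names, and the function `to_columns` are hypothetical, and the exact table layout expected by LiPD is not verified here. The quote texts are taken from the coding examples on this page; all other values are placeholders.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional


@dataclass
class Quote:
    """One transcribed quote; attribute names are illustrative."""
    quote_id: str
    source_ref: str                   # reference to the source
    page: str                         # page number(s) as a string
    language: str                     # language of the quote or its translation
    original_language: str            # language the quote was originally written in
    text: str                         # the quote itself (UTF-8)
    scan_url: Optional[str] = None    # link to an externally stored page image
    license: Optional[str] = None
    doi: Optional[str] = None


def to_columns(quotes: List[Quote]) -> Dict[str, list]:
    """Turn row-wise quote records into one list of values per field."""
    return {f.name: [getattr(q, f.name) for q in quotes] for f in fields(Quote)}


quotes = [
    Quote("q-001", "src-001", "12", "en", "de", "The weather was extremely warm"),
    Quote("q-002", "src-001", "12-13", "en", "de", "The water level was 13 feet"),
]
columns = to_columns(quotes)
```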
Events
Events are interpretations of a quote and would go into the "model" part. Each event refers to
- a quote
- a position
- a time
and contains climate-related data (to be defined).
Position
Unlike for other archive types, the position is not fixed. A compilation of several sources usually refers to different, scattered locations. A location can usually be named but may cover different scales (continent, country, area, city, street, ...) and terrain types (city, river, sea, ...). It corresponds best to the Geospatial metadata of LiPD; see also http://schema.org/Place.
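A named location with a scale, a terrain type, and optional coordinates could be expressed in the vocabulary of schema.org/Place. The sketch below is one possible encoding, not an agreed format: the function `make_place` and the property names `scale` and `terrain` are assumptions, while `Place`, `geo`, `GeoCoordinates`, `additionalProperty`, and `PropertyValue` are existing schema.org terms. The coordinates are approximate and for illustration only.

```python
from typing import Optional


def make_place(name: str, scale: str, terrain: Optional[str] = None,
               lat: Optional[float] = None, lon: Optional[float] = None) -> dict:
    """Build a schema.org Place; coordinates are optional for vague locations."""
    properties = [{"@type": "PropertyValue", "name": "scale", "value": scale}]
    if terrain is not None:
        properties.append({"@type": "PropertyValue", "name": "terrain", "value": terrain})
    place = {
        "@context": "https://schema.org",
        "@type": "Place",
        "name": name,
        "additionalProperty": properties,
    }
    # Only attach coordinates when both values are known.
    if lat is not None and lon is not None:
        place["geo"] = {"@type": "GeoCoordinates", "latitude": lat, "longitude": lon}
    return place


city = make_place("Cologne", scale="city", terrain="city", lat=50.94, lon=6.96)
river = make_place("Rhine", scale="area", terrain="river")
```

Keeping the coordinates optional allows large or vaguely described locations, such as a river or a country, to be recorded by name alone.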
Time
The time derived for an event from historical documents is usually more precise than for other archive types; often the uncertainty is just days or even hours. It would therefore be best to encode the absolute date in the Gregorian calendar as an ISO 8601 string. Optionally, an absolute Gregorian year (possibly as a floating-point fraction instead of an integer) can be calculated as well.
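The optional fractional year can be derived directly from the ISO 8601 string. The following sketch uses only the Python standard library; the function name `decimal_year` is an assumption, and the fraction is computed as the elapsed share of the calendar year, which is one of several possible conventions.

```python
from datetime import datetime


def decimal_year(iso_string: str) -> float:
    """Convert an ISO 8601 date or date-time into a fractional Gregorian year."""
    moment = datetime.fromisoformat(iso_string)
    year_start = datetime(moment.year, 1, 1)
    year_end = datetime(moment.year + 1, 1, 1)
    # Dividing two timedeltas yields a float, so leap years are handled implicitly.
    return moment.year + (moment - year_start) / (year_end - year_start)


start_of_year = decimal_year("1800-01-01")
mid_year = decimal_year("1801-07-02T12:00:00")
```

Python's `datetime` uses the proleptic Gregorian calendar, so dates before 1582 are treated as if the Gregorian rules had always applied.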
The original sources often give dates in a calendar other than the Gregorian. The Julian calendar was used in Europe in historical times, and outside Europe other calendars existed or still exist, for example in Chinese or Arabic documents.
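For sources dated in the Julian calendar, the Gregorian date can be obtained via the Julian Day Number. The sketch below uses the standard integer formula for Julian calendar dates and Python's proleptic Gregorian ordinals; the function names are assumptions, and other calendars (for example Chinese or Islamic) would need their own conversion, which is not covered here.

```python
from datetime import date

# Offset between the Julian Day Number and Python's proleptic Gregorian ordinal
# (date(2000, 1, 1).toordinal() == 730120 and its Julian Day Number is 2451545).
JDN_TO_ORDINAL = 1721425


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date to the equivalent Gregorian date."""
    return date.fromordinal(julian_to_jdn(year, month, day) - JDN_TO_ORDINAL)


# The last Julian day before the 1582 reform corresponds to Gregorian 14 October.
reform_eve = julian_to_gregorian(1582, 10, 4).isoformat()
```

Storing the original calendar notation alongside the converted ISO 8601 string would keep the conversion traceable.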
If so, which parameters for start and end should be considered?
It is also possible to extract the time-related text from the quote and add it to a field 'phrase'.
Event itself
The event refers to the quote (and possibly directly to the source), to the chronology/timing, and to the position/location. It also holds the coding of the phenomena, which is where it gets complicated; see the next section.
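An event record mainly consists of references, so a basic consistency check is useful before export. The sketch below is an assumption-laden illustration: the class `Event`, its attribute names, and the function `check_event` are hypothetical, and the example values are placeholders, not real observations.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Set


@dataclass
class Event:
    """One interpreted event; attribute names are illustrative."""
    event_id: str
    quote_id: str      # reference to the transcribed quote
    place_name: str    # reference to the position/location
    time_iso: str      # absolute Gregorian date as an ISO 8601 string
    coding: str        # coded phenomenon


def check_event(event: Event, known_quote_ids: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the references look valid."""
    problems = []
    if event.quote_id not in known_quote_ids:
        problems.append(f"unknown quote: {event.quote_id}")
    try:
        datetime.fromisoformat(event.time_iso)
    except ValueError:
        problems.append(f"not an ISO 8601 date: {event.time_iso}")
    return problems


# Placeholder values for illustration only.
good = Event("ev-001", "q-002", "Example town", "1800-01-01", "water level")
bad = Event("ev-002", "q-999", "Example town", "01.01.1800", "water level")
good_problems = check_event(good, {"q-001", "q-002"})
bad_problems = check_event(bad, {"q-001", "q-002"})
```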
Phenomena/Coding
A great deal of information can be found in historical documents, so the coding schema is complex. Some examples and their coding illustrate this:
- The weather was extremely warm -> temperature index + very hot (+3)
- A lot of rain fell in December -> precipitation type + rain & precipitation amount + more than usual (+1)
- The water level was 13 feet -> water level + value:13.0 + tolerance:1.0 + unit:feet
- The drought led to a bad harvest of potatoes -> harvest + harvest amount: less than usual (-1) & potatoes / precipitation amount: less than usual (-1)
- The wheat harvest was little but the wine quality was good -> harvest + harvest amount: less than usual (-1) & wheat / harvest + harvest quality: better than usual (+1) & wine
Notation:
- +: one-dimensional combination
- &: two (or more) dimensional combination
- /: two different events (e.g. cause vs. effect)
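The notation used in the examples above can be split mechanically into events, dimensions, and path elements. The following is a rough sketch that assumes the three separators are always surrounded by single spaces, as in the examples; the function names `parse_coding` and `index_values` are hypothetical, and this is not the parser used by tambora.org.

```python
import re
from typing import List

# Matches a signed index in parentheses, such as "(+3)" or "(-1)".
INDEX_PATTERN = re.compile(r"\(([+-]\d+)\)")


def parse_coding(coding: str) -> List[List[List[str]]]:
    """Split a coding string into events -> dimensions -> path elements."""
    return [
        [[part.strip() for part in dimension.split(" + ")]
         for dimension in event.split(" & ")]
        for event in coding.split(" / ")
    ]


def index_values(coding: str) -> List[int]:
    """Collect all index values that appear in a coding string."""
    return [int(match) for match in INDEX_PATTERN.findall(coding)]


drought = ("harvest + harvest amount: less than usual (-1) & potatoes"
           " / precipitation amount: less than usual (-1)")
parsed = parse_coding(drought)
indices = index_values(drought)
```

Because the separators are matched with surrounding spaces, the sign inside an index such as "(+1)" is not mistaken for the "+" separator.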
(Coding tree as used in tambora.org can be found here.)
(codeset_id,codeset_description,category,path,node_label,scale_label,scale_unit,value_label,value_index,average,variance,si_unit,si_average,si_variance )
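The `si_unit`, `si_average`, and `si_variance` columns suggest that coded values are also stored in SI units. A minimal sketch of such a normalisation for the water level example is shown below. The class `Measurement` and the table `TO_METRES` are assumptions, and the factor is that of the international foot; historical sources often use regional feet of different lengths, so a real table would need one entry per regional unit.

```python
from dataclasses import dataclass

# Conversion factors to metres; the international foot is exactly 0.3048 m.
# Regional historical feet differ and are deliberately not listed here.
TO_METRES = {"feet": 0.3048, "metres": 1.0}


@dataclass
class Measurement:
    """A coded value with its tolerance and unit."""
    value: float
    tolerance: float
    unit: str

    def to_si(self) -> "Measurement":
        """Return the same measurement expressed in metres."""
        factor = TO_METRES[self.unit]
        return Measurement(self.value * factor, self.tolerance * factor, "metres")


# "The water level was 13 feet" with a tolerance of 1 foot.
water_level = Measurement(value=13.0, tolerance=1.0, unit="feet")
water_level_si = water_level.to_si()
```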
Example
Rename the LiPD file extension from zip to lpd.
Description | Tambora-Files | LiPD-Files | Remarks |
---|---|---|---|
Flood Example | Media:flood_tambora_csv.zip | Media:Exp0000.tambora.2017.zip | |
To-Do | To-Do | To-Do | |
To-Do | To-Do | To-Do | |
Media in category "Historical Documents Working Group"
The following 6 files are in this category, out of 6 total.