DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an ABECTO execution plan for comparing space travel data from Wikidata and DBpedia, together with the corresponding results.
The generated result data are derived from the compared knowledge graphs, which are licensed as follows:
http://dcat-ap.de/def/licenses/cc-by-sa
DBpedia is a joint project of Leipzig University, Freie Universität Berlin and OpenLink Software to extract structured information from Wikipedia and make it accessible as Linked Data for web applications. DBpedia also makes it possible to link this data with information from other web applications. The data sets are available under the GNU Free Documentation License and are linked to other free data collections.
http://www.opendefinition.org/licenses/cc-by-sa
DBpedia.org is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia and to link other datasets on the Web to Wikipedia data.
The English version of the DBpedia knowledge base currently describes 6.0M entities, of which 4.6M have abstracts, 1.53M have geo coordinates and 1.6M have depictions. In total, 5.2M resources are classified in a consistent ontology, consisting of 1.5M persons, 810K places (including 505K populated places), 490K works (including 135K music albums, 106K films and 20K video games), 275K organizations (including 67K companies and 53K educational institutions), 301K species and 5K diseases. The total number of resources in English DBpedia is 16.9M; besides the 6.0M resources, this includes 1.7M skos concepts (categories), 7.3M redirect pages, 260K disambiguation pages and 1.7M intermediate nodes.
Altogether the DBpedia 2016-04 release consists of 9.5 billion (2015-10: 8.8 billion) pieces of information (RDF triples) out of which 1.3 billion (2015-10: 1.1 billion) were extracted from the English edition of Wikipedia, 5.0 billion (2015-04: 4.4 billion) were extracted from other language editions and 3.2 billion (2015-10: 3.2 billion) from DBpedia Commons and Wikidata. In general, we observed a growth in mapping-based statements of about 2%.
Thorough statistics can be found on the DBpedia website, along with general information on the DBpedia datasets.
https://choosealicense.com/licenses/cc0-1.0/
DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in Wikipedia. This is an extract of the data (after cleaning, kernel included) that provides taxonomic, hierarchical categories ("classes") for 342,782 Wikipedia articles. There are 3 levels, with 9, 70 and 219 classes respectively. A version of this dataset is a popular baseline for NLP/text classification tasks. This version of the dataset is much tougher. See the full description on the dataset page: https://huggingface.co/datasets/DeveloperOats/DBPedia_Classes.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
DBpedia Portuguese is being constructed by a working group as part of the effort to internationalize DBpedia for the Lusosphere.
We are actively collaborating with the DBpedia Internationalization Team to include knowledge from the Portuguese Language Wikipedia into DBpedia. As a first step, we have performed a preliminary extraction (available at http://pt.dbpedia.org) and are editing several mappings and labels to the DBpedia Ontology. The activity is organized via the mailing list dbpedia-portuguese@lists.sourceforge.net
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tough Tables (2T) is a dataset designed to evaluate table annotation approaches in solving the CEA and CTA tasks. The dataset is compliant with the data format used in SemTab 2019, and it can be used as an additional dataset without any modification. The target knowledge graph is DBpedia 2016-10. Check out the 2T GitHub repository for more details about the dataset generation.
New in v2.0: We release the updated version of 2T_WD! The target knowledge graph is Wikidata (online instance) and the dataset complies with the SemTab 2021 data format.
This work is based on the following paper:
Cutrona, V., Bianchi, F., Jimenez-Ruiz, E. and Palmonari, M. (2020). Tough Tables: Carefully Evaluating Entity Linking for Tabular Data. ISWC 2020, LNCS 12507, pp. 1–16.
Note on License: This dataset includes data from the following sources. Refer to each source for license details:
- Wikipedia https://www.wikipedia.org/
- DBpedia https://dbpedia.org/
- Wikidata https://www.wikidata.org/
- SemTab 2019 https://doi.org/10.5281/zenodo.3518539
- GeoDatos https://www.geodatos.net
- The Pudding https://pudding.cool/
- Offices.net https://offices.net/
- DATA.GOV https://www.data.gov/
THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Changelog:
v2.0
New GT for 2T_WD
A few entities have been removed from the CEA GT because they are no longer represented in WD (e.g., dbr:Devonté points to wd:Q21155080, which does not exist)
Table codes and values differ from the previous version because of the random noise.
Updated ancestor/descendant hierarchies to evaluate CTA.
v1.0
New Wikidata version (2T_WD)
Fix header for tables CTRL_DBP_MUS_rock_bands_labels.csv and CTRL_DBP_MUS_rock_bands_labels_NOISE2.csv (column 2 was reported with id 1 in target - NOTE: the affected column has been removed from the SemTab2020 evaluation)
Remove duplicated entries in tables
Remove rows with wrong values (e.g., the Kazakhstan entity has an empty name "''")
Many rows and noised columns are shuffled/changed due to the random noise generator algorithm
Remove row "Florida","Floorida","New York, NY" from TOUGH_WEB_MISSP_1000_us_cities.csv (and all its NOISE1 variants)
Fix header of tables:
CTRL_WIKI_POL_List_of_current_monarchs_of_sovereign_states.csv
CTRL_WIKI_POL_List_of_current_monarchs_of_sovereign_states_NOISE2.csv
TOUGH_T2D_BUS_29414811_2_4773219892816395776_videogames_developers.csv
TOUGH_T2D_BUS_29414811_2_4773219892816395776_videogames_developers_NOISE2.csv
v0.1-pre
First submission. It contains only tables, without GT and Targets.
Data set of a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data. It is hoped that this work will make it easier for the huge amount of information in Wikipedia to be used in some new interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself. The project extracts knowledge from 111 different language editions of Wikipedia. The DBpedia project maps Wikipedia infoboxes from 27 different language editions to a single shared ontology consisting of 320 classes and 1,650 properties. The mappings are created via a world-wide crowd-sourcing effort and enable knowledge from the different Wikipedia editions to be combined. The project publishes regular releases of all DBpedia knowledge bases for download and provides SPARQL query access to 14 out of the 111 language editions via a global network of local DBpedia chapters. In addition to the regular releases, the project maintains a live knowledge base which is updated whenever a page in Wikipedia changes. DBpedia sets 27 million RDF links pointing into over 30 external data sources and thus enables data from these sources to be used together with DBpedia data. Several hundred data sets on the Web publish RDF links pointing to DBpedia themselves and thus make DBpedia one of the central interlinking hubs in the Linked Open Data (LOD) cloud.
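As a brief illustration of the SPARQL access mentioned above, the following sketch sends a query to the public DBpedia endpoint over HTTP; the endpoint URL is the well-known https://dbpedia.org/sparql, and the example query (the English abstract of dbr:Berlin) is chosen for illustration only.

```python
# Minimal sketch: query the public DBpedia SPARQL endpoint over HTTP.
# The example query fetches the English abstract of dbr:Berlin and is
# illustrative only; any SPARQL SELECT query can be substituted.
import requests

ENDPOINT = "https://dbpedia.org/sparql"
QUERY = """
SELECT ?abstract WHERE {
  <http://dbpedia.org/resource/Berlin> <http://dbpedia.org/ontology/abstract> ?abstract .
  FILTER (lang(?abstract) = "en")
}
LIMIT 1
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "application/sparql-results+json"},
    timeout=30,
)
response.raise_for_status()

for binding in response.json()["results"]["bindings"]:
    print(binding["abstract"]["value"])
```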
This ontology is generated from the manually created specifications in the DBpedia Mappings Wiki. Each release of this ontology corresponds to a new release of the DBpedia dataset, which contains instance data extracted from the different language versions of Wikipedia. For information regarding changes in this ontology, please refer to the DBpedia Mappings Wiki.
This dataset contains information about politicians from DBpedia, a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. Important information about people available on DBpedia includes name, gender, nationality, occupation, birth date, death date, profession and, for many politicians, the political party they belong to. This dataset is based on the English DBpedia dump from October 2015 and documents, at monthly resolution, the temporal evolution of the hyperlink network that articles about politicians formed on Wikipedia between 2001 and 2016. Wikipedia maintains revisions for each article to keep track of changes over time; the first revision of each month was used to construct the hyperlink network between articles about politicians.
http://www.opendefinition.org/licenses/cc-by-sa
DBpedia Japanese is a part of the DBpedia internationalization effort. Datasets of DBpedia Japanese are generated from dump data of the Japanese Wikipedia and include links to other DBpedia chapters and Wordnet-ja.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Resulting axioms and assertions from applying the Cat2Ax approach to the DBpedia knowledge graph.
The methodology is described in the conference publication "N. Heist, H. Paulheim: Uncovering the Semantics of Wikipedia Categories, International Semantic Web Conference, 2019".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Sets from the ISWC 2019 Semantic Web Challenge on Tabular Data to Knowledge Graph Matching.
For details, see: http://www.cs.ox.ac.uk/isg/challenges/sem-tab/
Note on License: This data includes data from the following sources. Refer to each source for license details:
- Wikipedia https://www.wikipedia.org/
- DBpedia http://dbpedia.org/
- T2Dv2 Gold Standard for Matching Web Tables to DBpedia http://webdatacommons.org/webtables/goldstandardV2.html
THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CaLiGraph is a large-scale semantic knowledge graph with a rich ontology that is compiled from the DBpedia ontology and Wikipedia categories and list pages. For more information, visit http://caligraph.org
Information about uploaded files (all files are bzip2-compressed and in N-Triples format):
caligraph-metadata.nt.bz2 Metadata about the dataset, described using the VoID vocabulary.
caligraph-ontology.nt.bz2 Class definitions, property definitions, restrictions, and labels of the CaLiGraph ontology.
caligraph-ontology_dbpedia-mapping.nt.bz2 Mapping of classes and properties to the DBpedia ontology.
caligraph-ontology_provenance.nt.bz2 Provenance information about classes (i.e. which Wikipedia category or list page has been used to create this class).
caligraph-instances_types.nt.bz2 Definition of instances and (non-transitive) types.
caligraph-instances_transitive-types.nt.bz2 Transitive types for instances (can also be induced by a reasoner).
caligraph-instances_labels.nt.bz2 Labels for instances.
caligraph-instances_relations.nt.bz2 Relations between instances derived from the class restrictions of the ontology (can also be induced by a reasoner).
caligraph-instances_dbpedia-mapping.nt.bz2 Mapping of instances to respective DBpedia instances.
caligraph-instances_provenance.nt.bz2 Provenance information about instances (e.g. if the instance has been extracted from a Wikipedia list page).
dbpedia_caligraph-instances.nt.bz2 Additional instances of CaLiGraph that are not in DBpedia. Note: this file is not part of CaLiGraph but should rather be used as an extension to DBpedia; the triples use the DBpedia namespace and can thus be used to extend DBpedia directly.
dbpedia_caligraph-types.nt.bz2 Additional types of CaLiGraph that are not in DBpedia. Note: this file is not part of CaLiGraph but should rather be used as an extension to DBpedia; the triples use the DBpedia namespace and can thus be used to extend DBpedia directly.
dbpedia_caligraph-relations.nt.bz2 Additional relations of CaLiGraph that are not in DBpedia. Note: this file is not part of CaLiGraph but should rather be used as an extension to DBpedia; the triples use the DBpedia namespace and can thus be used to extend DBpedia directly.
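One possible way to inspect these dumps in Python is sketched below; the choice of rdflib and the file name are assumptions for illustration, and for the largest dumps a streaming approach may be preferable.

```python
# Minimal sketch: parse one of the bzip2-compressed N-Triples dumps listed
# above with rdflib. Loading a full dump into memory may be impractical for
# the larger files; this is for illustration only.
import bz2
from rdflib import Graph

PATH = "caligraph-ontology_dbpedia-mapping.nt.bz2"  # any of the files listed above

graph = Graph()
with bz2.open(PATH, "rt", encoding="utf-8") as handle:
    graph.parse(data=handle.read(), format="nt")

print(f"Parsed {len(graph)} triples")

# Print a handful of triples to see the mapping structure.
for subject, predicate, obj in list(graph)[:5]:
    print(subject, predicate, obj)
```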
Changelog
v3.1.1
Fixed an encoding issue in caligraph-ontology.nt.bz2
v3.1.0
Fixed several issues related to ontology consistency and structure
v3.0.0
Added functionality to group mentions of unknown entities into distinct entities
v2.1.0
Fixed an error that led to a class inheriting from a disjoint class
Introduced owl:ObjectProperty and owl:DataProperty instead of rdf:Property
Several cosmetic fixes
v2.0.2
Fixed incorrect formatting of some properties
v2.0.1
Better entity extraction and representation
Small cosmetic fixes
v2.0.0
Entity extraction from arbitrary tables and enumerations in Wikipedia pages
v1.4.0
BERT-based recognition of subject entities and improved language models from spaCy 3.0
v1.3.1
Fixed minor encoding errors and improved formatting
v1.3.0
CaLiGraph is now based on a recent version of Wikipedia and DBpedia from November 2020
v1.1.0
Improved the CaLiGraph type hierarchy
Many small bugfixes and improvements
v1.0.9
Additional alternative labels for CaLiGraph instances
v1.0.8
Small cosmetic changes to URIs to be closer to DBpedia URIs
v1.0.7
Mappings from CaLiGraph classes to DBpedia classes are now realised via rdfs:subClassOf instead of owl:equivalentClass
Entities are now URL-encoded to improve accessibility
v1.0.6
Fixed a bug in the ontology creation step that led to a substantially lower number of sub-type relationships than actually exist. The new version provides a richer type hierarchy, which also leads to an increased number of types for resources.
v1.0.5
Fixed a bug that has declared CaLiGraph predicates as subclasses of owl:Predicate instead of being of the type owl:Predicate.
This dataset contains information about politicians from DBpedia, a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. Important information about people available on DBpedia includes name, gender, nationality, occupation, birth date, death date, profession and, for many politicians, the political party they belong to. This dataset is based on the English DBpedia dump from October 2015 and documents, at monthly resolution, the temporal evolution of the hyperlink network that articles about politicians formed on Wikipedia between 2001 and 2016. Wikipedia maintains revisions for each article to keep track of changes over time; the first revision of each month was used to construct the hyperlink network between articles about politicians. Population: all Wikipedia articles that relate to an instance of the class Person in DBpedia (English language dump, October 2015) and were identified as politicians (full population).
http://www.opendefinition.org/licenses/cc-by-sa
Basque chapter of DBpedia. DBpedia is a "community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. We hope this will make it easier for the amazing amount of information in Wikipedia to be used in new and interesting ways, and that it might inspire new mechanisms for navigating, linking and improving the encyclopedia itself."
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
QBLink-KG is a modified version of QBLink, which is a high-quality benchmark for evaluating conversational understanding of Wikipedia content. QBLink consists of sequences of up to three hand-crafted queries, with responses being single named entities that match the titles of Wikipedia articles.
For QBLink-KG, the English subset of the DBpedia snapshot from September 2021 was used as the target knowledge graph. QBLink answers provided as the titles of Wikipedia infoboxes can easily be mapped to DBpedia entity URIs (if the corresponding entities are present in DBpedia), since DBpedia is constructed by extracting information from Wikipedia infoboxes.
QBLink, in its original format, is not directly applicable to Conversational Entity Retrieval from a Knowledge Graph (CER-KG) because knowledge graphs contain considerably less information than Wikipedia. A named entity serving as an answer to a QBLink query may not be present as an entity in DBpedia. To modify QBLink for CER over DBpedia, we implemented two filtering steps: 1) we removed all queries for which the wiki_page field is empty, or whose answer cannot be mapped to a DBpedia entity or does not match a Wikipedia page; 2) for the evaluation of a model with specific techniques for entity linking and candidate selection, we excluded queries with answers that do not belong to the set of candidate entities derived using that model.
The original QBLink dataset files before filtering are:
- QBLink-train.json
- QBLink-dev.json
- QBLink-test.json
The final QBLink-KG files after filtering are:
- QBLink-Filtered-train.json
- QBLink-Filtered-dev.json
- QBLink-Filtered-test.json
We used the references below to construct QBLink-KG:
- Ahmed Elgohary, Chen Zhao, and Jordan Boyd-Graber. 2018. A dataset and baselines for sequential open-domain question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1077–1083, Brussels, Belgium. Association for Computational Linguistics.
- https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2021-09
- Lehmann, Jens, et al. "DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia". 1 Jan. 2015: 167–195.
For more details about QBLink-KG, please read our research paper:
Zamiri, Mona, et al. "Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph", The Web Conference 2024.
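To illustrate the first filtering step described above, the sketch below maps a Wikipedia title to its DBpedia resource URI using the standard DBpedia naming convention and drops queries whose wiki_page is empty or whose answer is not in a given set of DBpedia entities. Only the wiki_page field is documented in this record; the assumption of one JSON array per file and the helper names are illustrative, not the authors' actual code.

```python
# Hypothetical sketch of the first QBLink-KG filtering step described above.
# Only the "wiki_page" field is documented in this record; the file layout
# (one JSON array per file) and the helper names are assumptions.
import json
from urllib.parse import quote

def wikipedia_title_to_dbpedia_uri(title: str) -> str:
    """Map a Wikipedia page title to the corresponding DBpedia resource URI."""
    # DBpedia resource identifiers replace spaces with underscores.
    return "http://dbpedia.org/resource/" + quote(title.replace(" ", "_"))

def keep_query(record: dict, dbpedia_entities: set) -> bool:
    """Keep a query only if wiki_page is non-empty and maps to a known DBpedia entity."""
    wiki_page = record.get("wiki_page") or ""
    if not wiki_page:
        return False
    return wikipedia_title_to_dbpedia_uri(wiki_page) in dbpedia_entities

# Usage sketch (paths and the entity set are placeholders):
# with open("QBLink-train.json", encoding="utf-8") as f:
#     records = json.load(f)
# filtered = [r for r in records if keep_query(r, known_dbpedia_entity_uris)]
```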
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This benchmark consists of 255 randomly selected disease descriptions, as of February 2024. Each disease description was labeled by two data annotators who reviewed each other's annotations to ensure accuracy and consistency across the dataset.
This procedure involves collecting, parsing and extracting data from Wikipedia using a software routine that interfaces with an API (https://pypi.org/project/Wikipedia-API/) to systematically retrieve and collate information related to a predefined disease. Specifically, it searches for pages about a given disease and, within those pages, extracts the "Signs and Symptoms" section.
This process has two steps: first, the text is extracted from Wikipedia; then the phenotypical entities are annotated.
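A possible sketch of the extraction step, using the Wikipedia-API package linked above, is shown below; the user-agent string, the example page title, and the case-insensitive section matching are assumptions rather than the authors' exact routine.

```python
# Illustrative sketch of the extraction step using the Wikipedia-API package
# referenced above (pip install wikipedia-api). The user-agent string and the
# example disease page are placeholders; section titles are matched
# case-insensitively because capitalisation varies across articles.
import wikipediaapi

wiki = wikipediaapi.Wikipedia(user_agent="disease-benchmark-example", language="en")

def signs_and_symptoms(disease: str):
    """Return the 'Signs and symptoms' section text for a disease page, if present."""
    page = wiki.page(disease)
    if not page.exists():
        return None
    for section in page.sections:
        if section.title.lower() == "signs and symptoms":
            return section.text
    return None

print(signs_and_symptoms("Influenza"))
```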
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Semantic Annotation for Tabular Data with DBpedia: Adapted SemTab 2019 with DBpedia 2016-10
CEA:
Keep only valid entities in DBpedia 2016-10
Resolve percentage encoding (a decoding sketch follows this list)
Add missing redirect entities
CTA:
Keep only valid types
Resolve transitive types (parents and equivalent types of the specific type) with DBpedia ontology 2016-10
CPA:
Add equivalent properties
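To make the percent-encoding step in the CEA list concrete, the sketch below decodes a percent-encoded DBpedia entity URI; the example URI is illustrative.

```python
# Minimal sketch of resolving percent encoding in DBpedia entity URIs,
# as mentioned in the CEA adaptation steps above; the example URI is illustrative.
from urllib.parse import unquote

encoded = "http://dbpedia.org/resource/S%C3%A3o_Paulo"
decoded = unquote(encoded)
print(decoded)  # -> http://dbpedia.org/resource/São_Paulo
```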
Statistics of the adapted SemTab 2019 tabular data:

| Round | CEA Original | CEA Adapted | CEA Change | CPA Original | CPA Adapted | CPA Change | CTA Original | CTA Adapted | CTA Change |
|---|---|---|---|---|---|---|---|---|---|
| Round 1 | 8418 | 8406 | -0.14% | 116 | 116 | 0.00% | 120 | 120 | 0.00% |
| Round 2 | 463796 | 457567 | -1.34% | 6762 | 6762 | 0.00% | 14780 | 14333 | -3.02% |
| Round 3 | 406827 | 406820 | 0.00% | 7575 | 7575 | 0.00% | 5762 | 5673 | -1.54% |
| Round 4 | 107352 | 107351 | 0.00% | 2747 | 2747 | 0.00% | 1732 | 1717 | -0.87% |
DBpedia 2016-10 extra resources: (Original dataset http://downloads.dbpedia.org/2016-10/)
File: _dbpedia_classes_2016-10.csv
Information: DBpedia classes and their parents (the abstract types Agent and Thing are removed)
Total: 759 classes
Structure: class, parents (separated by spaces)
Example: "City","Location Place PopulatedPlace Settlement"
File: _dbpedia_properties_2016-10.csv
Information: DBpedia properties and their equivalents
Total: 2865 properties
Structure: property, its equivalent properties
Example: "restingDate","deathDate"
File: _dbpedia_domains_2016-10.csv
Information: DBpedia properties and their domain types
Total: 2421 properties (have types as their domain)
Structure: property, type (domain)
Example: "deathDate","Person"
File: _dbpedia_entities_2016-10.jsonl.bz2
Information: DBpedia entity dump
Format: jsonl.bz2 (bz2-compressed JSON Lines)
Source: DBpedia dump 2016-10 core
Total: 5,289,577 entities (No disambiguation entities)
Structure:
An entity, for example "Tokyo" (datatype: dictionary):
{
'wd': 'Q1322032', (Wikidata ID, datatype: string)
'wp': 'Tokyo', (Wikipedia ID, add prefix https://en.wikipedia.org/wiki/ + wp to get the Wikipedia URL, datatype: string)
'dp': 'Tokyo', (DBpedia ID, add prefix http://dbpedia.org/resource/ + dp to get the DBpedia URL, datatype: string)
'label': 'Tokyo', (Entity label, datatype: string)
'aliases': ['To-kyo', 'Tōkyō Prefecture', ...], (Other entity names, datatype: list)
'aliases_multilingual': ['東京', 'طوكيو', ...], (Other entity names in other languages, datatype: list)
'types_specific': 'City', (Entity direct type, datatype: string)
'types_transitive': ['Human settlement', 'City', 'PopulatedPlace', 'Location', 'Place', 'Settlement'], (Entity transitive types, datatype: list)
'claims_entity': { (entity statements, datatype: dictionary. Keys: properties, Values: list of tail entities)
'governingBody': ['Tokyo Metropolitan Government'],
'subdivision': ['Honshu', 'Kantō region'],
...
},
'claims_literal': {
'string': { (String literals, datatype: dictionary. Keys: properties, Values: list of values)
'postalCode': ['JP-13'],
'utcOffset': ['+09:00', '+9'],
...
},
'time': { (Time literals, datatype: dictionary. Keys: properties, Values: list of date times)
'populationAsOf': ['2016-07-31'],
...
},
'quantity': { (Numerical literals, datatype: dictionary. Keys: properties, Values: list of values)
'populationDensity': [6224.66, 6349.0],
'maximumElevation': [2017],
...
}
},
'pagerank': 2.2167366040153352e-06 (Entity PageRank score calculated on the DBpedia graph)
}
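The entity dump can be streamed directly from the compressed file, as sketched below; the sketch assumes one JSON object per line, in line with the .jsonl extension.

```python
# Illustrative sketch for streaming the bz2-compressed entity dump described
# above. Assumes one JSON object per line (as the .jsonl extension suggests);
# if the file were a single JSON array, json.load would be needed instead.
import bz2
import json

PATH = "_dbpedia_entities_2016-10.jsonl.bz2"

with bz2.open(PATH, "rt", encoding="utf-8") as handle:
    for line in handle:
        line = line.strip()
        if not line:
            continue
        entity = json.loads(line)
        # Example use: print the label and direct type of each entity.
        print(entity.get("label"), entity.get("types_specific"))
```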
THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
'PiOCHE' stands for 'Persons in Objectified Complex Historical Events'
The following people participated in this project:
| Name | Works for | Involved in | WebID |
|---|---|---|---|
| Ansgar Scherp | Data Sciences Univ. Ulm | Applying Event-Model F | https://orcid.org/0000-0002-2653-9245 |
| Gerard Kuys | DBpedia Association | Time-sensitive RDF data on historical subjects | https://isni.org/isni/0000000389538183 |
| Gerald Wildenbeest | DBpedia Association | DBpedia Databus / data.pldn.nl | https://orcid.org/0000-0001-8026-8506 |
For about two years the persons named above have been working on a project, in which we developed a Linked Open Data model fit to express the essentials of historical events as described by some source. The reason we did this was that we felt that the efforts of cultural heritage institutions were directed almost exclusively towards the disclosure and accessibility of their collections, but much less so towards telling the story being conveyed by any document or other piece of material evidence of the past.
While, of course, there should always be a link between a digital statement about a situation in times past and the materials on which such a statement would be based, it would be the world upside down if the focus were to lie exclusively on the materials and not on the interpretations they may support.
Representing historical events as open data published on the internet is an endeavour that cannot be founded on a single effort; it will take time. However, we feel that this is not a reason not to embark on such a project. Instead of developing theoretical models of what historical events exactly are and how they should universally be modelled, we approached the problem from the other side: transposing historical events as presented in a historical work into Linked Open Data, and seeing what kinds of complexities we would encounter. In doing so, we adopted an event model best suited to our requirements and tried to use it to express the various layers of any event. As our principal source we selected the 19th-century Dutch Geographical Dictionary (13 volumes, 1839-1851) by A.J. van der Aa (https://en.wikipedia.org/wiki/Abraham_Jacob_van_der_Aa). This work abundantly supplied us with cases in which we had to deal with temporal perspectives (did this 19th-century author, when talking about a particular place, mean the same things as we do now?), conflicting interpretations and so on.
We named our project PiOCHE: Persons in Objectified Complex Historical Events. As an acronym, it stands for chopping our way through unruly materials that are about, or salvaged from, history.
Our project has been documented in the following ways:
DOI dataset: https://doi.org/10.5281/zenodo.6542765 , reflecting the state of the dataset at the submission time of the Project description
Project description: http://doi.org/10.5334/johd.84 - Article: Kuys, G., & Scherp, A. (2022). Representing Persons and Objects in Complex Historical Events using the Event Model F. Journal of Open Humanities Data, 8: 22, pp. 1–12 (Sept. 2, 2022)
A document in the making, reflecting on the results and unresolved issues of the PiOCHE project
DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets.