Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DBpedia graph embeddings using RDF2Vec. RDF2Vec embedding generation code can be found here and is based on a publication by Portisch et al. [1].
The embeddings dataset consists of 200-dimensional vectors of DBpedia entities (from 1/9/2021).
A figure of cosine similarities between a selected set of DBpedia entities is provided in the dataset here.
Generating Embeddings
The code for generating these embeddings can be found here.
Run the run.sh script, which wraps all the commands necessary to generate the embeddings:
bash run.sh
The script downloads a set of DBpedia files, which are listed in dbpedia_files.txt. It then builds a Docker image and runs a container of that image, which generates the embeddings for the DBpedia graph defined by those files.
A folder files is created containing all the downloaded DBpedia files, and a folder embeddings/dbpedia is created containing the embeddings in vectors.txt along with a set of random walk files.
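As a small usage sketch, the resulting vectors.txt can be loaded and compared by cosine similarity in Python. This assumes each line holds an entity identifier followed by whitespace-separated vector components; the entity names used at the end are placeholders, so substitute whatever identifiers your vectors.txt actually contains.

import numpy as np

def load_vectors(path):
    # Parse lines of the form "<entity> c1 c2 ... c200" into a dict of numpy arrays.
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) > 1:
                vectors[parts[0]] = np.array(parts[1:], dtype=float)
    return vectors

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

vecs = load_vectors("embeddings/dbpedia/vectors.txt")
# Placeholder entity identifiers for illustration only.
print(cosine(vecs["dbr:Berlin"], vecs["dbr:Germany"]))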
Run Time of Embeddings Generation
Generating the embeddings can take more than a day, depending on the number of DBpedia files chosen for download. Below are some basic run-time statistics for embeddings generated on a machine with 64 GB of RAM, 8 AMD EPYC cores (roughly 2 GHz), and a 1 TB SSD.
Parameters Used
Listed below are the parameters used to generate the embeddings provided here:
DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset provides embeddings for entities and relations in DBpedia (English) and Wikidata. The two knowledge graphs are first merged using a novel approach that we developed, leveraging the sameAs links between them. Then, we used the state-of-the-art embedding model ConEx to compute embeddings of the merged graph. We call these embeddings universal knowledge graph embeddings.
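The merge is only described at a high level here; purely as an illustration of the general idea (not the authors' actual implementation), sameAs-linked Wikidata URIs can be rewritten to their DBpedia counterparts before the combined triples are handed to an embedding model such as ConEx:

# Minimal sketch of collapsing sameAs-linked entities before embedding.
# The sameAs mapping and triples are illustrative placeholders.
same_as = {
    "http://www.wikidata.org/entity/Q64": "http://dbpedia.org/resource/Berlin",
}

def canonical(uri):
    # Use the DBpedia URI whenever a sameAs link to it exists.
    return same_as.get(uri, uri)

def merge(dbpedia_triples, wikidata_triples):
    merged = set(dbpedia_triples)
    for s, p, o in wikidata_triples:
        merged.add((canonical(s), p, canonical(o)))
    return merged  # one graph over which a knowledge graph embedding model can then be trained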
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A benchmark for Spatial Knowledge Graph Completion (SKGC) extracted from DBpedia.
It can be used to evaluate Spatial Knowledge Graph Embedding or Completion methods.
The S-DBpedia benchmark provides two groups of dataset variants.
Data scale: S-DBpedia_small, S-DBpedia_medium, S-DBpedia_large, S-DBpedia.
Data sparsity: S-DBpedia_GT5E, S-DBpedia_GT10E, S-DBpedia_GT20E, S-DBpedia_GT50E.
We extracted all attributes of the entities in the dataset from DBpedia; they cover text, numerical, and image information. The data here includes the dataset files (under the dataset name) and all entity attribute files (Attribute.tar.gz).
For the dataset's construction code, evaluation, and detailed usage instructions, please see https://github.com/NEU-IDKE/S-DBpedia.
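The exact file layout is documented in the repository linked above; as a rough sketch only, assuming the tab-separated head/relation/tail split files common to knowledge graph completion benchmarks, a split could be read like this:

def read_triples(path):
    # One triple per line, assumed to be tab-separated: head, relation, tail.
    with open(path, encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split("\t")) for line in f if line.strip()]

# Hypothetical path; check the S-DBpedia repository for the actual file names.
train = read_triples("S-DBpedia_small/train.txt")
print(len(train), "training triples")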
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
The core version of DBpedia has too many entities and statements to train recommendation models in a reasonable time frame, which is why we created two subsets (DB1M and DBA240) of the core version of DBpedia from September 2022.
File structure
Each dataset is located in its own folder with the following files:
A new benchmark dataset for simple question answering over knowledge graphs that was created by mapping SimpleQuestions entities and predicates from Freebase to DBpedia.
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
DBpedia Classes
DBpedia is a knowledge graph extracted from Wikipedia, providing structured data about real-world entities and their relationships. DBpedia Classes are the core building blocks of this knowledge graph, representing different categories or types of entities.
Key Concepts:
Entity: A real-world object, such as a person, place, thing, or concept.
Class: A group of entities that share common properties or characteristics.
Instance: A specific member of a class.
Examples of DBpedia Classes:
Person: Represents individuals, e.g., "Barack Obama," "Albert Einstein."
Place: Represents locations, e.g., "Paris," "Mount Everest."
Organization: Represents groups, institutions, or companies, e.g., "Google," "United Nations."
Event: Represents occurrences, e.g., "World Cup," "French Revolution."
Artwork: Represents creative works, e.g., "Mona Lisa," "Star Wars."
Hierarchy and Relationships:
DBpedia classes often have a hierarchical structure, where subclasses inherit properties from their parent classes. For example, the class "Person" might have subclasses like "Politician," "Scientist," and "Artist."
Relationships between classes are also important. For instance, a "Person" might have a "birthPlace" relationship with a "Place," or an "Artist" might have a "hasArtwork" relationship with an "Artwork."
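To make classes, instances, and relationships concrete, here is a small sketch using the SPARQLWrapper library against the public DBpedia endpoint (endpoint availability may vary); it retrieves a few instances of the class Person together with the Place each was born in:

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
# Persons (instances of the class dbo:Person) and their birthPlace relationship.
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?person ?place WHERE {
        ?person a dbo:Person ;
                dbo:birthPlace ?place .
    } LIMIT 5
""")
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["person"]["value"], "->", row["place"]["value"])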
Applications of DBpedia Classes:
Semantic Search: DBpedia classes can be used to enhance search results by understanding the context and meaning of queries.
Knowledge Graph Construction: DBpedia classes form the foundation of knowledge graphs, which can be used for applications such as question answering, recommendation systems, and data integration.
Data Analysis: DBpedia classes can be used to analyze and extract insights from large datasets.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains the vectors from computing RDF2vec embeddings from an inverse predicate frequency weighted DBpedia 2016-04 graph.
For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector.
The parameter settings for the embedding are as specified in the paper:
Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Biased graph walks for RDF graph embeddings. In Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics (WIMS '17). ACM, New York, NY, USA, Article 21, 12 pages. DOI: https://doi.org/10.1145/3102254.3102279
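As a hedged usage sketch (the archive and member names below are placeholders), the per-entity vectors can be read directly from the zip archive without unpacking it first:

import io
import zipfile
import numpy as np

def load_from_zip(zip_path):
    vectors = {}
    with zipfile.ZipFile(zip_path) as zf:
        member = zf.namelist()[0]  # assumes the archive holds a single text file
        with io.TextIOWrapper(zf.open(member), encoding="utf-8") as f:
            for line in f:
                parts = line.rstrip().split()
                if len(parts) > 1:
                    vectors[parts[0]] = np.array(parts[1:], dtype=float)
    return vectors

vecs = load_from_zip("rdf2vec_dbpedia_2016-04.zip")  # placeholder file name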
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains the vectors from computing RDF2vec embeddings from a Page Rank weighted DBpedia 2016-04 graph.
For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector.
The parameter settings for the embedding are as specified in the paper:
Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Biased graph walks for RDF graph embeddings. In Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics (WIMS '17). ACM, New York, NY, USA, Article 21, 12 pages. DOI: https://doi.org/10.1145/3102254.3102279
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CaLiGraph is a large-scale semantic knowledge graph with a rich ontology compiled from the DBpedia ontology and from Wikipedia categories and list pages. For more information, visit http://caligraph.org.
Information about the uploaded files (all files are bzip2-compressed and in N-Triples format):
caligraph-metadata.nt.bz2 Metadata about the dataset which is described using void vocabulary.
caligraph-ontology.nt.bz2 Class definitions, property definitions, restrictions, and labels of the CaLiGraph ontology.
caligraph-ontology_dbpedia-mapping.nt.bz2 Mapping of classes and properties to the DBpedia ontology.
caligraph-ontology_provenance.nt.bz2 Provenance information about classes (i.e. which Wikipedia category or list page has been used to create this class).
caligraph-instances_types.nt.bz2 Definition of instances and (non-transitive) types.
caligraph-instances_transitive-types.nt.bz2 Transitive types for instances (can also be induced by a reasoner).
caligraph-instances_labels.nt.bz2 Labels for instances.
caligraph-instances_relations.nt.bz2 Relations between instances derived from the class restrictions of the ontology (can also be induced by a reasoner).
caligraph-instances_dbpedia-mapping.nt.bz2 Mapping of instances to respective DBpedia instances.
caligraph-instances_provenance.nt.bz2 Provenance information about instances (e.g. if the instance has been extracted from a Wikipedia list page).
dbpedia_caligraph-instances.nt.bz2 Additional instances of CaLiGraph that are not in DBpedia. Note: this file is not part of CaLiGraph but should rather be used as an extension to DBpedia; the triples use the DBpedia namespace and can thus directly extend DBpedia.
dbpedia_caligraph-types.nt.bz2 Additional types of CaLiGraph that are not in DBpedia. Note: this file is not part of CaLiGraph but should rather be used as an extension to DBpedia; the triples use the DBpedia namespace and can thus directly extend DBpedia.
dbpedia_caligraph-relations.nt.bz2 Additional relations of CaLiGraph that are not in DBpedia. Note: this file is not part of CaLiGraph but should rather be used as an extension to DBpedia; the triples use the DBpedia namespace and can thus directly extend DBpedia.
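Because every file is bzip2-compressed N-Triples, it can be inspected as a stream without decompressing it to disk; a minimal Python sketch, using one of the file names listed above:

import bz2

# Stream the instance labels file; each line is one N-Triples statement.
with bz2.open("caligraph-instances_labels.nt.bz2", mode="rt", encoding="utf-8") as f:
    for i, line in enumerate(f):
        print(line.rstrip())
        if i == 4:  # show only the first five triples
            break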
Changelog
v3.1.1
Fixed an encoding issue in caligraph-ontology.nt.bz2
v3.1.0
Fixed several issues related to ontology consistency and structure
v3.0.0
Added functionality to group mentions of unknown entities into distinct entities
v2.1.0
Fixed an error that led to a class inheriting from a disjoint class
Introduced owl:ObjectProperty and owl:DataProperty instead of rdf:Property
Several cosmetic fixes
v2.0.2
Fixed incorrect formatting of some properties
v2.0.1
Better entity extraction and representation
Small cosmetic fixes
v2.0.0
Entity extraction from arbitrary tables and enumerations in Wikipedia pages
v1.4.0
BERT-based recognition of subject entities and improved language models from spaCy 3.0
v1.3.1
Fixed minor encoding errors and improved formatting
v1.3.0
CaLiGraph is now based on a recent version of Wikipedia and DBpedia from November 2020
v1.1.0
Improved the CaLiGraph type hierarchy
Many small bugfixes and improvements
v1.0.9
Additional alternative labels for CaLiGraph instances
v1.0.8
Small cosmetic changes to URIs to be closer to DBpedia URIs
v1.0.7
Mappings from CaLiGraph classes to DBpedia classes are now realised via rdfs:subClassOf instead of owl:equivalentClass
Entities are now URL-encoded to improve accessibility
v1.0.6
Fixed a bug in the ontology creation step that led to a substantially lower number of sub-type relationships than actually exist. The new version provides a richer type hierarchy, which also leads to an increased number of types for resources.
v1.0.5
Fixed a bug that declared CaLiGraph predicates as subclasses of owl:Predicate instead of instances of the type owl:Predicate.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets that can be used together with the code at https://github.com/JanKalo/RuleAlign.
This dataset contains the vectors from computing RDF2vec embeddings from an inverse Page Rank frequency weighted DBpedia 2016-04 graph.
For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector.
The parameter settings for the embedding are as specified in the paper:
Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Biased graph walks for RDF graph embeddings. In Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics (WIMS '17). ACM, New York, NY, USA, Article 21, 12 pages. DOI: https://doi.org/10.1145/3102254.3102279
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
QBLink-KG is a modified version of QBLink, a high-quality benchmark for evaluating conversational understanding of Wikipedia content. QBLink consists of sequences of up to three hand-crafted queries, with responses being single named entities that match the titles of Wikipedia articles.
For QBLink-KG, the English subset of the DBpedia snapshot from September 2021 was used as the target knowledge graph. QBLink answers, provided as the titles of Wikipedia infoboxes, can easily be mapped to DBpedia entity URIs (if the corresponding entities are present in DBpedia), since DBpedia is constructed by extracting information from Wikipedia infoboxes.
QBLink, in its original format, is not directly applicable to Conversational Entity Retrieval from a Knowledge Graph (CER-KG), because knowledge graphs contain considerably less information than Wikipedia, and a named entity serving as the answer to a QBLink query may not be present as an entity in DBpedia. To adapt QBLink for CER over DBpedia, we applied two filtering steps: 1) we removed all queries for which the wiki_page field is empty, or whose answer cannot be mapped to a DBpedia entity or does not match a Wikipedia page; 2) for the evaluation of a model with specific techniques for entity linking and candidate selection, we excluded queries with answers that do not belong to the set of candidate entities derived using that model.
The original QBLink dataset files before filtering are:
QBLink-train.json
QBLink-dev.json
QBLink-test.json
The final QBLink-KG files after filtering are:
QBLink-Filtered-train.json
QBLink-Filtered-dev.json
QBLink-Filtered-test.json
We used the following references to construct QBLink-KG:
Ahmed Elgohary, Chen Zhao, and Jordan Boyd-Graber. 2018. A dataset and baselines for sequential open-domain question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1077–1083, Brussels, Belgium. Association for Computational Linguistics.
https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2021-09
Lehmann, Jens, et al. 2015. "DBpedia – A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia": 167–195.
For more details about QBLink-KG, please read our research paper:
Zamiri, Mona, et al. "Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph." The Web Conference 2024.
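As an illustration of the first filtering step (a sketch only; the exact JSON schema should be checked against the released files, and the assumption here is that each file holds a list of query records carrying the wiki_page field mentioned above):

import json

with open("QBLink-train.json", encoding="utf-8") as f:
    records = json.load(f)  # assumed to be a list of query records

# Drop records whose wiki_page field is empty, mirroring filtering step 1 above.
kept = [r for r in records if r.get("wiki_page")]
print(len(records), "records before filtering,", len(kept), "after")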
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains the vectors from computing RDF2vec embeddings from an inverse page rank split weighted DBpedia 2016-04 graph.
For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector.
The parameter settings for the embedding are as specified in the paper:
Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Biased graph walks for RDF graph embeddings. In Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics (WIMS '17). ACM, New York, NY, USA, Article 21, 12 pages. DOI: https://doi.org/10.1145/3102254.3102279
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository includes all code and data for causal inference over knowledge graphs. It includes experiments on four datasets: the synthetic review dataset, the open review dataset, the subset of DBpedia related to writers, and MIMIC-III (for which we do not provide the data due to confidentiality constraints).
This dataset contains the vectors from computing KGloVe embeddings from a Page Rank split frequency weighted DBpedia 2016-04 graph. For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector. The parameter settings for the embedding are as specified in the paper: Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Global RDF Vector Space Embeddings. In The Semantic Web – ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21–25, 2017, Proceedings, Part I.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
QALD-9-Plus is a dataset for Knowledge Graph Question Answering (KGQA) based on the well-known QALD-9. QALD-9-Plus enables training and testing of KGQA systems over DBpedia and Wikidata using questions in 8 different languages. Some of the questions have several alternative formulations in particular languages, which makes it possible to evaluate the robustness of KGQA systems and to train paraphrasing models. As the questions' translations were provided by native speakers, they are considered a "gold standard"; therefore, machine translation tools can also be trained and evaluated on the dataset. Please see also the GitHub repository: https://github.com/Perevalov/qald_9_plus
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains the vectors from computing KGloVe embeddings from a uniformly weighted DBpedia 2016-04 graph.
For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector.
The parameter settings for the embedding are as specified in the paper:
Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Global RDF Vector Space Embeddings. In The Semantic Web – ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21–25, 2017, Proceedings, Part I.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains the vectors from computing RDF2vec embeddings from a uniformly weighted DBpedia 2016-04 graph.
For each entity in the graph, the text file in the zip archive contains a line with the entity name and the embedded vector.
The parameter settings for the embedding are as specified in the paper:
Michael Cochez, Petar Ristoski, Simone Paolo Ponzetto, and Heiko Paulheim. 2017. Biased graph walks for RDF graph embeddings. In Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics (WIMS '17). ACM, New York, NY, USA, Article 21, 12 pages. DOI: https://doi.org/10.1145/3102254.3102279
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Large knowledge graphs like DBpedia and YAGO are always based on the same source, namely Wikipedia. But there are many more wikis that contain information about long-tail entities, for example those on wiki hosting platforms like Fandom. In this paper, we present the approach and analysis of DBkWik++, a fused knowledge graph built from thousands of wikis. A modified version of the DBpedia framework is applied to each wiki, which results in many isolated knowledge graphs. With an incremental merge-based approach, we reuse one-to-one matching systems to solve the multi-source KG matching task. Based on this alignment, we create a consolidated knowledge graph with more than 15 million instances.