100+ datasets found
  1. wikidata-20220103-all.json.gz

    • academictorrents.com
    bittorrent
    Updated Jan 24, 2022
    Cite
    wikidata.org (2022). wikidata-20220103-all.json.gz [Dataset]. https://academictorrents.com/details/229cfeb2331ad43d4706efd435f6d78f40a3c438
    Explore at:
    Available download formats: bittorrent (109,042,925,619 bytes)
    Dataset updated
    Jan 24, 2022
    Dataset provided by
    Wikidata (https://wikidata.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A BitTorrent file to download data with the title 'wikidata-20220103-all.json.gz'

  2. wikidata-parallel-descriptions-en-ja

    • huggingface.co
    Updated May 20, 2024
    Cite
    elanmitsua (2024). wikidata-parallel-descriptions-en-ja [Dataset]. https://huggingface.co/datasets/Mitsua/wikidata-parallel-descriptions-en-ja
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 20, 2024
    Authors
    elanmitsua
    License

    CC0 1.0: https://choosealicense.com/licenses/cc0-1.0/

    Description

    Wikidata parallel descriptions en-ja

    Parallel corpus for machine translation generated from wikidata dump (2024-05-06). Currently we processed only English/Japanese pair. The jsonl file is ready-to-train by Hugging Face transformers trainer for translation tasks.
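    The ready-to-train jsonl file can be consumed line by line. A minimal sketch, assuming the standard Hugging Face "translation" record layout, which is not confirmed by the dataset card; the sample line is invented for illustration:

```python
import json

# Hypothetical jsonl line; the field layout ("translation" wrapping
# language-keyed strings) is an assumption, not taken from the dataset page.
line = '{"translation": {"en": "Mount Fuji", "ja": "\u5bcc\u58eb\u5c71"}}'

record = json.loads(line)
pair = record["translation"]
print(pair["en"], "->", pair["ja"])
```

    Records in this shape can be fed directly to a translation-task data collator after tokenization.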

      Dataset Details
    

    https://www.wikidata.org/wiki/Wikidata:Database_download

      Dataset Creation
    

    As Wikidata description field does not represent exact direct translation, filtering is required for… See the full description on the dataset page: https://huggingface.co/datasets/Mitsua/wikidata-parallel-descriptions-en-ja.

  3. Wikidata PageRank

    • danker.s3.amazonaws.com
    Updated Oct 16, 2025
    Cite
    Andreas Thalhammer (2025). Wikidata PageRank [Dataset]. https://danker.s3.amazonaws.com/index.html
    Explore at:
    Available download formats: tsv, application/n-triples, application/vnd.hdt, ttl
    Dataset updated
    Oct 16, 2025
    Authors
    Andreas Thalhammer
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Regularly published dataset of PageRank scores for Wikidata entities. The underlying link graph is formed by the union of all links across all Wikipedia language editions. Computation is performed by Andreas Thalhammer with 'danker', available at https://github.com/athalhammer/danker. If you find the downloads here useful, please feel free to leave a GitHub ⭐ at the repository and buy me a ☕: https://www.buymeacoffee.com/thalhamm
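    As an illustration of the PageRank computation described above, here is a minimal, self-contained sketch on a toy link graph. This is not the 'danker' implementation; the damping factor, iteration count, and QIDs are illustrative assumptions:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank for a dict mapping each node
    to the list of nodes it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * ranks[source] / len(targets)
                for target in targets:
                    new[target] += share
            else:  # dangling node: spread its rank evenly
                for node in nodes:
                    new[node] += damping * ranks[source] / n
        ranks = new
    return ranks

# Toy "union of sitelinks" graph: Q64 receives links from both other nodes.
graph = {"Q64": ["Q183"], "Q183": ["Q64"], "Q1055": ["Q64", "Q183"]}
scores = pagerank(graph)
```

    Nodes with more incoming links (here Q64) end up with higher scores, which is the intuition behind ranking Wikidata entities this way.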

  4. Wikidata dump from 2018-12-17 in JSON

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Jan 15, 2021
    Cite
    Jakub Klímek; Petr Škoda (2021). Wikidata dump from 2018-12-17 in JSON [Dataset]. http://doi.org/10.5281/zenodo.4436356
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jan 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jakub Klímek; Petr Škoda
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dump of Wikidata from 2018-12-17 in JSON. This dump is no longer available from Wikidata. It was originally downloaded from https://dumps.wikimedia.org/other/wikidata/20181217.json.gz and recompressed to fit on Zenodo.
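    Dumps like this are far too large to load in one piece. A hedged sketch of streaming a gzipped Wikidata-style JSON dump (a single JSON array with one entity object per line, which is how the official dumps are laid out), demonstrated on a tiny in-memory stand-in:

```python
import gzip
import io
import json

# Build a tiny stand-in for a Wikidata JSON dump: a JSON array with one
# entity object per line, gzip-compressed into an in-memory buffer.
entities = [{"id": "Q42", "type": "item"}, {"id": "Q64", "type": "item"}]
raw = "[\n" + ",\n".join(json.dumps(e) for e in entities) + "\n]\n"
buf = io.BytesIO()
with gzip.open(buf, "wt", encoding="utf-8") as f:
    f.write(raw)
buf.seek(0)

# Stream one entity per line instead of loading the whole file at once.
ids = []
with gzip.open(buf, "rt", encoding="utf-8") as f:
    for line in f:
        line = line.strip().rstrip(",")
        if not line or line in ("[", "]"):
            continue  # skip the array brackets
        ids.append(json.loads(line)["id"])
```

    The same loop works against the real ~100 GB dump file by passing its path to `gzip.open` instead of the in-memory buffer.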

  5. Wikidata item quality labels

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Glorian Yapinus; Amir Sarabadani; Aaron Halfaker (2023). Wikidata item quality labels [Dataset]. http://doi.org/10.6084/m9.figshare.5035796.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Glorian Yapinus; Amir Sarabadani; Aaron Halfaker
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains quality labels for 5,000 Wikidata items applied by Wikidata editors. The labels correspond to the quality scale described at https://www.wikidata.org/wiki/Wikidata:Item_quality. Each line is a JSON blob with the following fields:
    - item_quality: the labeled quality class (A-E)
    - rev_id: the revision identifier of the version of the item that was labeled
    - strata: the size of the item in bytes at the time it was sampled
    - page_len: the actual size of the item in bytes
    - page_title: the QID of the item
    - claims: a dictionary including P31 "instance of" values for filtering out certain types of items
    The number of observations by class is:
    - A class: 322
    - B class: 438
    - C class: 1773
    - D class: 997
    - E class: 1470
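    Given the one-JSON-blob-per-line format described above, the class distribution can be recomputed in a few lines; the two sample records here are invented for illustration:

```python
import json
from collections import Counter

# Two hypothetical lines in the documented format (all values invented).
lines = [
    '{"item_quality": "C", "rev_id": 123, "strata": 2048, '
    '"page_len": 2311, "page_title": "Q42", "claims": {"P31": ["Q5"]}}',
    '{"item_quality": "A", "rev_id": 456, "strata": 9000, '
    '"page_len": 9871, "page_title": "Q64", "claims": {"P31": ["Q515"]}}',
]

# Tally the labeled quality classes across all lines.
counts = Counter(json.loads(line)["item_quality"] for line in lines)
```

    Running the same loop over the real file should reproduce the per-class counts listed above.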

  6. 20200210 wikidata all json gz

    • data.depositar.io
    gz
    Updated Jun 10, 2020
    Cite
    無專案資料集 (Unassigned Datasets) (2020). 20200210 wikidata all json gz [Dataset]. https://data.depositar.io/en/dataset/20200210-wikidata-dump
    Explore at:
    Available download formats: gz
    Dataset updated
    Jun 10, 2020
    Dataset provided by
    無專案資料集 (Unassigned Datasets)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wikidata full data dump. Date: 2020-02-10. Format: JSON. File format: gz.

  7. Wikidata Companies Graph

    • zenodo.org
    • data.hellenicdataservice.gr
    • +1 more
    application/gzip
    Updated Aug 5, 2020
    Cite
    Pantelis Chronis (2020). Wikidata Companies Graph [Dataset]. http://doi.org/10.5281/zenodo.3971752
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Aug 5, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pantelis Chronis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains information about commercial organizations (companies) and their relations with other commercial organizations, persons, products, locations, groups and industries. The dataset has the form of a graph. It has been produced by the SmartDataLake project (https://smartdatalake.eu), using data collected from Wikidata (https://www.wikidata.org).

  8. wikidata-en-descriptions

    • huggingface.co
    Updated Aug 5, 2023
    + more versions
    Cite
    Daniel Erenrich (2023). wikidata-en-descriptions [Dataset]. https://huggingface.co/datasets/derenrich/wikidata-en-descriptions
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 5, 2023
    Authors
    Daniel Erenrich
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    derenrich/wikidata-en-descriptions dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. Wikidata Human Gender Indicators

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Max Klein; Piotr Konieczny; Harsh Gupta; Vivek Rai; Haiyi Zhu (2023). Wikidata Human Gender Indicators [Dataset]. http://doi.org/10.6084/m9.figshare.3100903.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Max Klein; Piotr Konieczny; Harsh Gupta; Vivek Rai; Haiyi Zhu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a collection of gender indicators from Wikidata and Wikipedia human biographies. Data is derived from the 2016-01-03 Wikidata snapshot. Each file describes the humans in Wikidata aggregated by gender (Property:P21) and disaggregated by the following Wikidata properties:
    - Date of Birth (P569)
    - Date of Death (P570)
    - Place of Birth (P19)
    - Country of Citizenship (P27)
    - Ethnic Group (P172)
    - Field of Work (P101)
    - Occupation (P106)
    - Wikipedia Language ("Sitelinks")
    Further aggregations of the data are:
    - World Map (countries derived from place of birth and citizenship)
    - World Cultures (Inglehart-Welzel map applied to World Map)
    - Gender Co-Occurrence (humans with multiple genders)
    Wikidata labels have been translated to English for convenience where possible. You may still see values with QIDs, which means no English translation was available. Where there were multiple values, such as for occupation, we count the gender as co-occurring with each occupation separately. For more information: http://wigi.wmflabs.org/

  10. Wikidata dump 2017-12-27

    • zenodo.org
    bz2
    Updated Jan 24, 2020
    Cite
    WikiData (2020). Wikidata dump 2017-12-27 [Dataset]. http://doi.org/10.5281/zenodo.1211767
    Explore at:
    Available download formats: bz2
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    WikiData
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description
  11. wikidata-20180813-all.json.bz2

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Wikidata (2020). wikidata-20180813-all.json.bz2 [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3268724
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Wikidata (https://wikidata.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A copy of a dump which was available from WikiMedia: https://dumps.wikimedia.org/wikidatawiki/entities/

  12. Dump of Wikidata of January 1st, 2024.

    • academictorrents.com
    bittorrent
    Updated Jan 4, 2024
    Cite
    wikidata.org (2024). Dump of Wikidata of January 1st, 2024. [Dataset]. https://academictorrents.com/details/0852ef544a4694995fcbef7132477c688ded7d9a
    Explore at:
    Available download formats: bittorrent (130,532,238,059 bytes)
    Dataset updated
    Jan 4, 2024
    Dataset provided by
    Wikidata (https://wikidata.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wikidata: a free and open knowledge base.
    Accessible: readable and editable by humans and machines alike.
    Central hub: serves as the core storage for structured data across Wikimedia's sister projects, such as Wikipedia, Wikivoyage, Wiktionary, Wikisource, and more.
    Current edition: this torrent represents an unofficial dump of Wikidata as of January 1st, 2024.

  13. Wikidata

    • live.european-language-grid.eu
    json
    Updated Oct 28, 2012
    + more versions
    Cite
    (2012). Wikidata [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/7268
    Explore at:
    Available download formats: json
    Dataset updated
    Oct 28, 2012
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wikidata is a free and open knowledge base that can be read and edited by both humans and machines. Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.

  14. wikidata-enwiki-categories-and-statements

    • huggingface.co
    Updated Mar 26, 2025
    Cite
    Daniel Erenrich (2025). wikidata-enwiki-categories-and-statements [Dataset]. https://huggingface.co/datasets/derenrich/wikidata-enwiki-categories-and-statements
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 26, 2025
    Authors
    Daniel Erenrich
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    derenrich/wikidata-enwiki-categories-and-statements dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Wikidata Constraint Violations - July 2018 - extended

    • figshare.com
    txt
    Updated Jun 5, 2023
    Cite
    Thomas Pellissier Tanon (2023). Wikidata Constraint Violations - July 2018 - extended [Dataset]. http://doi.org/10.6084/m9.figshare.13338743.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Thomas Pellissier Tanon
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset is a cleaned-up and annotated version of another dataset previously shared: https://figshare.com/articles/dataset/Wikidata_Constraints_Violations_-_July_2017/7712720

    This dataset contains corrections for Wikidata constraint violations extracted from the July 1st, 2018 Wikidata full-history dump. It was created as part of a work named Neural Knowledge Base Repairs by Thomas Pellissier Tanon and Fabian Suchanek. An example of code making use of this dataset is available on GitHub: https://github.com/Tpt/bass-materials/blob/master/corrections_learning.ipynb

    The following constraints are considered:
    * conflicts with: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Conflicts_with
    * distinct values: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Unique_value
    * inverse and symmetric: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Inverse and https://www.wikidata.org/wiki/Help:Property_constraints_portal/Symmetric
    * item requires statement: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Item
    * one of: https://www.wikidata.org/wiki/Help:Property_constraints_portal/One_of
    * single value: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Single_value
    * type: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Type
    * value requires statement: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Target_required_claim
    * value type: https://www.wikidata.org/wiki/Help:Property_constraints_portal/Value_type

    The constraints.tsv file contains the list of most of the Wikidata constraints considered in this dataset (beware, there could be some discrepancies for type, valueType, itemRequiresClaim and valueRequiresClaim constraints). It is a tab-separated file with the following columns:
    1. constraint id: the URI of the Wikidata statement describing the constraint
    2. property id: the URI of the property that is constrained
    3. type id: the URI of the constraint type (type, value type, ...); it is a Wikidata item
    4. 15 columns for the possible attributes of the constraint. If an attribute has multiple values, they are in the same cell, separated by spaces. The columns are:
    * regex: https://www.wikidata.org/wiki/Property:P1793
    * exceptions: https://www.wikidata.org/wiki/Property:P2303
    * group by: https://www.wikidata.org/wiki/Property:P2304
    * items: https://www.wikidata.org/wiki/Property:P2305
    * property: https://www.wikidata.org/wiki/Property:P2306
    * namespace: https://www.wikidata.org/wiki/Property:P2307
    * class: https://www.wikidata.org/wiki/Property:P2308
    * relation: https://www.wikidata.org/wiki/Property:P2309
    * minimal date: https://www.wikidata.org/wiki/Property:P2310
    * maximum date: https://www.wikidata.org/wiki/Property:P2311
    * maximum value: https://www.wikidata.org/wiki/Property:P2312
    * minimal value: https://www.wikidata.org/wiki/Property:P2313
    * status: https://www.wikidata.org/wiki/Property:P2316
    * separator: https://www.wikidata.org/wiki/Property:P4155
    * scope: https://www.wikidata.org/wiki/Property:P5314

    The other files provide, for each constraint type, the list of all corrections extracted from the edit history. The format is one line per correction with the following tab-separated values:
    1. constraint id
    2. revision that fixed the constraint violation
    3. first violation triple subject
    4. first violation triple predicate
    5. first violation triple object
    6. second violation triple subject (blank if no second violation triple)
    7. second violation triple predicate (blank if no second violation triple)
    8. second violation triple object (blank if no second violation triple)
    9. separator (not useful)
    10. subject of the first triple in the correction
    11. predicate of the first triple in the correction
    12. object of the first triple in the correction
    13. whether the first triple in the correction is an addition or a deletion
    14. subject of the second triple in the correction (might not exist)
    15. predicate of the second triple in the correction (might not exist)
    16. object of the second triple in the correction (might not exist)
    17. whether the second triple in the correction is an addition or a deletion (might not exist)
    18. description of the subject of the first violation triple, encoded in JSON
    19. description of the object of the first violation triple, encoded in JSON (might be empty for literals)
    20. description of the term of the second triple that has not already been described by the two previous descriptions (might be empty for literals or if there is no second triple)
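    A hedged sketch of reading one correction line in the 20-column tab-separated format described above; every value in the sample row is invented for illustration:

```python
import csv
import io

# A single hypothetical correction row with the 20 tab-separated columns.
row = "\t".join([
    "c1",                  # 1. constraint id
    "rev/42",              # 2. revision that fixed the violation
    "Q1", "P361", "Q2",    # 3-5. first violation triple
    "", "", "",            # 6-8. second violation triple (absent here)
    "|",                   # 9. separator (not useful)
    "Q1", "P361", "Q3",    # 10-12. first correction triple
    "deletion",            # 13. addition or deletion
    "", "", "", "",        # 14-17. second correction triple (absent)
    "{}", "{}", "",        # 18-20. JSON descriptions
])

fields = next(csv.reader(io.StringIO(row), delimiter="\t"))
violation = (fields[2], fields[3], fields[4])
correction = (fields[9], fields[10], fields[11], fields[12])
```

    In the real files the rows come from the corrections file for a given constraint type, one line per correction.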

  16. Wikidata Causal Event Triple Data

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Feb 7, 2023
    Cite
    Sola; Debarun; Oktie (2023). Wikidata Causal Event Triple Data [Dataset]. http://doi.org/10.5281/zenodo.7196049
    Explore at:
    Available download formats: bin
    Dataset updated
    Feb 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sola; Debarun; Oktie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains triples curated from Wikidata surrounding news events with causal relations, and is released as part of our WWW'23 paper, "Event Prediction using Case-Based Reasoning over Knowledge Graphs".

    Starting from a set of classes that we consider to be types of "events", we queried Wikidata to collect entities that were an instanceOf an event class and that were connected to another such event entity by a causal triple (https://www.wikidata.org/wiki/Wikidata:List_of_properties/causality). For all such cause-effect event pairs, we then collected a 3-hop neighborhood of outgoing triples.
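    The neighborhood-collection step described above can be sketched as a bounded breadth-first traversal over outgoing triples. The toy triples and the use of P1542 ("has effect") as the causal property are illustrative assumptions, not the authors' actual code:

```python
from collections import deque

def k_hop_outgoing(triples, seeds, k=3):
    """Collect all triples reachable within k hops of outgoing
    edges from the seed entities."""
    out = {}
    for s, p, o in triples:
        out.setdefault(s, []).append((s, p, o))
    collected = set()
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for s, p, o in out.get(node, []):
            collected.add((s, p, o))
            if o not in seen:
                seen.add(o)
                frontier.append((o, depth + 1))
    return collected

# Toy graph: an event pair linked by a causal property, plus context triples.
triples = [("Q_ev1", "P1542", "Q_ev2"), ("Q_ev2", "P17", "Q_country"),
           ("Q_country", "P36", "Q_capital"), ("Q_capital", "P1082", "Q_pop")]
hood = k_hop_outgoing(triples, ["Q_ev1"], k=3)
```

    With k=3, triples whose subject lies more than three hops from the seed (here the P1082 triple) are excluded.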

  17. UMLS-Wikidata

    • data.niaid.nih.gov
    Updated Sep 6, 2024
    Cite
    Mustafa, Faizan E; Dima, Corina; Ochoa, Juan G. Diaz; Staab, Steffen (2024). UMLS-Wikidata [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11003202
    Explore at:
    Dataset updated
    Sep 6, 2024
    Dataset provided by
    University of Stuttgart
    QUIBIQ GmbH
    PerMediQ GmbH
    Authors
    Mustafa, Faizan E; Dima, Corina; Ochoa, Juan G. Diaz; Staab, Steffen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    UMLS-Wikidata is a German biomedical entity linking knowledge base that provides good coverage for German entity linking datasets such as WikiMed-DE-BEL. The knowledge base is created by keeping only the Wikidata items that contain a Concept Unique Identifier (CUI) from UMLS. Each entry in the knowledge base consists of a Wikidata QID, label, description, UMLS CUI, and aliases. The resulting KB has 731,414 Wikidata QIDs, 599,330 unique CUIs, and 671,797 unique (mention, CUI) pairs, where mentions include labels and aliases.
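    Building the (mention, CUI) pairs described above can be sketched as follows; the entry structure, field names, and values are assumptions for illustration, not the dataset's actual serialization:

```python
# Hypothetical KB entries mirroring the described structure
# (QID, label, UMLS CUI, aliases); all values are invented.
kb = [
    {"qid": "Q12206", "label": "Diabetes mellitus", "cui": "C0011849",
     "aliases": ["Zuckerkrankheit", "Diabetes"]},
    {"qid": "Q41861", "label": "Hypertonie", "cui": "C0020538",
     "aliases": ["Bluthochdruck"]},
]

# A mention is the label or any alias; map each to its set of CUIs,
# since distinct entries can in principle share a surface form.
mention_to_cui = {}
for entry in kb:
    for mention in [entry["label"], *entry["aliases"]]:
        mention_to_cui.setdefault(mention.lower(), set()).add(entry["cui"])
```

    An entity-linking system can then look up a normalized mention string and retrieve its candidate CUIs.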

  18. Topics for each Wikipedia Article across Languages

    • figshare.com
    application/gzip
    Updated Jun 29, 2020
    Cite
    Diego Saez-Trumper (2020). Topics for each Wikipedia Article across Languages [Dataset]. http://doi.org/10.6084/m9.figshare.12127434.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jun 29, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Diego Saez-Trumper
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the predicted topic(s) for (almost) every Wikipedia article across languages. Each row contains the following columns: Qid, topic, probability, page_id, page_title, wiki_db, where:
    * Qid: Wikidata item ID
    * topic: topic based on the ORES draft topic model (https://www.mediawiki.org/wiki/Talk:ORES/Draft_topic)
    * probability: probability of belonging to the topic
    * page_id: the page ID
    * page_title: the page title
    * wiki_db: the wiki database; for example, English Wikipedia is enwiki
    For example: Q1000211,Geography.Regions.Europe.Western_Europe,1.0,166578,Frières-Faillouël,euwiki
    Topics are predicted using the Wikidata-Topic model developed by Isaac Johnson (https://github.com/geohci/wikidata-topic-model). The source code used to create this dataset can be found at https://github.com/digitalTranshumant/wikidata-topic-model
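    The example row from the description can be parsed with the column order given above:

```python
import csv
import io

# The example row quoted in the dataset description.
row = ("Q1000211,Geography.Regions.Europe.Western_Europe,"
       "1.0,166578,Fri\u00e8res-Faillou\u00ebl,euwiki")

# Unpack the six documented columns.
qid, topic, probability, page_id, page_title, wiki_db = next(
    csv.reader(io.StringIO(row)))
```

    The same reader works line by line over the full gzipped file once decompressed.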

  19. Improving the Utility and Trustworthiness of Knowledge Graph Embeddings with Calibration

    • data.niaid.nih.gov
    Updated Apr 2, 2020
    Cite
    Safavi, Tara; Koutra, Danai; Meij, Edgar (2020). Improving the Utility and Trustworthiness of Knowledge Graph Embeddings with Calibration [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3738263
    Explore at:
    Dataset updated
    Apr 2, 2020
    Dataset provided by
    University of Michigan
    Bloomberg
    Authors
    Safavi, Tara; Koutra, Danai; Meij, Edgar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains two public knowledge graph datasets used in our paper Improving the Utility of Knowledge Graph Embeddings with Calibration. Each dataset is described below.

    Note that for our experiments we split each dataset randomly 5 times into 80/10/10 train/validation/test splits. We recommend that users of our data do the same to avoid (potentially) overfitting models to a single dataset split.
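    The recommended 80/10/10 splitting can be sketched as follows; this is a minimal illustration of the procedure, not the authors' exact code (repeat it with five different seeds to mirror their setup):

```python
import random

def split_80_10_10(triples, seed=0):
    """Shuffle a list of triples and split it into
    80/10/10 train/validation/test portions."""
    rng = random.Random(seed)
    shuffled = triples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_valid = int(0.8 * n), int(0.1 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

# Toy triples standing in for the dataset's (head, relation, tail) rows.
triples = [("e%d" % i, "r", "e%d" % (i + 1)) for i in range(100)]
train, valid, test = split_80_10_10(triples, seed=1)
```

    Because each split is random, evaluating across several seeds helps avoid overfitting model choices to a single split, as the note above recommends.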

    wikidata-authors

    This dataset was extracted by querying the Wikidata API for facts about people categorized as "authors" or "writers" on Wikidata. Note that all head entities of triples are people (authors or writers), and all triples describe something about that person (e.g., their place of birth, their place of death, or their spouse). The knowledge graph has 23,887 entities, 13 relations, and 86,376 triples.

    The files are as follows:

    entities.tsv: A tab-separated file of all unique entities in the dataset. The fields are as follows:

    eid: The unique Wikidata identifier of this entity. You can find the corresponding Wikidata page at https://www.wikidata.org/wiki/<eid>.

    label: A human-readable label of this entity (extracted from Wikidata).

    relations.tsv: A tab-separated file of all unique relations in the dataset. The fields are as follows:

    rid: The unique Wikidata identifier of this relation. You can find the corresponding Wikidata page at https://www.wikidata.org/wiki/Property:<rid>.

    label: A human-readable label of this relation (extracted from Wikidata).

    triples.tsv: A tab-separated file of all triples in the dataset, in the form of <head>, <relation>, <tail>.

    fb15krr-linked

    This dataset is an extended version of the FB15k+ dataset provided by [Xie et al IJCAI16]. It has been linked to Wikidata using Freebase MIDs (machine IDs) as keys; we discarded triples from the original dataset that contained entities that could not be linked to Wikidata. We also removed reverse relations following the procedure described by [Toutanova and Chen CVSC2015]. Finally, we removed existing triples labeled as False and added predicted triples labeled as True based on the crowdsourced annotations we obtained in our True or False Facts experiment (see our paper for details). The knowledge graph consists of 14,289 entities, 770 relations, and 272,385 triples.

    The files are as follows:

    entities.tsv: A tab-separated file of all unique entities in the dataset. The fields are as follows:

    mid: The Freebase machine ID (MID) of this entity.

    wiki: The corresponding unique Wikidata identifier of this entity. You can find the corresponding Wikidata page at https://www.wikidata.org/wiki/<wiki>.

    label: A human-readable label of this entity (extracted from Wikidata).

    types: All hierarchical types of this entity, as provided by [Xie et al IJCAI16].

    relations.tsv: A tab-separated file of all unique relations in the dataset. The fields are as follows:

    label: The hierarchical Freebase label of this relation.

    triples.tsv: A tab-separated file of all triples in the dataset, in the form of <head>, <relation>, <tail>.

  20. Event representation on Wikidata and Wikipedia with, and without the analysis of vernacular languages

    • data.europa.eu
    unknown
    Cite
    Zenodo, Event representation on Wikidata and Wikipedia with, and without the analysis of vernacular languages [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-4733507?locale=bg
    Explore at:
    Available download formats: unknown (1,594,252 bytes)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This project aims to prove with data that it is necessary to analyze vernacular languages when dealing with events that are described using public sources like Wikidata and Wikipedia. To retrieve and analyze events, it uses the wikivents Python package. We provide in the project directory the Jupyter notebook that processed (and/or generated) the dataset directory content. Statistics from this analysis are located in the stats directory. The main statistics are reported in the associated paper.

2 scholarly articles cite the dataset wikidata-20220103-all.json.gz (entry 1 above; View in Google Scholar).