100+ datasets found
  1. wikidata-all

    • huggingface.co
    Updated Mar 13, 2024
    Cite
    Wikimedia Movement (2024). wikidata-all [Dataset]. https://huggingface.co/datasets/Wikimedians/wikidata-all
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 13, 2024
    Dataset provided by
    Wikimedia movement (https://wikimedia.org/)
    Authors
    Wikimedia Movement
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Wikidata - All Entities

    This Hugging Face Data Set contains the entirety of Wikidata as of the date listed below. Wikidata is a freely licensed structured knowledge graph following the wiki model of user contributions. If you build on this data please consider contributing back to Wikidata. For more on the size and other statistics of Wikidata, see: Special:Statistics. Current Dump as of: 2024-03-04

      Original Source
    

    The data contained in this repository is retrieved… See the full description on the dataset page: https://huggingface.co/datasets/Wikimedians/wikidata-all.

  2. wikidata-20220103-all.json.gz

    • academictorrents.com
    bittorrent
    Updated Jan 24, 2022
    Cite
    wikidata.org (2022). wikidata-20220103-all.json.gz [Dataset]. https://academictorrents.com/details/229cfeb2331ad43d4706efd435f6d78f40a3c438
    Explore at:
    Available download formats: bittorrent (109042925619)
    Dataset updated
    Jan 24, 2022
    Dataset provided by
    Wikidata (wikidata.org)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A BitTorrent file to download data with the title 'wikidata-20220103-all.json.gz'

  3. Wikidata Entities of Interest

    • opensanctions.org
    csv
    Updated Dec 6, 2024
    Cite
    Wikidata (2024). Wikidata Entities of Interest [Dataset]. https://www.opensanctions.org/datasets/wd_curated/
    Explore at:
    Available download formats: csv
    Dataset updated
    Dec 6, 2024
    Dataset authored and provided by
    Wikidata (wikidata.org)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Persons of interest profiles from Wikidata, the structured data version of Wikipedia.

  4. Wikidata PageRank

    • danker.s3.amazonaws.com
    Updated Sep 13, 2025
    Cite
    Andreas Thalhammer (2025). Wikidata PageRank [Dataset]. https://danker.s3.amazonaws.com/index.html
    Explore at:
    Available download formats: tsv, application/n-triples, application/vnd.hdt, ttl
    Dataset updated
    Sep 13, 2025
    Authors
    Andreas Thalhammer
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Regularly published dataset of PageRank scores for Wikidata entities. The underlying link graph is formed by the union of all links across all Wikipedia language editions. Computation is performed by Andreas Thalhammer with 'danker', available at https://github.com/athalhammer/danker. If you find the downloads here useful please feel free to leave a GitHub ⭐ at the repository and buy me a ☕ https://www.buymeacoffee.com/thalhamm
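To make the idea of scoring entities over a union of link graphs concrete, here is a minimal sketch of plain power-iteration PageRank. It is not danker's implementation: the damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the per-language link lists are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) links."""
    nodes = {node for edge in links for node in edge}
    out_links = {node: set() for node in nodes}
    for src, dst in links:
        out_links[src].add(dst)  # a set keeps duplicate links from counting twice

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical link lists between Wikidata items, one per language edition.
enwiki_links = [("Q1", "Q2"), ("Q2", "Q3")]
dewiki_links = [("Q1", "Q2"), ("Q3", "Q1"), ("Q4", "Q1")]

# The union removes links that appear in more than one edition.
union_links = sorted(set(enwiki_links) | set(dewiki_links))
scores = pagerank(union_links)
```

Taking the set union first means a link that exists in several language editions contributes only once to the graph.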

  5. Wikidata Persons in Relevant Categories

    • opensanctions.org
    Updated Oct 10, 2025
    Cite
    Wikidata (2025). Wikidata Persons in Relevant Categories [Dataset]. https://www.opensanctions.org/datasets/wd_categories/
    Explore at:
    Dataset updated
    Oct 10, 2025
    Dataset authored and provided by
    Wikidata (wikidata.org)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Category-based imports from Wikidata, the structured data version of Wikipedia.

  6. wikidata-en-descriptions-small

    • huggingface.co
    Updated Aug 5, 2023
    + more versions
    Cite
    Daniel Erenrich (2023). wikidata-en-descriptions-small [Dataset]. https://huggingface.co/datasets/derenrich/wikidata-en-descriptions-small
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 5, 2023
    Authors
    Daniel Erenrich
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    derenrich/wikidata-en-descriptions-small dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Wikidata dump 2017-12-27

    • zenodo.org
    bz2
    Updated Jan 24, 2020
    Cite
    WikiData (2020). Wikidata dump 2017-12-27 [Dataset]. http://doi.org/10.5281/zenodo.1211767
    Explore at:
    Available download formats: bz2
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    WikiData
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description
  8. KeySearchWiki

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Feb 14, 2022
    Cite
    Leila Feddoul; Frank Löffler; Sirko Schindler (2022). KeySearchWiki [Dataset]. http://doi.org/10.5281/zenodo.4955200
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 14, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Leila Feddoul; Frank Löffler; Sirko Schindler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    KeySearchWiki is a dataset for evaluating keyword search systems over Wikidata.

    The dataset was automatically generated by leveraging Wikidata and Wikipedia set categories (e.g., Category:American television directors) as data sources for both relevant entities and queries.
    Relevant entities are gathered by carefully navigating the Wikipedia set categories hierarchy in all available languages. Furthermore, those categories are refined and combined to derive more complex queries.

    Detailed information about KeySearchWiki and its generation can be found on the GitHub page.

  9. Wikidata dump from 2018-12-17 in JSON

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 15, 2021
    Cite
    Škoda, Petr (2021). Wikidata dump from 2018-12-17 in JSON [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4436355
    Explore at:
    Dataset updated
    Jan 15, 2021
    Dataset provided by
    Klímek, Jakub
    Škoda, Petr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dump from Wikidata from 2018-12-17 in JSON. This one is no longer available from Wikidata. It was originally downloaded from https://dumps.wikimedia.org/other/wikidata/20181217.json.gz and recompressed to fit on Zenodo.

  10. Wikidata dump extension (enwiki section links)

    • zenodo.org
    application/gzip
    Updated Nov 25, 2022
    + more versions
    Cite
    Natalia Ostapuk; Djellel Difallah; Philippe Cudré-Mauroux (2022). Wikidata dump extension (enwiki section links) [Dataset]. http://doi.org/10.5281/zenodo.7360787
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Nov 25, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Natalia Ostapuk; Djellel Difallah; Philippe Cudré-Mauroux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains mappings between Wikidata entities and Wikipedia sections. The mappings come in addition to the existing Wikidata sitelinks referencing Wikipedia pages.

    The creation of the present dataset stems from the observation that only a fraction of Wikidata entities has a corresponding Wikipedia article in any language (we refer to the remaining entities, without an article, as orphans). However, a substantial number of orphan entities are indeed available in Wikipedia, but not at the page level; orphan entities can be described within existing Wikipedia articles in the form of sections, subsections, and paragraphs of a more generic concept or fact. The dataset provides a fine-grained mapping between Wikidata orphan entities and Wikipedia (sub)-sections.

    Mappings are provided for the English language.

    The dataset is available in JSON and RDF formats and complies with the Wikibase data model.

    In the JSON representation, an entity contains two fields: id (the unique identifier of an entity) and sectionlinks (links to Wikipedia sections). Each sectionlink record comprises a list of records [1] with three fields: site, title, and url. A section title is appended to the page title, separated by the # symbol. The compound title is then URL-encoded and added to the URL path. Following the Wikidata guidelines, each entity is encoded as a single line.

    Example:

    {
      "id": "Q715509",
      "sectionlinks": {
        "enwiki": [
          {
            "site": "enwiki",
            "title": "Places in Harry Potter#Azkaban",
            "url": "https://en.wikipedia.org/wiki/Places_in_Harry_Potter#Azkaban"
          }
        ]
      }
    }
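
As a rough illustration of reading this layout, the sketch below parses one entity line and splits each compound title back into its page and section parts. The function name `section_links` is hypothetical, and the sample line is the example entity written on a single line; the real files may contain additional fields.

```python
import json

# The example entity from the description, written as one line of JSON.
line = (
    '{"id": "Q715509", "sectionlinks": {"enwiki": [{"site": "enwiki", '
    '"title": "Places in Harry Potter#Azkaban", '
    '"url": "https://en.wikipedia.org/wiki/Places_in_Harry_Potter#Azkaban"}]}}'
)


def section_links(entity_line):
    """Yield (entity id, page title, section title, url) for one dump line."""
    entity = json.loads(entity_line)
    for records in entity.get("sectionlinks", {}).values():
        for record in records:
            # The compound title has the form "<page title>#<section title>".
            page, _, section = record["title"].partition("#")
            yield entity["id"], page, section, record["url"]


rows = list(section_links(line))
```

Because each site key holds a list, an entity mapped to several sections simply yields several rows.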

    The RDF dump is serialized using the Turtle format and stores nodes describing Wikipedia links. Section titles are added in the same manner as described above.

    Example:

    <https://en.wikipedia.org/wiki/Places_in_Harry_Potter#Azkaban> a schema:Article ;
        schema:about wd:Q715509 ;
        schema:inLanguage "en" ;
        schema:isPartOf <https://en.wikipedia.org/> ;
        schema:name "Places in Harry Potter#Azkaban"@en .
    
    <https://en.wikipedia.org/> wikibase:wikiGroup "wikipedia" .
    

    [1] As opposed to sitelinks, where each entity can be mapped to a unique Wikipedia page (one-to-one mapping), sectionlinks allow a one-to-many mapping, i.e., an entity can be mapped to multiple sections. For example, the Tennis racket concept can be mapped to the Tennis#Rackets and Racket (sports equipment)#Tennis sections.

  11. 20200210 wikidata all json gz

    • data.depositar.io
    gz
    Updated Jun 10, 2020
    Cite
    無專案資料集 (Unassigned Datasets) (2020). 20200210 wikidata all json gz [Dataset]. https://data.depositar.io/en/dataset/20200210-wikidata-dump
    Explore at:
    Available download formats: gz
    Dataset updated
    Jun 10, 2020
    Dataset provided by
    無專案資料集 (Unassigned Datasets)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wikidata all data dump. Date: 2020/2/10. Format: JSON. File format: gz.
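
A gzip-compressed JSON dump of this kind is usually too large to load at once, so it is typically streamed. The sketch below assumes the usual Wikidata JSON dump layout, where the file is one JSON array with an opening bracket line, one entity per line followed by a comma, and a closing bracket line. The function name `iter_entities` and the tiny sample file are hypothetical stand-ins.

```python
import gzip
import json
import os
import tempfile


def iter_entities(path):
    """Stream entities one at a time from a gzip-compressed JSON dump."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for raw in handle:
            text = raw.strip().rstrip(",")
            if text in ("[", "]", ""):
                continue  # skip the array brackets and blank lines
            yield json.loads(text)


# Tiny stand-in file with the assumed layout; a real dump is far larger.
sample = '[\n{"id": "Q1", "type": "item"},\n{"id": "Q2", "type": "item"}\n]\n'
sample_path = os.path.join(tempfile.mkdtemp(), "sample-all.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as handle:
    handle.write(sample)

ids = [entity["id"] for entity in iter_entities(sample_path)]
```

Reading line by line keeps memory use bounded by the size of the largest single entity.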

  12. wikidata-20240902-all.json.bz2

    • academictorrents.com
    bittorrent
    Updated Sep 5, 2024
    + more versions
    Cite
    Wikidata Contributors (2024). wikidata-20240902-all.json.bz2 [Dataset]. https://academictorrents.com/details/7bee8ece634c55ab4ed7da5a56dd81578729ed2b
    Explore at:
    Available download formats: bittorrent (91964359511)
    Dataset updated
    Sep 5, 2024
    Dataset provided by
    Wikidata (wikidata.org)
    Authors
    Wikidata Contributors
    License

    https://academictorrents.com/nolicensespecified

    Description

    A BitTorrent file to download data with the title 'wikidata-20240902-all.json.bz2'

  13. Wikidata Causal Event Triple Data

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Feb 7, 2023
    Cite
    Sola; Debarun; Oktie (2023). Wikidata Causal Event Triple Data [Dataset]. http://doi.org/10.5281/zenodo.7196049
    Explore at:
    Available download formats: bin
    Dataset updated
    Feb 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sola; Debarun; Oktie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains triples curated from Wikidata surrounding news events with causal relations, and is released as part of our WWW'23 paper, "Event Prediction using Case-Based Reasoning over Knowledge Graphs".

    Starting from a set of classes that we consider to be types of "events", we queried Wikidata to collect entities that were an instanceOf an event class and that were connected to another such event entity by a causal triple (https://www.wikidata.org/wiki/Wikidata:List_of_properties/causality). For all such cause-effect event pairs, we then collected a 3-hop neighborhood of outgoing triples.
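
The neighborhood collection step can be pictured as a bounded breadth-first walk over outgoing edges. The following is a minimal sketch under that reading, not the authors' pipeline; the function name, the triple layout, and every identifier in the sample data are made up for illustration.

```python
from collections import defaultdict


def outgoing_neighborhood(triples, seeds, hops=3):
    """Collect triples reachable from the seeds by following up to `hops` outgoing edges."""
    by_subject = defaultdict(list)
    for subj, pred, obj in triples:
        by_subject[subj].append((subj, pred, obj))

    collected = []
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for triple in by_subject.get(node, []):
                collected.append(triple)
                if triple[2] not in seen:
                    seen.add(triple[2])
                    next_frontier.add(triple[2])
        frontier = next_frontier  # each node is expanded at most once
    return collected


# Hypothetical triples: one cause-effect pair plus a chain of outgoing edges.
triples = [
    ("event_a", "has_effect", "event_b"),
    ("event_b", "p1", "n1"),
    ("n1", "p2", "n2"),
    ("n2", "p3", "n3"),
    ("n3", "p4", "n4"),
]
neighborhood = outgoing_neighborhood(triples, seeds={"event_a", "event_b"})
```

With both events of the pair as seeds, the last edge of the chain lies four hops out and is left out of the result.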

  14. Wikidata Human Gender Indicators

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Max Klein; Piotr Konieczny; Harsh Gupta; Vivek Rai; Haiyi Zhu (2023). Wikidata Human Gender Indicators [Dataset]. http://doi.org/10.6084/m9.figshare.3100903.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Max Klein; Piotr Konieczny; Harsh Gupta; Vivek Rai; Haiyi Zhu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a collection of Gender Indicators from Wikidata and Wikipedia of Human Biographies. Data is derived from the 2016-01-03 Wikidata snapshot.

    Each file describes the humans in Wikidata aggregated by Gender (Property:P21), and disaggregated by the following Wikidata Properties:
    - Date of Birth (P569)
    - Date of Death (P570)
    - Place of Birth (P19)
    - Country of Citizenship (P27)
    - Ethnic Group (P172)
    - Field of Work (P101)
    - Occupation (P106)
    - Wikipedia Language ("Sitelinks")

    Further aggregations of the data are:
    - World Map (Countries derived from place of birth and citizenship)
    - World Cultures (Inglehart Welzel Map applied to World Map)
    - Gender Co-Occurrence (Humans with multiple genders)

    Wikidata labels have been translated to English for convenience when possible. You may still see values with "QIDs", which means there was no English translation possible. In the case where there were multiple values, such as for occupation, we count the gender as co-occurring with each occupation separately.

    For more information, see http://wigi.wmflabs.org/
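
The counting rule for multi-valued properties can be sketched as follows. The record layout and the sample values are hypothetical; only the rule itself (a gender is counted once with each occupation a person holds) comes from the description.

```python
from collections import Counter

# Hypothetical records: each human carries lists of gender and occupation values.
humans = [
    {"gender": ["female"], "occupation": ["physicist", "chemist"]},
    {"gender": ["male"], "occupation": ["physicist"]},
    {"gender": ["female"], "occupation": []},
]

counts = Counter()
for human in humans:
    for gender in human["gender"]:
        for occupation in human["occupation"]:
            # One count per (occupation, gender) pair, so a person with two
            # occupations contributes to both occupation rows.
            counts[(occupation, gender)] += 1
```

A consequence of this rule is that per-occupation totals can add up to more than the number of humans.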

  15. Wikidata Companies Graph

    • zenodo.org
    • data.hellenicdataservice.gr
    • +1 more
    application/gzip
    Updated Aug 5, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pantelis Chronis (2020). Wikidata Companies Graph [Dataset]. http://doi.org/10.5281/zenodo.3971752
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Aug 5, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pantelis Chronis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains information about commercial organizations (companies) and their relations with other commercial organizations, persons, products, locations, groups and industries. The dataset has the form of a graph. It has been produced by the SmartDataLake project (https://smartdatalake.eu), using data collected from Wikidata (https://www.wikidata.org).

  16. Wikidata item quality labels

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Glorian Yapinus; Amir Sarabadani; Aaron Halfaker (2023). Wikidata item quality labels [Dataset]. http://doi.org/10.6084/m9.figshare.5035796.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Glorian Yapinus; Amir Sarabadani; Aaron Halfaker
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains quality labels for 5000 Wikidata items applied by Wikidata editors. The labels correspond to the quality scale described at https://www.wikidata.org/wiki/Wikidata:Item_quality

    Each line is a JSON blob with the following fields:
    - item_quality: The labeled quality class (A-E)
    - rev_id: The revision identifier of the version of the item that was labeled
    - strata: The size of the item in bytes at the time it was sampled
    - page_len: The actual size of the item in bytes
    - page_title: The Qid of the item
    - claims: A dictionary including P31 "instance-of" values for filtering out certain types of items

    The number of observations by class is:
    - A class: 322
    - B class: 438
    - C class: 1773
    - D class: 997
    - E class: 1470
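
A file in this one-JSON-object-per-line layout can be tallied with a few lines of code. The sample lines below are made up, and the exact shape of the claims dictionary is an assumption; the published per-class figures are copied from the description to check that they add up to the stated 5000 items.

```python
import json
from collections import Counter

# Hypothetical lines in the described layout; all values are invented.
lines = [
    '{"item_quality": "A", "rev_id": 1, "strata": 100, "page_len": 120, "page_title": "Q1", "claims": {"P31": ["Q5"]}}',
    '{"item_quality": "C", "rev_id": 2, "strata": 50, "page_len": 60, "page_title": "Q2", "claims": {"P31": []}}',
    '{"item_quality": "C", "rev_id": 3, "strata": 70, "page_len": 75, "page_title": "Q3", "claims": {}}',
]

# Count how many labeled items fall into each quality class.
class_counts = Counter(json.loads(line)["item_quality"] for line in lines)

# Published number of observations per class, as listed in the description.
published = {"A": 322, "B": 438, "C": 1773, "D": 997, "E": 1470}
published_total = sum(published.values())
```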

  17. Wikidata Politically Exposed Persons

    • opensanctions.org
    Updated Oct 12, 2025
    Cite
    Wikidata (2025). Wikidata Politically Exposed Persons [Dataset]. https://www.opensanctions.org/datasets/wd_peps/
    Explore at:
    Available download formats: application/json+ftm
    Dataset updated
    Oct 12, 2025
    Dataset authored and provided by
    Wikidata (wikidata.org)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Profiles of politically exposed persons from Wikidata, the structured data version of Wikipedia.

  18. Wikidata Reference

    • figshare.com
    application/gzip
    Updated Mar 17, 2025
    + more versions
    Cite
    Sven Hertling; Nandana Mihindukulasooriya (2025). Wikidata Reference [Dataset]. http://doi.org/10.6084/m9.figshare.28602170.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Mar 17, 2025
    Dataset provided by
    figshare
    Authors
    Sven Hertling; Nandana Mihindukulasooriya
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Summary

    The Triple-to-Text Alignment dataset aligns Knowledge Graph (KG) triples from Wikidata with diverse, real-world textual sources extracted from the web. Unlike previous datasets that rely primarily on Wikipedia text, this dataset provides a broader range of writing styles, tones, and structures by leveraging Wikidata references from various sources such as news articles, government reports, and scientific literature. Large language models (LLMs) were used to extract and validate text spans corresponding to KG triples, ensuring high-quality alignments. The dataset can be used for training and evaluating relation extraction (RE) and knowledge graph construction systems.

    Data Fields

    Each row in the dataset consists of the following fields:
    - subject (str): The subject entity of the knowledge graph triple.
    - rel (str): The relation that connects the subject and object.
    - object (str): The object entity of the knowledge graph triple.
    - text (str): A natural language sentence that entails the given triple.
    - validation (str): LLM-based validation results, including:
      - Fluent Sentence(s): TRUE/FALSE
      - Subject mentioned in Text: TRUE/FALSE
      - Relation mentioned in Text: TRUE/FALSE
      - Object mentioned in Text: TRUE/FALSE
      - Fact Entailed By Text: TRUE/FALSE
      - Final Answer: TRUE/FALSE
    - reference_url (str): URL of the web source from which the text was extracted.
    - subj_qid (str): Wikidata QID for the subject entity.
    - rel_id (str): Wikidata Property ID for the relation.
    - obj_qid (str): Wikidata QID for the object entity.

    Dataset Creation

    The dataset was created through the following process:

    1. Triple-Reference Sampling and Extraction
    - All relations from Wikidata were extracted using SPARQL queries.
    - A sample of KG triples with associated reference URLs was collected for each relation.

    2. Domain Analysis and Web Scraping
    - URLs were grouped by domain, and sampled pages were analyzed to determine their primary language.
    - English-language web pages were scraped and processed to extract plaintext content.

    3. LLM-Based Text Span Selection and Validation
    - LLMs were used to identify text spans from web content that correspond to KG triples.
    - A Chain-of-Thought (CoT) prompting method was applied to validate whether the extracted text entailed the triple.
    - The validation process included checking for fluency, subject mention, relation mention, object mention, and final entailment.

    4. Final Dataset Statistics
    - 12.5K Wikidata relations were analyzed, leading to 3.3M triple-reference pairs.
    - After filtering for English content, 458K triple-web content pairs were processed with LLMs.
    - 80.5K validated triple-text alignments were included in the final dataset.
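
For readers who want to work with rows of this shape, the sketch below models one row as a typed record and groups rows by relation. The class name `AlignmentRow`, the identifiers, the sample text, and the URLs are all placeholders, and the validation field is kept as an opaque string because its exact layout is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AlignmentRow:
    """One row, with the nine string fields listed in the description."""
    subject: str
    rel: str
    object: str
    text: str
    validation: str
    reference_url: str
    subj_qid: str
    rel_id: str
    obj_qid: str


# Placeholder rows; none of these values come from the real dataset.
rows = [
    AlignmentRow("Subject A", "relation x", "Object B", "Sentence one.", "Final Answer: TRUE",
                 "https://example.org/1", "Q-EXAMPLE-1", "P-EXAMPLE-1", "Q-EXAMPLE-2"),
    AlignmentRow("Subject C", "relation x", "Object D", "Sentence two.", "Final Answer: TRUE",
                 "https://example.org/2", "Q-EXAMPLE-3", "P-EXAMPLE-1", "Q-EXAMPLE-4"),
    AlignmentRow("Subject E", "relation y", "Object F", "Sentence three.", "Final Answer: TRUE",
                 "https://example.org/3", "Q-EXAMPLE-5", "P-EXAMPLE-2", "Q-EXAMPLE-6"),
]

# Group the aligned sentences by relation id, e.g. to build per-relation training sets.
texts_by_relation = defaultdict(list)
for row in rows:
    texts_by_relation[row.rel_id].append(row.text)
```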

  19. Wikidata Dump wikidata

    • zenodo.org
    application/gzip, bin +1
    Updated Jan 11, 2022
    + more versions
    Cite
    Benno Fünfstück (2022). Wikidata Dump wikidata [Dataset]. http://doi.org/10.5281/zenodo.5833973
    Explore at:
    Available download formats: json, application/gzip, bin
    Dataset updated
    Jan 11, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Benno Fünfstück
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    RDF dump of wikidata produced with wdumps.

    View on wdumper: https://tools.wmflabs.org/wdumps/dump/2049

    entity count: 0, statement count: 0, triple count: 0
    
  20. Wikidata

    • web.archive.org
    full json dump +3
    Updated Oct 23, 2018
    Cite
    Wikimedia (2018). Wikidata [Dataset]. https://www.wikidata.org/wiki/Wikidata:Data_access
    Explore at:
    Available download formats: simplified ("truthy") rdf n-triples dump, sparql endpoint, full json dump, full rdf turtle dump
    Dataset updated
    Oct 23, 2018
    Dataset provided by
    Wikimedia Foundation (http://www.wikimedia.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wikidata offers a wide range of general data about our universe as well as links to other databases. The data is published under the CC0 "Public domain dedication" license. It can be edited by anyone and is maintained by Wikidata's editor community.

wikidata-all (Wikimedians/wikidata-all): 291 scholarly articles cite this dataset (per Google Scholar).