100+ datasets found
  1. Notable People Dataset (Wikidata-based)

    • kaggle.com
    zip
    Updated Jun 5, 2025
  2. Wikidata dump 2017-12-27

    • zenodo.org
    bz2
    Updated Jan 24, 2020
  3. Wikidata dump from 2018-12-17 in JSON

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    application/gzip
    Updated Jan 15, 2021
  4. wikidata

    • kaggle.com
    zip
    Updated Apr 16, 2025
  5. Wikidata Entities of Interest

    • opensanctions.org
    csv
    Updated Jan 9, 2026
  6. Wikidata Persons in Relevant Categories

    • opensanctions.org
    Updated Jan 23, 2026
  7. Wikidata Explorer Feature - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Jul 16, 2024
  8. freebase-wikidata-mapping

    • huggingface.co
    Updated Mar 28, 2024
    + more versions
  9. Wikidata dataset - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Nov 25, 2024
  10. Wikidata item quality labels

    • figshare.com
    txt
    Updated May 31, 2023
  11. Wikidata PageRank

    • danker.s3.amazonaws.com
    • danker.s3-website.eu-central-1.amazonaws.com
    Updated Jan 6, 2026
  12. wikidata-truthy

    • huggingface.co
    Updated Nov 26, 2025
  13. Wikidata Dump wikidata_partial

    • zenodo.org
    application/gzip, bin +1
    Updated Dec 26, 2020
    + more versions
  14. Wikidata Companies Graph

    • data.hellenicdataservice.gr
    • data.europa.eu
    Updated Jun 20, 2019
  15. Wikidata Politically Exposed Persons

    • opensanctions.org
    Updated Jan 8, 2026
  16. Wikidata

    • live.european-language-grid.eu
    json
    Updated Oct 28, 2012
    + more versions
  17. Wikidata Dump all human data

    • zenodo.org
    application/gzip, bin +1
    Updated May 20, 2020
    + more versions
  18. Wikidata Dump

    • zenodo.org
    application/gzip, bin +1
    Updated Mar 10, 2020
  19. WikiData - Datasets - OpenData.eol.org

    • opendata.eol.org
    Updated Mar 22, 2017
    + more versions
  20. wikidata-extraction

    • huggingface.co
Cite
Ekaterina Solovyeva (2025). Notable People Dataset (Wikidata-based) [Dataset]. https://www.kaggle.com/datasets/qqsolov/notable-people-dataset-wikidata-based
Notable People Dataset (Wikidata-based)

417K+ biographical records from Wikidata, born in 20th–21st centuries

Download format: zip (32237057 bytes)
Dataset updated: Jun 5, 2025
Authors: Ekaterina Solovyeva
License: https://creativecommons.org/publicdomain/zero/1.0/

Description

Dataset Overview

This dataset contains 417,937 biographical records of notable individuals, extracted from Wikidata using SPARQL queries via the Wikidata Query Service.

Key Selection Criteria:

  • Timeframe: Individuals born in the 20th or 21st century (1901–present).
  • Country of Birth: Entries must include the country_of_birth.
  • Photo Availability: Each entry includes an associated image (image_url is mandatory).
  • Profession Filter: Focused on individuals with occupations categorized in occupation_groups.csv (Science & Academia, Arts & Culture, Public Figures, Sports, Business).
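
The four criteria above can be read as a per-record check. The following is a minimal sketch, not the author's extraction code (which used SPARQL): it assumes each record is a dict keyed by the column names, and that date_of_birth begins with a four-digit year. The date format is not stated in the description, so that part is an assumption, and the helper names are hypothetical.

```python
import re
from datetime import date
from typing import Optional


def birth_year(date_of_birth: str) -> Optional[int]:
    """Return the leading four-digit year, or None if there is none.

    Assumption: dates start with YYYY (for example "1950-03-01").
    """
    match = re.match(r"[0-9]{4}", (date_of_birth or "").strip())
    return int(match.group()) if match else None


def meets_criteria(record: dict) -> bool:
    """Check one record against the stated selection criteria."""
    year = birth_year(record.get("date_of_birth", ""))
    return (
        year is not None
        and 1901 <= year <= date.today().year          # born 1901 or later
        and bool(record.get("country_of_birth", "").strip())
        and bool(record.get("image_url", "").strip())
        and bool(record.get("occupation_groups", "").strip())
    )
```

Because the published file is described as already filtered, a check like this is mainly useful for confirming that a downloaded copy matches the description.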

Column Descriptions

Column | Description | Notes
------ | ----------- | -----
wikidata_url | Unique Wikidata URL identifier for the entry | Mandatory
label | Primary name/label of the person (usually in English) | Mandatory
name_in_native_languages | Name(s) in the person's native language(s) | ;-separated values
pseudonyms | Alternative names or aliases used by the person | ;-separated values
sex_or_gender | Gender information | Mandatory
date_of_birth | Birth date | Mandatory
place_of_birth | City or region of birth |
country_of_birth | Country of birth | Mandatory
date_of_death | Death date (if applicable) |
place_of_death | City or region of death (if applicable) |
country_of_death | Country of death (if applicable) |
citizenships | Nationalities or citizenships held | ;-separated values
occupations | Specific occupations or roles | Mandatory, ;-separated
occupation_groups | Broad occupational categories | Mandatory, ;-separated
awards | Awards, honors, or recognitions received | ;-separated values
signature_url | URL to an image of the person's signature |
image_url | URL to the person's image/portrait | Mandatory
date_of_image | Date when the image was created (if available) |
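
Several columns hold multiple values joined by ";". A minimal sketch of turning those cells into lists is shown below. It assumes the data is a comma-delimited CSV with a header row and that values may have spaces around the separator; neither detail is confirmed by the description, and the function names are hypothetical. An in-memory sample stands in for the real file, whose name is not given here.

```python
import csv
import io

# Columns the description marks as ;-separated.
MULTI_VALUE_COLUMNS = (
    "name_in_native_languages",
    "pseudonyms",
    "citizenships",
    "occupations",
    "occupation_groups",
    "awards",
)


def split_values(cell):
    """Split a ;-separated cell into a list, dropping empty parts."""
    return [part.strip() for part in (cell or "").split(";") if part.strip()]


def read_records(handle):
    """Read CSV rows as dicts, converting multi-value cells to lists."""
    rows = []
    for row in csv.DictReader(handle):
        for column in MULTI_VALUE_COLUMNS:
            if column in row:
                row[column] = split_values(row[column])
        rows.append(row)
    return rows


# Made-up sample row for illustration only.
sample = io.StringIO(
    "label,occupations,awards\n"
    "Example Person,physicist; writer,\n"
)
records = read_records(sample)
print(records[0]["occupations"])  # ['physicist', 'writer']
```

Empty cells become empty lists, so optional columns such as awards can be iterated without a separate missing-value check.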

Notes

The data may contain inaccuracies inherited from inconsistencies or errors in the original Wikidata entries. These are most often visible in date fields, especially date_of_image.
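
One way to surface such date problems is to flag records whose image date precedes the birth date. This is a rough sketch under the assumption that both fields begin with a four-digit year; the actual date format is not documented here, and the function names are hypothetical. It only catches one obvious kind of error.

```python
import re
from typing import Optional


def leading_year(value: str) -> Optional[int]:
    """Return the leading four-digit year of a date string, if any."""
    match = re.match(r"[0-9]{4}", (value or "").strip())
    return int(match.group()) if match else None


def image_date_is_suspicious(record: dict) -> bool:
    """True when date_of_image falls in a year before date_of_birth.

    Records missing either year are not flagged.
    """
    born = leading_year(record.get("date_of_birth", ""))
    pictured = leading_year(record.get("date_of_image", ""))
    if born is None or pictured is None:
        return False
    return pictured < born
```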

Source and Licensing Notes

  • All data in this dataset was derived from Wikidata. Wikidata content is available under the CC0 1.0 license.
  • Images linked in image_url and signature_url are hosted on Wikimedia Commons and may have individual licenses (e.g., CC BY-SA, Public Domain). Please check the license terms on the source page before using.