Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Mapping between Freebase and Wikidata entities
This dataset maps Freebase IDs to Wikidata IDs and labels. It is useful for visualising entities and for better understanding when working with datasets like FB15k-237. How it was created:
1. Download the Freebase-Wikidata mapping from here. [compressed size: 21.2 MB]
2. Download the Wikidata entities data from here. [compressed size: 81 GB]
3. Align the labels with the Freebase/Wikidata ID pairs (a sketch of this step follows below).
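A minimal sketch of the alignment step, assuming the mapping has been extracted to a two-column TSV of Freebase ID and Wikidata ID, and the labels to a two-column TSV of Wikidata ID and English label; both file names and formats are assumptions, not part of the original release:

```python
# Hypothetical file layouts: fb2w_mapping.tsv (freebase_id <TAB> wikidata_id)
# and wikidata_labels.tsv (wikidata_id <TAB> english_label).
import csv

def load_tsv(path):
    """Load a two-column TSV into a dict."""
    with open(path, encoding="utf-8") as f:
        return dict(csv.reader(f, delimiter="\t"))

fb_to_wd = load_tsv("fb2w_mapping.tsv")      # e.g. "/m/02mjmr" -> "Q76"
wd_labels = load_tsv("wikidata_labels.tsv")  # e.g. "Q76" -> "Barack Obama"

# Join the two maps: freebase_id -> (wikidata_id, english_label)
aligned = {
    fb_id: (wd_id, wd_labels.get(wd_id, ""))
    for fb_id, wd_id in fb_to_wd.items()
}
```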
No license specified: https://academictorrents.com/nolicensespecified
A BitTorrent file to download data with the title 'wikidata-20240701-all.json.bz2'
Attribution 1.0 (CC BY 1.0): https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
This data dump of Wikidata is published to allow fair and replicable evaluation of KGQA systems with the QALD-10 benchmark. QALD-10 is newly released and was used in the QALD-10 Challenge. Anyone interested in evaluating their KGQA systems with QALD-10 can download this dump and set up a local Wikidata endpoint in their server.
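Once the dump is loaded into a local triple store, the endpoint can be queried with the benchmark's gold SPARQL queries. A minimal sketch, assuming the SPARQLWrapper package and a placeholder local endpoint URL:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint URL; use whatever address your local Wikidata service exposes.
sparql = SPARQLWrapper("http://localhost:9999/sparql")
sparql.setReturnFormat(JSON)

# Example query (not from QALD-10): fetch the English label of Q42.
sparql.setQuery("""
    PREFIX wd: <http://www.wikidata.org/entity/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
      wd:Q42 rdfs:label ?label .
      FILTER(LANG(?label) = "en")
    }
""")

for binding in sparql.query().convert()["results"]["bindings"]:
    print(binding["label"]["value"])
```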
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Wikipedia, the free encyclopedia, and Wikidata, the free knowledge base, are crowd-sourced projects supported by the Wikimedia Foundation. Wikipedia is nearly 20 years old and recently added its six millionth article in English. Wikidata, its younger machine-readable sister project, was created in 2012 but has been growing rapidly and currently contains more than 75 million items.
These projects contribute to the Wikimedia Foundation's mission of empowering people to develop and disseminate educational content under a free license. They are also heavily utilized by computer science research groups, especially those interested in natural language processing (NLP). The Wikimedia Foundation periodically releases snapshots of the raw data backing these projects, but these are in a variety of formats and were not designed for use in NLP research. In the Kensho R&D group, we spend a lot of time downloading, parsing, and experimenting with this raw data. The Kensho Derived Wikimedia Dataset (KDWD) is a condensed subset of the raw Wikimedia data in a form that we find helpful for NLP work. The KDWD has a CC BY-SA 3.0 license, so feel free to use it in your work too.
This particular release consists of two main components - a link annotated corpus of English Wikipedia pages and a compact sample of the Wikidata knowledge base. We version the KDWD using the raw Wikimedia snapshot dates. The version string for this dataset is kdwd_enwiki_20191201_wikidata_20191202 indicating that this KDWD was built from the English Wikipedia snapshot from 2019 December 1 and the Wikidata snapshot from 2019 December 2. Below we describe these components in more detail.
Dive right in by checking out some of our example notebooks:
- page.csv (page metadata and Wikipedia-to-Wikidata mapping)
- link_annotated_text.jsonl (plain text of Wikipedia pages with link offsets)
- item.csv (item labels and descriptions in English)
- item_aliases.csv (item aliases in English)
- property.csv (property labels and descriptions in English)
- property_aliases.csv (property aliases in English)
- statements.csv (truthy qpq statements)

The KDWD is three connected layers of data. The base layer is a plain text English Wikipedia corpus, the middle layer annotates the corpus by indicating which text spans are links, and the top layer connects the link text spans to items in Wikidata. Below we'll describe these layers in more detail.
[Diagram: the three connected layers of the KDWD]
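A minimal sketch of connecting the top two layers, i.e. mapping Wikipedia pages to their Wikidata items and English labels, assuming pandas. The column names page_id, item_id, and en_label are assumptions; check the actual file headers or the example notebooks:

```python
import pandas as pd

pages = pd.read_csv("page.csv")  # page metadata + Wikipedia-to-Wikidata mapping
items = pd.read_csv("item.csv")  # item labels and descriptions in English

# Attach each page's Wikidata label by joining on the item identifier
# (join key and column names are assumed).
page_labels = pages.merge(items, on="item_id", how="left")
print(page_labels[["page_id", "item_id", "en_label"]].head())
```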
The first part of the KDWD is derived from Wikipedia. In order to create a corpus of mostly natural text, we restrict our English Wikipedia page sample to those that:
CC0 1.0: https://choosealicense.com/licenses/cc0-1.0/
Wikidata Label Maps 2025-08-20
Label maps extracted from the 2025-08-20 Wikidata dump. Use these to resolve Q and P identifiers to English labels quickly.
Files
- entity_map.parquet: columns id, label, description (Q items; 77.4M rows)
- prop_map.parquet: columns id, label, description, datatype (P items; 11,568 rows)
All files are Parquet with Zstandard compression.
Download Options
A) Hugging Face snapshot to a local folder
from huggingface_hub import…
See the full description on the dataset page: https://huggingface.co/datasets/yashkumaratri/wikidata-label-maps-20250820.
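A minimal sketch of option A, assuming the huggingface_hub and pandas (with pyarrow) packages are installed; the repo id is taken from the dataset page above and the local folder name is arbitrary:

```python
from huggingface_hub import snapshot_download
import pandas as pd

# Download the whole dataset repository to a local folder.
local_dir = snapshot_download(
    repo_id="yashkumaratri/wikidata-label-maps-20250820",
    repo_type="dataset",
    local_dir="wikidata-label-maps",
)

# Resolve a Q identifier (Q42 is just an example) to its English label.
entities = pd.read_parquet(f"{local_dir}/entity_map.parquet")
print(entities.loc[entities["id"] == "Q42", "label"].iloc[0])
```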
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
External References of English Wikipedia (ref-wiki-en) is a corpus of the plain-text content of 2,475,461 external webpages linked from the reference section of articles in English Wikipedia. Specifically:
- 32,329,989 external reference URLs were extracted from a 2018 HTML dump of English Wikipedia. Removing repeated and ill-formed URLs yielded 23,036,318 unique URLs.
- These URLs were filtered to remove file extensions for unsupported formats (videos, audio, etc.), yielding 17,781,974 downloadable URLs.
- The URLs were loaded into Apache Nutch and continuously downloaded from August 2019 to December 2019, resulting in 2,475,461 successfully downloaded URLs. Not all URLs could be accessed. The order in which URLs were accessed was determined by Nutch, which partitions URLs by host and then randomly chooses among the URLs for each host.
- The content of these webpages was indexed in Apache Solr by Nutch. From Solr we extracted a JSON dump of the content.
- Many URLs offer a redirect; unfortunately, Nutch does not index redirect information. This means that connecting the Wikipedia article (with the pre-redirect link) to the downloaded webpage (at the post-redirect link) was complicated. However, by inspecting the order of download in the Nutch log files, we managed to recover links for 2,058,896 documents (83%) to their original Wikipedia article(s).
- We further managed to associate 3,899,953 unique Wikidata items with at least one external reference webpage in the corpus.
The ref-wiki-en corpus is incomplete, i.e., we did not attempt to download all reference URLs for English Wikipedia. We therefore also collected a smaller, complete corpus of the external references of 5,000 Wikipedia articles (ref-wiki-en-5k). We sampled from 5 ranges of Wikidata items: Q1-10000, Q10001-100000, Q100001-1000000, Q1000001-10000000, and Q10000001-100000000. From each range we sampled 1,000 items. We then scraped the external reference URLs for the Wikipedia articles corresponding to these items and downloaded them. The resulting corpus contains 37,983 webpages.

Each line of the corpus (ref-wiki-en, ref-wiki-en-5k) encodes the webpage of an external reference in JSON format. Specifically, we provide:
- tstamp: when the webpage was accessed.
- host: the domain (FQDN, post-redirect) from which the webpage was retrieved.
- title: the title (meta) of the document.
- url: the URL (post-redirect) of the webpage.
- Q: the Q-code identifiers of the Wikidata items whose corresponding Wikipedia article is confirmed to link to this webpage.
- content: a plain-text encoding of the content of the webpage.
Below we provide an abbreviated example of a line from the corpus:

{"tstamp":"2019-09-26T01:22:43.621Z","host":"geology.isu.edu","title":"Digital Geology of Idaho - Basin And Range","url":"http://geology.isu.edu/Digital_Geology_Idaho/Module9/mod9.htm","Q":[810178],"content":"Digital Geology of Idaho - Basin And Range 1 - Idaho Basement Rock 2 - Belt Supergroup 3 - Rifting & Passive Margin 4 - Accreted Terranes 5 - Thrust Belt 6 - Idaho Batholith 7 - North Idaho & Mining 8 - Challis Volcanics 9 - Basin and Range 10 - Columbia River Basalts 11 - SRP & Yellowstone 12 - Pleistocene Glaciation 13 - Palouse & Lake Missoula 14 - Lake Bonneville Flood 15 - Snake River Plain Aquifer Basin and Range Province - Teritiary Extension General geology of the Basin and Range Province Mechanisms of Basin and Range faulting Idaho Basin and Range south of the Snake River Plain Idaho Basin and Range north of the Snake River Plain Local areas of active and recent Basin & Range faulting: Borah Peak PDF Slideshows: North of SRP , South of SRP , Borah Earthquake Flythroughs: Teton Valley , Henry's Fork , Big Lost River , Blackfoot , Portneuf , Raft River Valley , Bear River , Salmon Falls Creek , Snake River , Big Wood River Vocabulary Words thrust fault Basin and Range Snake River Plain half-graben transfer zone Fly-throughs General geology of the Basin and Range Province The Basin and Range Province generally includes most of eastern California, eastern Oregon, eastern Washington, Nevada, western Utah, southern and western Arizona, and southeastern Idaho. ..."}

A summary of the files we make available:
- ref-wiki-en.json.gz: 2,475,461 external reference webpages (JSON format)
- ref-wiki-en_urls.txt.gz: 23,036,318 unique raw links to external references (plain-text format)
- ref-wiki-en-5k.json.gz: 37,983 external reference webpages (JSON format)
- ref-wiki-en-5k_urls.json.gz: 70,375 unique raw links to external references (plain-text format)
- ref-wiki-en-5k_Q.txt.gz: 5,000 Wikidata Q identifiers forming the 5k dataset (plain-text format)
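A minimal sketch of streaming ref-wiki-en.json.gz, assuming each line is one JSON object with the fields described above:

```python
import gzip
import json

with gzip.open("ref-wiki-en.json.gz", "rt", encoding="utf-8") as f:
    for line in f:
        page = json.loads(line)
        # The "Q" field holds numeric Wikidata item ids confirmed to reference this page.
        for qid in page.get("Q", []):
            print(f"Q{qid}", page["url"], page["title"])
        break  # remove to process the whole corpus
```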
Further details can be found in the publication:
Suggesting References for Wikidata Claims based on Wikipedia's External References. Paolo Curotto, Aidan Hogan. Wikidata Workshop @ISWC 2020.
Further material relating to this publication (including code for a proof-of-concept interface) is also available.
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
This is a collection of pre-processed Wikidata JSON files that were used in the creation of the CSQA dataset (Ref: https://arxiv.org/abs/1801.10314).
Please refer to https://amritasaha1812.github.io/CSQA/download/ for more details.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
A dataset for the NIL-detection and NIL-disambiguation tasks.
The NILK dataset has two main features: 1) it marks NIL-mentions for NIL-detection by extracting mentions that belong to newly added entities in Wikipedia text; 2) it provides an entity label for NIL-disambiguation by marking NIL-mentions with Wikidata IDs from the newer dump.
Dataset files contain JSON objects of the following structure:
{"mention":"Walter Damrosch", "offset":348, "length":15, "context":"...the conductor Walter Damrosch. He scored the piece for the standard instruments of the symphony orchestra plus celesta, saxophone, and automobile horns...", "wikipedia_page_id":"309", "wikidata_id":"Q725579", "nil":false}
The dataset contains both linked and unlinked mentions; one can distinguish between them by checking the "nil" flag. To obtain NIL-mentions, we compared two Wikidata dumps, from 2017 and 2021. NIL-mentions carry a Wikidata ID from the 2021 dump, which can be used to check whether such mentions refer to the same entity.
The dataset was designed with Wikidata 2017 as the target knowledge base in mind: https://archive.org/download/wikibase-wikidatawiki-20170213/wikidata-20170213-all.json.gz
nilk_03_2023.zip contains the same data with longer contexts (unsplit).
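A minimal sketch of separating NIL-mentions from linked mentions via the "nil" flag, assuming a newline-delimited JSON file with the structure shown above; the file name is a placeholder:

```python
import json

nil_mentions, linked_mentions = [], []
with open("nilk.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        # Route each mention by its "nil" flag.
        (nil_mentions if record["nil"] else linked_mentions).append(record)

print(len(nil_mentions), "NIL mentions;", len(linked_mentions), "linked mentions")
```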
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Relation extraction dataset with its knowledge graph.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains biographical information derived from articles on English Wikipedia as it stood in early June 2024. It was created as part of the Structured Contents initiative at Wikimedia Enterprise and is intended for evaluation and research use.
The beta sample dataset is a subset of the Structured Contents Snapshot focusing on people with infoboxes in English Wikipedia, provided as JSON files (compressed in tar.gz).
We warmly welcome any feedback you have. Please share your thoughts, suggestions, and any issues you encounter on the discussion page for this dataset here on Kaggle.
Noteworthy included fields:
- name: title of the article.
- identifier: ID of the article.
- image: main image representing the article's subject.
- description: one-sentence description of the article for quick reference.
- abstract: lead section, summarizing what the article is about.
- infoboxes: parsed information from the side panel (infobox) on the Wikipedia article.
- sections: parsed sections of the article, including links. Note: excludes other media/images, lists, tables and references or similar non-prose sections.
The Wikimedia Enterprise Data Dictionary explains all of the fields in this dataset.
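A minimal sketch of scanning the archive, assuming (hypothetically) that the tar.gz contains newline-delimited JSON files whose objects carry the fields listed above; the archive name is a placeholder and the exact layout should be checked against the Data Dictionary:

```python
import json
import tarfile

with tarfile.open("structured-contents-people.tar.gz", "r:gz") as tar:
    for member in tar:
        if not member.name.endswith(".json"):
            continue
        fileobj = tar.extractfile(member)
        if fileobj is None:
            continue
        # Assumed: one JSON object per line describing one article.
        for line in fileobj:
            article = json.loads(line)
            print(article["identifier"], article["name"], article.get("description"))
        break  # remove to scan every file in the archive
```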
Infoboxes: 2 GB compressed, 11 GB uncompressed.
Infoboxes + sections + short description: 4.12 GB compressed, 21.28 GB uncompressed.
Article analysis and filtering breakdown:
- Total # of articles analyzed: 6,940,949
- # people found with QID: 1,778,226
- # people found with Category: 158,996
- # people found with Biography Project: 76,150
- Total # of people articles found: 2,013,372
- Total # people articles with infoboxes: 1,559,985

End stats:
- Total number of people articles in this dataset: 1,559,985
  - that have a short description: 1,416,701
  - that have an infobox: 1,559,985
  - that have article sections: 1,559,921
This dataset includes 235,146 people articles that exist on Wikipedia but aren't yet tagged on Wikidata as instance of:human.
This dataset was originally extracted from the Wikimedia Enterprise APIs on June 5, 2024. The information in this dataset may therefore be out of date. This dataset isn't being actively updated or maintained, and has been shared for community use and feedback. If you'd like to retrieve up-to-date Wikipedia articles or data from other Wikiprojects, get started with Wikimedia Enterprise's APIs
The dataset is built from the Wikimedia Enterprise HTML “snapshots”: https://enterprise.wikimedia.com/docs/snapshot/ and focuses on the Wikipedia article namespace (namespace 0 (main)).
Wikipedia is a human-generated corpus of free knowledge, written, edited, and curated by a global community of editors since 2001. It is the largest and most accessed educational resource in history, accessed over 20 billion times by half a billion people each month. Wikipedia represents almost 25 years of work by its community: the creation, curation, and maintenance of millions of articles on distinct topics. This dataset includes the biographical contents of the English Wikipedia language edition (https://en.wikipedia.org/), written by the community.
Wikimedia Enterprise provides this dataset under the assumption that downstream users will adhere to the relevant free culture licenses when the data is reused. In situations where attribution is required, reusers should identify the Wikimedia project from which the content was retrieved as the source of the content. Any attribution should adhere to Wikimedia’s trademark policy (available at https://foundation.wikimedia.org/wiki/Trademark_policy) and visual identity guidelines (ava...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A dataset of German parliamentary debates covering 74 years of plenary protocols across all 16 state parliaments of Germany as well as the German Bundestag. The debates are separated into individual speeches, which are enriched with metadata identifying the speaker as a member of parliament (MP).
When using this data set, please cite the original paper "Lange, K.-R., Jentsch, C. (2023). SpeakGer: A meta-data enriched speech corpus of German state and federal parliaments. Proceedings of the 3rd Workshop on Computational Linguistics for Political Text Analysis@KONVENS 2023.".
The metadata is separated into two different types: time-specific metadata that contains information only for a legislative period and can change over time (e.g. the party or constituency of an MP), and metadata that is considered fixed, such as the birth date or the name of a speaker. The former is stored along with the speeches, as it is considered temporal information of that point in time, but is additionally stored in the file all_mps_mapping.csv if there is a need to double-check something. The rest of the metadata is stored in the file all_mps_meta.csv. The metadata from this file can be matched with a speech by comparing the speaker ID variable "MPID". The speeches of each parliament are saved in CSV format. Along with the speeches, they contain the following metadata:
The file all_mps_meta.csv contains the following meta information:
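A minimal sketch of the MPID join described above, assuming pandas; file names other than all_mps_meta.csv are placeholders:

```python
import pandas as pd

speeches = pd.read_csv("bundestag_speeches.csv")  # hypothetical per-parliament speech file
mps_meta = pd.read_csv("all_mps_meta.csv")

# Attach the fixed speaker metadata (e.g. name, birth date) to each speech
# via the speaker ID variable "MPID".
enriched = speeches.merge(mps_meta, on="MPID", how="left")
print(enriched.head())
```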
data.gov.tw license: https://data.gov.tw/license
This dataset has been adjusted to match the city government's official website revamp, replacing the existing "New Taipei City Government Event Information" on the platform and adding attachment downloads; it does not include HTML markup. For details, please refer to the latest news and instruction files.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Identify fastest-growing Digital Downloads keywords on WooCommerce. Analyze trending scores to identify the most relevant search terms and stay ahead of market trends for your store.
Maptitude license agreement: https://www.caliper.com/license/maptitude-license-agreement.htm
Address point data for use with GIS mapping software, databases, and web applications are from Caliper Corporation and contain a point layer of over 48 million addresses in 22 states and the District of Columbia.
Maptitude license agreement: https://www.caliper.com/license/maptitude-license-agreement.htm
Area layers of US, Australia, and Canada building footprints for use with GIS mapping software, databases, and web applications.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Identify fastest-growing Digital Downloads keywords on Magento. Analyze trending scores to identify the most relevant search terms and stay ahead of market trends for your store.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Discover top-performing keywords for Digital Downloads on TikTok Shop. Analyze monthly growth rate rankings to discover trending search terms and capitalize on emerging opportunities for your store.
GNU LGPL v3.0: http://www.gnu.org/licenses/lgpl-3.0.html
We obtained descriptions and categories for 130,406 movies from Wikipedia using 1) a local Wikidata dump to find movie names and 2) the wikipediaapi library to download the description and categories for each movie.
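A minimal sketch of step 2, assuming the wikipediaapi package (Wikipedia-API on PyPI); the movie title and user-agent string are placeholders:

```python
import wikipediaapi

# Recent versions of the library require a user agent string.
wiki = wikipediaapi.Wikipedia(user_agent="movie-dataset-example", language="en")

page = wiki.page("The Matrix")  # example movie title
if page.exists():
    description = page.summary              # lead-section description
    categories = list(page.categories.keys())  # category titles
    print(description[:200], categories[:5])
```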
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
5logos is a dataset for object detection tasks - it contains Objects annotations for 3,717 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Maptitude license agreement: https://www.caliper.com/license/maptitude-license-agreement.htm
FREE layers of banking compliance data for the United States are now available for users of the current version of Maptitude. Three separate geographic files and one table are included in this download.