100+ datasets found

random-ai-sheets
huggingface.co
Updated Jun 10, 2025
Cite
Julien Chaumond (2025). random-ai-sheets [Dataset]. https://huggingface.co/datasets/julien-c/random-ai-sheets
Dataset updated
Jun 10, 2025
Authors
Julien Chaumond
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
julien-c/random-ai-sheets dataset hosted on Hugging Face and contributed by the HF Datasets community
sheets-mcp
huggingface.co
Updated Jun 17, 2025
Cite
Daniel Vila (2025). sheets-mcp [Dataset]. https://huggingface.co/datasets/dvilasuero/sheets-mcp
Dataset updated
Jun 17, 2025
Authors
Daniel Vila
Description
dvilasuero/sheets-mcp dataset hosted on Hugging Face and contributed by the HF Datasets community
Medical_Challenge_Questions
huggingface.co
Updated Jun 10, 2025
Cite
Hugging Face Sheets (2025). Medical_Challenge_Questions [Dataset]. https://huggingface.co/datasets/aisheets/Medical_Challenge_Questions
Dataset updated
Jun 10, 2025
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Hugging Face Sheets
Description
aisheets/Medical_Challenge_Questions dataset hosted on Hugging Face and contributed by the HF Datasets community
image2struct-musicsheet-v1
huggingface.co
Cite
Stanford CRFM, image2struct-musicsheet-v1 [Dataset]. https://huggingface.co/datasets/stanford-crfm/image2struct-musicsheet-v1
Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset authored and provided by
Stanford CRFM
Description
Image2Struct - Music Sheet

Paper | Website | Datasets (Webpages, Latex, Music sheets) | Leaderboard | HELM repo | Image2Struct repo

License: Apache License Version 2.0, January 2004

Dataset description

Image2struct is a benchmark for evaluating vision-language models in practical tasks of extracting structured information from images. This subdataset focuses on Music sheets. The model is given an image of the expected output with the prompt: Please generate the Lilypond… See the full description on the dataset page: https://huggingface.co/datasets/stanford-crfm/image2struct-musicsheet-v1.
Day_to_Day_Objects_isometric_skeumorphic_3d_bnb
huggingface.co
Updated Jun 26, 2025
Cite
Hugging Face Sheets (2025). Day_to_Day_Objects_isometric_skeumorphic_3d_bnb [Dataset]. https://huggingface.co/datasets/aisheets/Day_to_Day_Objects_isometric_skeumorphic_3d_bnb
Dataset updated
Jun 26, 2025
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Hugging Face Sheets
Description
Built with https://huggingface.co/spaces/aisheets/sheets and this config:

columns:
  object_name:
    modelName: meta-llama/Llama-3.3-70B-Instruct
    modelProvider: groq
    userPrompt: Generate the name of a common day to day object
    prompt: >
      You are a rigorous text-generation engine. Generate only the requested output format, with no explanations following the user instruction and avoiding repetition of the existing responses at the end of the prompt.

# User… See the full description on the dataset page: https://huggingface.co/datasets/aisheets/Day_to_Day_Objects_isometric_skeumorphic_3d_bnb.
Womens_Lives_Across_Centuries
huggingface.co
Updated Jun 4, 2025
Cite
Hugging Face Sheets (2025). Womens_Lives_Across_Centuries [Dataset]. https://huggingface.co/datasets/aisheets/Womens_Lives_Across_Centuries
Dataset updated
Jun 4, 2025
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Hugging Face Sheets
Description
aisheets/Womens_Lives_Across_Centuries dataset hosted on Hugging Face and contributed by the HF Datasets community
2D_Video_Game_Cartoon_Character_Sprite-Sheets
huggingface.co
Updated Feb 18, 2024
Cite
Muhammad Turner Gane (2024). 2D_Video_Game_Cartoon_Character_Sprite-Sheets [Dataset]. https://huggingface.co/datasets/mgane/2D_Video_Game_Cartoon_Character_Sprite-Sheets
Explore at: Croissant
Dataset updated
Feb 18, 2024
Authors
Muhammad Turner Gane
Description
Dataset Card for Dataset Name

Dataset Details

Experimental composition of 76 cartoon art-style video game character spritesheets. Resized to 512x512, mixed variation of animation styles.

Dataset Description

All images were edited using the Tiled image editing software, as most assets are typically downloaded individually and not in sequence. I compiled each animation sequence into one image to display animations frame-by-frame, evenly distributed across some common… See the full description on the dataset page: https://huggingface.co/datasets/mgane/2D_Video_Game_Cartoon_Character_Sprite-Sheets.
vibench
huggingface.co
Updated Jan 20, 2001
Cite
Hugging Face Sheets (2001). vibench [Dataset]. https://huggingface.co/datasets/aisheets/vibench
Dataset updated
Jan 20, 2001
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Hugging Face Sheets
Description
aisheets/vibench dataset hosted on Hugging Face and contributed by the HF Datasets community
loc-nineteenth-century-song-sheets
huggingface.co
Updated Oct 1, 2024
Cite
loc-nineteenth-century-song-sheets [Dataset]. https://huggingface.co/datasets/davanstrien/loc-nineteenth-century-song-sheets
Explore at: Croissant
Dataset updated
Oct 1, 2024
Authors
Daniel van Strien
Description
davanstrien/loc-nineteenth-century-song-sheets dataset hosted on Hugging Face and contributed by the HF Datasets community
Day_to_Day_Objects_Isometric_Logos
huggingface.co
Updated Jun 26, 2025
Cite
Hugging Face Sheets (2025). Day_to_Day_Objects_Isometric_Logos [Dataset]. https://huggingface.co/datasets/aisheets/Day_to_Day_Objects_Isometric_Logos
Dataset updated
Jun 26, 2025
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Hugging Face Sheets
Description
aisheets/Day_to_Day_Objects_Isometric_Logos dataset hosted on Hugging Face and contributed by the HF Datasets community
crypto-charts
huggingface.co
Updated Jan 25, 2025
Cite
Stephan Akkerman (2025). crypto-charts [Dataset]. https://huggingface.co/datasets/StephanAkkerman/crypto-charts
Explore at: Croissant
Dataset updated
Jan 25, 2025
Authors
Stephan Akkerman
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Crypto Charts

This dataset is a collection of a sample of images from tweets that I scraped using my Discord bot that keeps track of financial influencers on Twitter. The data consists mainly of images that are cryptocurrency charts. This dataset can be used for a wide variety of tasks, such as image classification or feature extraction.

FinTwit Charts Collection

This dataset is part of a larger collection of datasets, scraped from Twitter and labeled by a human (me).… See the full description on the dataset page: https://huggingface.co/datasets/StephanAkkerman/crypto-charts.
awesome-chatgpt-prompts
huggingface.co
Updated Dec 15, 2023
Cite
Fatih Kadir Akın (2023). awesome-chatgpt-prompts [Dataset]. https://huggingface.co/datasets/fka/awesome-chatgpt-prompts
Explore at: Croissant
Dataset updated
Dec 15, 2023
Authors
Fatih Kadir Akın
License
https://choosealicense.com/licenses/cc0-1.0/
Description
🧠 Awesome ChatGPT Prompts [CSV dataset]

This is a Dataset Repository of Awesome ChatGPT Prompts. View All Prompts on GitHub.

License

CC-0
Data from: stock-charts
huggingface.co
Updated Jan 25, 2025
Cite
Stephan Akkerman (2025). stock-charts [Dataset]. https://huggingface.co/datasets/StephanAkkerman/stock-charts
Explore at: Croissant
Dataset updated
Jan 25, 2025
Authors
Stephan Akkerman
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Stock Charts

This dataset is a collection of a sample of images from tweets that I scraped using my Discord bot that keeps track of financial influencers on Twitter. The data consists of images that were part of tweets that mentioned a stock. This dataset can be used for a wide variety of tasks, such as image classification or feature extraction.

FinTwit Charts Collection

This dataset is part of a larger collection of datasets, scraped from Twitter and labeled by a… See the full description on the dataset page: https://huggingface.co/datasets/StephanAkkerman/stock-charts.
image2struct-latex-v1
huggingface.co
Updated Mar 15, 2024
Cite
Stanford CRFM (2024). image2struct-latex-v1 [Dataset]. https://huggingface.co/datasets/stanford-crfm/image2struct-latex-v1
Explore at: Croissant
Dataset updated
Mar 15, 2024
Dataset authored and provided by
Stanford CRFM
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Image2Struct - Latex

Paper | Website | Datasets (Webpages, Latex, Music sheets) | Leaderboard | HELM repo | Image2Struct repo

License: Apache License Version 2.0, January 2004

Dataset description

Image2struct is a benchmark for evaluating vision-language models in practical tasks of extracting structured information from images. This subdataset focuses on LaTeX code. The model is given an image of the expected output with the prompt: Please provide the LaTex code used to… See the full description on the dataset page: https://huggingface.co/datasets/stanford-crfm/image2struct-latex-v1.
SDS-Gloves-Classification
huggingface.co
Updated Sep 22, 2024
Cite
SDS-Gloves-Classification [Dataset]. https://huggingface.co/datasets/BASF-AI/SDS-Gloves-Classification
Explore at: Croissant
Dataset updated
Sep 22, 2024
Dataset authored and provided by
BASF (http://basf.com/)
License
https://choosealicense.com/licenses/gpl-2.0/
Description
Safety Data Sheets Gloves Classification

This dataset contains Safety Data Sheets (SDS) sourced from Kaggle, consisting of over 200,000 documents. SDS are detailed documents providing essential information on the properties and hazards of chemicals, ensuring user safety and compliance with regulatory standards. A subset of these documents was pre-processed, cleaned, and annotated to classify whether protective gloves are required when handling materials. The labels were extracted… See the full description on the dataset page: https://huggingface.co/datasets/BASF-AI/SDS-Gloves-Classification.
ChartQA
huggingface.co
Updated Oct 4, 2024
Cite
LMMs-Lab (2024). ChartQA [Dataset]. https://huggingface.co/datasets/lmms-lab/ChartQA
Explore at: Croissant
Dataset updated
Oct 4, 2024
Dataset authored and provided by
LMMs-Lab
Description
Large-scale Multi-modality Models Evaluation Suite

Accelerating the development of large-scale multi-modality models (LMMs) with lmms-eval

🏠 Homepage | 📚 Documentation | 🤗 Huggingface Datasets

This Dataset

This is a formatted version of ChartQA. It is used in our lmms-eval pipeline to allow for one-click evaluations of large multi-modality models. @article{masry2022chartqa, title={ChartQA: A benchmark for question answering about charts with visual and… See the full description on the dataset page: https://huggingface.co/datasets/lmms-lab/ChartQA.
service-public
huggingface.co
Updated Jul 2, 2025
Cite
AgentPublic (2025). service-public [Dataset]. https://huggingface.co/datasets/AgentPublic/service-public
Dataset updated
Jul 2, 2025
Dataset authored and provided by
AgentPublic
License
https://choosealicense.com/licenses/etalab-2.0/
Description
🇫🇷 Service-Public.fr practical sheets Dataset (Administrative Procedures)

This dataset is derived from the official Service-Public.fr platform and contains practical information sheets and resources targeting both individuals (Particuliers) and entrepreneurs (Entreprendre). The purpose of these sheets is to provide information on administrative procedures relating to a number of themes. The data is publicly available on data.gouv.fr and has been processed and chunked for… See the full description on the dataset page: https://huggingface.co/datasets/AgentPublic/service-public.
RAG-Chunk-Analysis
huggingface.co
Updated Nov 20, 2024
Cite
RAG-Chunk-Analysis [Dataset]. https://huggingface.co/datasets/CGIAR/RAG-Chunk-Analysis
Dataset updated
Nov 20, 2024
Dataset authored and provided by
CGIAR (http://cgiar.org/)
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Description

The datasets contain human evaluations of chunks retrieved from agriculture documents for actual user queries. Each chunk is marked as relevant or irrelevant. The relevant and irrelevant portions of each chunk are given in separate columns. The dataset consists of multiple XLS files, and each XLS file has multiple sheets corresponding to the content for the value chain. The queries are taken from actual user questions on farmer.chat prototype bots. For each… See the full description on the dataset page: https://huggingface.co/datasets/CGIAR/RAG-Chunk-Analysis.
ChartX
huggingface.co
Updated Sep 13, 2024
Cite
Alpha-Innovator Lab (2024). ChartX [Dataset]. https://huggingface.co/datasets/U4R/ChartX
Explore at: Croissant
Dataset updated
Sep 13, 2024
Dataset authored and provided by
Alpha-Innovator Lab
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

[ Related Paper ] [ Website ] [Models 🤗(Hugging Face)]

ChartX & ChartVLM

Recently, many versatile Multi-modal Large Language Models (MLLMs) have emerged continuously. However, their capacity to query information depicted in visual charts and engage in reasoning based on the queried contents remains under-explored. In this paper, to comprehensively and rigorously benchmark the ability… See the full description on the dataset page: https://huggingface.co/datasets/U4R/ChartX.
wikimusictext
huggingface.co
Updated Jun 29, 1998
Cite
wikimusictext [Dataset]. https://huggingface.co/datasets/sander-wood/wikimusictext
Explore at: Croissant
Dataset updated
Jun 29, 1998
Authors
Shangda Wu (Sander Wood)
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
We introduce WikiMT-X, an enhanced version of WikiMusicText (WikiMT) with audio recordings, richer text annotations, and improved genre labels. Explore it here: WikiMT-X on Hugging Face.

Dataset Summary

In CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval, we introduce WikiMusicText (WikiMT), a new dataset for the evaluation of semantic search and music classification. It includes 1010 lead sheets in ABC notation sourced from… See the full description on the dataset page: https://huggingface.co/datasets/sander-wood/wikimusictext.