https://choosealicense.com/licenses/cc0-1.0/
🚀 Hugging Face Uploader: Streamline Your Model Sharing! 🚀
This tool provides a user-friendly way to upload files directly to your Hugging Face repositories. Whether you prefer the interactive environment of a Jupyter Notebook or the command-line efficiency of a Python script, we've got you covered. We've designed it to streamline your workflow and make sharing your models, datasets, and spaces easier than ever before! Will be more consistently updated here:… See the full description on the dataset page: https://huggingface.co/datasets/EarthnDusk/Huggingface_Uploader.
Dataset Creation Scripts
Ready-to-run scripts for creating Hugging Face datasets from local files.
Available Scripts
📄 pdf-to-dataset.py
Convert directories of PDF files into Hugging Face datasets. Features:
📁 Uploads PDFs as dataset objects for flexible processing
🏷️ Automatic labeling from folder structure
🚀 Zero configuration - just point at your PDFs
📤 Direct upload to Hugging Face Hub
Usage:
uv run pdf-to-dataset.py /path/to/pdfs
… See the full description on the dataset page: https://huggingface.co/datasets/uv-scripts/dataset-creation.
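The "automatic labeling from folder structure" feature can be illustrated with a minimal sketch. This is not the script's actual implementation: the function name `label_pdfs` and the fallback label "unlabeled" are assumptions made for illustration. It only shows the general idea of taking each PDF's parent folder name as its label.

```python
from pathlib import Path


def label_pdfs(root: Path) -> list[dict[str, str]]:
    """Pair each PDF under `root` with a label taken from its parent folder.

    Assumption: PDFs that sit directly in `root` get the label "unlabeled".
    """
    rows = []
    for pdf in sorted(root.rglob("*.pdf")):
        parent = pdf.parent
        label = parent.name if parent != root else "unlabeled"
        rows.append({"file": str(pdf.relative_to(root)), "label": label})
    return rows
```

For a layout such as `pdfs/invoices/a.pdf`, calling `label_pdfs(Path("pdfs"))` would pair `a.pdf` with the label `invoices`.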
albertvillanova/tmp-file-upload dataset hosted on Hugging Face and contributed by the HF Datasets community
MahnoorMalik/test-uploading-jsonl-file-for-preview-dataset-final dataset hosted on Hugging Face and contributed by the HF Datasets community
Experimental Results
This directory stores the output files from running inference and evaluation scripts.
Uploading Results to Hugging Face
To back up or share your results, you can upload the entire results/ directory to a Hugging Face dataset repository:
docker run --rm
-v… See the full description on the dataset page: https://huggingface.co/datasets/jd0g/nlistral-7b-results.
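The Docker command above is truncated in this listing, so it is not reconstructed here. As an alternative, the following is a rough sketch of uploading a results/ directory with the `huggingface_hub` Python library. The function name `upload_results` and the repository ID in the usage note are hypothetical, and the sketch assumes you are already authenticated (for example through `huggingface-cli login` or the `HF_TOKEN` environment variable).

```python
from pathlib import Path


def upload_results(results_dir: str, repo_id: str, api=None) -> None:
    """Upload a local results directory to a Hugging Face dataset repository.

    `api` defaults to huggingface_hub.HfApi(); passing another object with the
    same two methods allows a dry run without network access.
    """
    folder = Path(results_dir)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    if api is None:
        # Imported lazily; requires `pip install huggingface_hub`.
        from huggingface_hub import HfApi

        api = HfApi()
    # Create the dataset repository if it does not exist yet.
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    # Upload every file under the folder, keeping the folder name in the repo.
    api.upload_folder(
        folder_path=str(folder),
        repo_id=repo_id,
        repo_type="dataset",
        path_in_repo=folder.name,
    )
```

A call such as `upload_results("results", "your-username/your-results")` would place the files under `results/` in that dataset repository; the repository name here is a placeholder.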
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
File Storage Dataset
This dataset is used for file storage purposes.
Files
This dataset contains uploaded files organized in the uploads directory.
https://choosealicense.com/licenses/other/
English Version:
Note: Because of network issues, we are still uploading the sample data to this Hugging Face repository. In the meantime, the sample data is available at:
📌 ModelScope Dataset: https://www.modelscope.cn/datasets/fh2678713685/DH-FaceVid-1K_Sample/files
⚠ Important: The uploaded files consist of two split archive parts (.tar.gz format). Users must download both parts, merge them, and then extract the final dataset.
How to Merge & Extract:
Download… See the full description on the dataset page: https://huggingface.co/datasets/jjuik2014/DH-FaceVid-Sample.
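The merge instructions are truncated in this listing. As a hedged illustration only, the sketch below assumes the two parts are a plain byte-wise split of a single .tar.gz file, so that concatenating them in order restores the archive. The function name `merge_and_extract` and all file names are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def merge_and_extract(parts: list[Path], merged: Path, out_dir: Path) -> None:
    """Concatenate split archive parts in the given order, then extract them."""
    with merged.open("wb") as dst:
        for part in parts:
            with part.open("rb") as src:
                # Stream the bytes so large parts are never held in memory.
                shutil.copyfileobj(src, dst)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Only extract archives from a source you trust.
    with tarfile.open(merged, "r:gz") as tar:
        tar.extractall(out_dir)
```

The order of `parts` matters: passing the second part first would produce a file that is not a valid archive.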
Dataset Card for introvoyz041/drug_development_supported_by_informatics
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 358
Number of PDFs processed: 1
Sample size per PDF: 100
Created on: 2025-06-16 15:39:43
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to… See the full description on the dataset page: https://huggingface.co/datasets/introvoyz041/drug_development_supported_by_informatics.
Dataset Card for nirantk/coldocs-fin-sample
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 200
Number of PDFs processed: 1
Sample size per PDF: 100
Created on: 2024-10-30 01:19:36
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images… See the full description on the dataset page: https://huggingface.co/datasets/nirantk/coldocs-fin-sample.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BHI SISR Dataset
Content
HR Dataset Used Datasets Tiling BHI Filtering Files Upload
Corresponding LR Sets Trained models
HR Dataset
The BHI SISR Dataset is intended for training single-image super-resolution models. It is the result of tests of my BHI filtering method, which I described in a Hugging Face community blog post. In brief: removing (by filtering) only the worst-quality tiles from a training set has a much bigger… See the full description on the dataset page: https://huggingface.co/datasets/Phips/BHI.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
FineCode: A High-Quality Code Dataset
Disclaimer: No big files have been uploaded yet. The single upload is only an example of the format; it does not contain all of the highest-quality code or the final version.
Overview
FineCode is a meticulously curated dataset aimed at providing high-quality code for training and benchmarking code generation models. While many code datasets exist on Hugging Face, the quality of code varies significantly. FineCode seeks to address this by rigorously… See the full description on the dataset page: https://huggingface.co/datasets/jayan12k/Finecode.
Dataset Card for davanstrien/ufo
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 212
Number of PDFs processed: 109
Sample size per PDF: 10
Created on: 2024-09-19 20:46:12
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images Converter.… See the full description on the dataset page: https://huggingface.co/datasets/davanstrien/ufo.
Dataset Card for zohaibterminator/9th-grade-chem
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 53
Number of PDFs processed: 1
Sample size per PDF: 100
Created on: 2025-05-27 12:51:55
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images… See the full description on the dataset page: https://huggingface.co/datasets/zohaibterminator/9th-grade-chem.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Annotation
We resized the dataset to 1080p to make uploading easier, so the original annotation file might not match the video names. For details, see https://github.com/PKU-YuanGroup/Open-Sora-Plan/issues/312#issuecomment-2197312973
Pexels
Pexels consists of multiple folders, but each folder exceeds the size limit for Hugging Face uploads. Therefore, we divided each folder into 5 parts. You need to merge the 5 parts of each folder first, and then extract each… See the full description on the dataset page: https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0.
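Before merging, it can help to confirm that all five parts of each folder are present and in the right order. The sketch below groups part file names by archive and sorts them numerically. The naming scheme `<folder>.tar.part<N>` is purely an assumption for illustration (the real file names are not shown in this listing), as is the function name `group_parts`.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: "<folder>.tar.part<N>"; adjust to the real names.
PART_RE = re.compile(r"^(?P<base>.+)\.part(?P<idx>\d+)$")


def group_parts(names: list[str], expected: int = 5) -> dict[str, list[str]]:
    """Group part file names by archive and order them by part index."""
    groups: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for name in names:
        match = PART_RE.match(name)
        if match:  # files that are not parts are ignored
            groups[match["base"]].append((int(match["idx"]), name))
    ordered: dict[str, list[str]] = {}
    for base, items in groups.items():
        if len(items) != expected:
            raise ValueError(f"{base}: found {len(items)} parts, expected {expected}")
        # Sort numerically so that part10 comes after part2.
        ordered[base] = [name for _, name in sorted(items)]
    return ordered
```

Each resulting list can then be concatenated in order to rebuild that folder's archive before extraction.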
Dataset Card for axondendriteplus/LightRAG-DAPO-ScalingLaws
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 62
Number of PDFs processed: 3
Sample size per PDF: 100
Created on: 2025-05-13 10:15:11
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to… See the full description on the dataset page: https://huggingface.co/datasets/axondendriteplus/LightRAG-DAPO-ScalingLaws.
Dataset Card for Pran10/Statista
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 250
Number of PDFs processed: 12
Sample size per PDF: 100
Created on: 2024-11-12 01:04:56
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images Converter.… See the full description on the dataset page: https://huggingface.co/datasets/Pran10/Statista.
Dataset Card for sebgrima/britishhland
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 626
Number of PDFs processed: 4
Sample size per PDF: 100
Created on: 2024-12-01 19:28:20
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images… See the full description on the dataset page: https://huggingface.co/datasets/sebgrima/britishhland.
Dataset Card for melvinwevers/iln
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 48
Number of PDFs processed: 3
Sample size per PDF: 100
Created on: 2024-11-15 14:17:07
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images Converter. Each… See the full description on the dataset page: https://huggingface.co/datasets/melvinwevers/iln.
Dataset Card for atitaarora/state-of-ai-2024
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 212
Number of PDFs processed: 1
Sample size per PDF: 100
Created on: 2024-10-11 15:05:25
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page Images… See the full description on the dataset page: https://huggingface.co/datasets/atitaarora/state-of-ai-2024.
Dataset Card for axondendriteplus/Legal-AI-K-Hub
Dataset Description
This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
Number of images: 4245
Number of PDFs processed: 175
Sample size per PDF: 100
Created on: 2025-05-15 11:26:43
Dataset Creation
Source Data
The images in this dataset were generated from user-uploaded PDF files.
Processing Steps
PDF files were uploaded to the PDFs to Page… See the full description on the dataset page: https://huggingface.co/datasets/axondendriteplus/Legal-AI-K-Hub.