jasong03/data-upload dataset hosted on Hugging Face and contributed by the HF Datasets community
not-lain/test-parquet-upload-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
A demo that saves data from a Space to a Dataset. The goal is to provide reusable code snippets.
Documentation: https://huggingface.co/docs/huggingface_hub/main/en/guides/upload#scheduled-uploads
Space: https://huggingface.co/spaces/Wauplin/space_to_dataset_saver/
JSON dataset: https://huggingface.co/datasets/Wauplin/example-commit-scheduler-json
Image dataset: https://huggingface.co/datasets/Wauplin/example-commit-scheduler-image
Image (zipped) dataset: … See the full description on the dataset page: https://huggingface.co/datasets/Wauplin/example-space-to-dataset-image-zip.
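The pattern the demo relies on is `CommitScheduler` from `huggingface_hub`, which commits a local folder to a Hub repo on a fixed interval from a background thread. A minimal sketch follows, assuming a local buffer folder and a dataset repo you can write to; the folder name, repo id, and `save_json` helper are illustrative and not taken from the Space's source.

```python
import json
import uuid
from pathlib import Path

from huggingface_hub import CommitScheduler

# Local folder that buffers data between scheduled commits (name is an assumption).
JSON_DATASET_DIR = Path("json_dataset")
JSON_DATASET_DIR.mkdir(parents=True, exist_ok=True)
JSON_DATASET_PATH = JSON_DATASET_DIR / f"data-{uuid.uuid4()}.json"

# Push the folder to a dataset repo every 5 minutes in a background thread.
scheduler = CommitScheduler(
    repo_id="your-username/example-commit-scheduler-json",  # placeholder repo id
    repo_type="dataset",
    folder_path=JSON_DATASET_DIR,
    path_in_repo="data",
    every=5,  # minutes between commits
)

def save_json(greeting: str) -> None:
    # Hold the scheduler's lock so a commit never captures a half-written file.
    with scheduler.lock:
        with JSON_DATASET_PATH.open("a") as f:
            json.dump({"greeting": greeting}, f)
            f.write("\n")
```

In a Space, `save_json` would be called from the app's event handler; the scheduler then commits whatever has accumulated in the folder since the last push.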
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains images used in the documentation of HuggingFace's libraries.
HF Team: Please make sure you optimize the assets before uploading them. My favorite tool for this is https://tinypng.com/.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
📺 YouTube-Commons 📺
YouTube-Commons is a collection of audio transcripts of 2,063,066 videos shared on YouTube under a CC-By license.
Content
The collection comprises 22,709,724 original and automatically translated transcripts from 3,156,703 videos (721,136 individual channels). In total, this represents nearly 45 billion words (44,811,518,375). All the videos were shared on YouTube under a CC-BY license: the dataset provides all the necessary provenance information… See the full description on the dataset page: https://huggingface.co/datasets/PleIAs/YouTube-Commons.
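Given the scale (nearly 45 billion words), streaming is the practical way to peek at the transcripts without downloading everything. A minimal sketch, assuming the default train split; the fields of each record depend on the dataset's actual schema, which is not reproduced here.

```python
from itertools import islice

from datasets import load_dataset

# Stream records instead of downloading the full corpus locally.
ds = load_dataset("PleIAs/YouTube-Commons", split="train", streaming=True)

# Inspect the first few records to discover the available fields.
for record in islice(ds, 3):
    print(record.keys())
```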
peopleofverso/upload-divise-final-training-data dataset hosted on Hugging Face and contributed by the HF Datasets community
567-labs/upload-test dataset hosted on Hugging Face and contributed by the HF Datasets community
peopleofverso/test-upload-final-training-data dataset hosted on Hugging Face and contributed by the HF Datasets community
Nitin12340/my-colab-upload dataset hosted on Hugging Face and contributed by the HF Datasets community
peopleofverso/diagnostic-upload-training-data dataset hosted on Hugging Face and contributed by the HF Datasets community
MoritzLaurer/upload-test dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
jholst/test-upload dataset hosted on Hugging Face and contributed by the HF Datasets community
pavitemple/youtube-videos dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0: https://choosealicense.com/licenses/cc0-1.0/
Hugging Face Uploader: Streamline Your Model Sharing!
This tool provides a user-friendly way to upload files directly to your Hugging Face repositories. Whether you prefer the interactive environment of a Jupyter Notebook or the command-line efficiency of a Python script, we've got you covered. We've designed it to streamline your workflow and make sharing your models, datasets, and spaces easier than ever before! It will be more consistently updated here: … See the full description on the dataset page: https://huggingface.co/datasets/EarthnDusk/Huggingface_Uploader.
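Under the hood, an upload like this reduces to a single `huggingface_hub` call. The sketch below is not the uploader's own code, only the underlying API it builds on; the repo id and local path are placeholders.

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up the token from `huggingface-cli login` or the HF_TOKEN env var
api.upload_folder(
    folder_path="./my_model",           # local files to push (placeholder path)
    repo_id="your-username/my-model",   # destination repository (placeholder id)
    repo_type="model",                  # "dataset" and "space" work the same way
    commit_message="Upload model files",
)
```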
Dataset Creation Scripts
Ready-to-run scripts for creating Hugging Face datasets from local files.
Available Scripts
pdf-to-dataset.py
Convert directories of PDF files into Hugging Face datasets. Features:
- Uploads PDFs as dataset objects for flexible processing
- Automatic labeling from folder structure
- Zero configuration: just point at your PDFs
- Direct upload to the Hugging Face Hub
A rough sketch of this conversion flow follows the usage example below.
Usage:
uv run pdf-to-dataset.py /path/to/pdfs … See the full description on the dataset page: https://huggingface.co/datasets/uv-scripts/dataset-creation.
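For orientation, here is a rough sketch of what a PDF-folder-to-dataset conversion can look like with the `datasets` library. This is not the `pdf-to-dataset.py` implementation: the helper name, the raw-bytes column, and the repo id are assumptions.

```python
from pathlib import Path

from datasets import Dataset

def build_pdf_dataset(root: str) -> Dataset:
    # Collect every PDF under `root`, labeling each file by its parent folder.
    records = []
    for pdf_path in Path(root).rglob("*.pdf"):
        records.append({
            "filename": pdf_path.name,
            "pdf": pdf_path.read_bytes(),   # raw bytes, for flexible downstream processing
            "label": pdf_path.parent.name,  # label derived from the folder structure
        })
    return Dataset.from_list(records)

ds = build_pdf_dataset("/path/to/pdfs")
ds.push_to_hub("your-username/my-pdf-dataset")  # requires a write token / login
```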
Experimental Results
This directory stores the output files from running inference and evaluation scripts.
Uploading Results to Hugging Face
To back up or share your results, you can upload the entire results/ directory to a Hugging Face dataset repository:
docker run --rm -v … See the full description on the dataset page: https://huggingface.co/datasets/jd0g/nlistral-7b-results.
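If you prefer not to go through the container, the same back-up can be done directly with `huggingface_hub`; the sketch below assumes it runs from the project root, and the repo id is a placeholder rather than the repository linked above.

```python
from huggingface_hub import HfApi

api = HfApi()  # authenticates via HF_TOKEN or a cached login
# Create the destination dataset repo if it does not exist yet (placeholder id).
api.create_repo("your-username/experiment-results", repo_type="dataset", exist_ok=True)
api.upload_folder(
    folder_path="results",                        # the local results/ directory
    repo_id="your-username/experiment-results",
    repo_type="dataset",
    commit_message="Back up inference and evaluation results",
)
```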
waris-gill/csv-upload-processed_quora_train dataset hosted on Hugging Face and contributed by the HF Datasets community
nnuochen/scene-upload dataset hosted on Hugging Face and contributed by the HF Datasets community
waris-gill/csv-upload-newer_semantic_test_cases dataset hosted on Hugging Face and contributed by the HF Datasets community