Codyfederer/ml-dataset-cli-test dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
MmCows: A Multimodal Dataset for Dairy Cattle Monitoring
Details of the dataset and benchmarks are available here. For a quick overview of the dataset, please check this video.
Instructions for downloading
1. Install requirements
pip install huggingface_hub
See the file structure here for the next step.
2. Download a file individually
To download visual_data.zip to your local directory, use the command line:
huggingface-cli download neis-lab/mmcows \… See the full description on the dataset page: https://huggingface.co/datasets/neis-lab/mmcows.
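Alternatively, the same file can be fetched from Python with huggingface_hub; a minimal sketch, where the target directory is a placeholder and only the file name visual_data.zip comes from the card above:
from huggingface_hub import hf_hub_download

# Fetch a single file from the dataset repo (local_dir is a placeholder)
hf_hub_download(
    repo_id="neis-lab/mmcows",
    filename="visual_data.zip",
    repo_type="dataset",
    local_dir="./mmcows",
)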
AmirMohseni/CLI-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for aiornot
Dataset for the aiornot competition. By accessing this dataset, you accept the rules of the AI or Not competition. Please note that the dataset may contain images which are not considered safe for work.
Usage
With Hugging Face Datasets 🤗
You can download and use this dataset using the datasets library. 📝 Note: You must be logged in to your Hugging Face account for the snippet below to work. You can do this with huggingface-cli login or… See the full description on the dataset page: https://huggingface.co/datasets/competitions/aiornot.
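As a sketch of that snippet, assuming the standard datasets API and a prior login (the split names are not given in this excerpt):
from datasets import load_dataset

# Requires a prior huggingface-cli login; the competition gates access
ds = load_dataset("competitions/aiornot")
print(ds)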
CC0 1.0 https://choosealicense.com/licenses/cc0-1.0/
Beginner
ML training data for the Roman Microlensing Data Challenge 2025 - Beginner tier.
Uploading
CLI:
brew install huggingface-cli
hf auth login
hf upload RGES-PIT/Beginner . --repo-type=dataset
Python:
import os
from huggingface_hub import HfApi

api = HfApi(token=os.getenv("HF_TOKEN"))
api.upload_folder(
    folder_path="/path/to/local/dataset"… See the full description on the dataset page: https://huggingface.co/datasets/RGES-PIT/Beginner.
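A completed version of that call might look like the following; everything past the truncation (the repo_id and the closing arguments) is an assumption inferred from the CLI command above:
import os
from huggingface_hub import HfApi

api = HfApi(token=os.getenv("HF_TOKEN"))
api.upload_folder(
    folder_path="/path/to/local/dataset",  # assumed placeholder path
    repo_id="RGES-PIT/Beginner",           # taken from the CLI command above
    repo_type="dataset",
)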
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
BM25 and embedding indexes used in BrowseComp-Plus. To download the indexes:
huggingface-cli download Tevatron/browsecomp-plus-indexes --repo-type=dataset --include="bm25/*" --local-dir ./indexes
huggingface-cli download Tevatron/browsecomp-plus-indexes --repo-type=dataset --include="qwen3-embedding-0.6b/*" --local-dir ./indexes
huggingface-cli download Tevatron/browsecomp-plus-indexes --repo-type=dataset --include="qwen3-embedding-4b/*" --local-dir ./indexes
huggingface-cli download… See the full description on the dataset page: https://huggingface.co/datasets/Tevatron/browsecomp-plus-indexes.
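The same selective download can be done from Python; a sketch using snapshot_download with allow_patterns, mirroring the --include filters above:
from huggingface_hub import snapshot_download

# Download only the BM25 index; swap the pattern for the embedding indexes
snapshot_download(
    repo_id="Tevatron/browsecomp-plus-indexes",
    repo_type="dataset",
    allow_patterns=["bm25/*"],
    local_dir="./indexes",
)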
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Wireframe Dataset
This is the Wireframe dataset hosted on Hugging Face Hub.
Summary
Wireframe dataset with image annotations including line segments. The dataset is stored as JSONL files (train/metadata.jsonl, test/metadata.jsonl) and images. Number of samples:
Train: 5,000
Test: 462
Download
Download with huggingface-hub
python3 -m pip install huggingface-hub
huggingface-cli download --repo-type dataset lh9171338/Wireframe --local-dir ./
Download with Git… See the full description on the dataset page: https://huggingface.co/datasets/lh9171338/Wireframe.
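Once downloaded, the JSONL metadata can be read line by line; a minimal sketch (the record fields are not documented in this excerpt, so it only prints the keys):
import json

with open("train/metadata.jsonl") as f:
    for line in f:
        record = json.loads(line)
        print(record.keys())  # e.g. image path and line-segment annotations
        break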
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes
🔥 Evaluation Server | 🏠 Homepage | 📄 Paper | 🔗 GitHub
Download
We recommend using huggingface-cli to download:
pip install -U "huggingface_hub[cli]"
huggingface-cli download FudanCVL/MOSEv2 --repo-type dataset --local-dir ./MOSEv2 --local-dir-use-symlinks False --max-workers 16
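An equivalent Python call, as a sketch whose parameters simply mirror the CLI flags above:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="FudanCVL/MOSEv2",
    repo_type="dataset",
    local_dir="./MOSEv2",
    max_workers=16,  # parallel downloads, as in the CLI command
)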
Dataset Summary
MOSEv2 is a comprehensive video object segmentation dataset designed to advance… See the full description on the dataset page: https://huggingface.co/datasets/FudanCVL/MOSEv2.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
GuardReasonerTrain
GuardReasonerTrain is the training data for R-SFT of GuardReasoner, as described in the paper GuardReasoner: Towards Reasoning-based LLM Safeguards. Code: https://github.com/yueliu1999/GuardReasoner/
Usage
from datasets import load_dataset

# Login using e.g. huggingface-cli login to access this dataset
ds = load_dataset("yueliu1999/GuardReasonerTrain")
Citation
If you use this dataset, please cite our paper. @article{GuardReasoner… See the full description on the dataset page: https://huggingface.co/datasets/yueliu1999/GuardReasonerTrain.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
KTDA-Datasets
This dataset card aims to describe the datasets used in the KTDA.
Install
pip install huggingface-hub
Usage
huggingface-cli download --repo-type dataset XavierJiezou/ktda-datasets --local-dir data --include grass.zip
huggingface-cli download --repo-type dataset XavierJiezou/ktda-datasets --local-dir data --include cloud.zip
unzip grass.zip -d grass
unzip cloud.zip -d l8_biome… See the full description on the dataset page: https://huggingface.co/datasets/XavierJiezou/ktda-datasets.
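If you prefer staying in Python, a sketch that downloads and extracts one of the archives (the second archive works the same way):
import zipfile
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="XavierJiezou/ktda-datasets",
    filename="grass.zip",
    repo_type="dataset",
    local_dir="data",
)
with zipfile.ZipFile(path) as zf:
    zf.extractall("data/grass")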
from datasets import load_dataset
# Login using e.g. huggingface-cli login to access this dataset
ds = load_dataset("huggingface/transformers-metadata", "frameworks")
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Quick start
The easiest way to download the dataset to your local machine is to use huggingface-cli. The specific command is:
huggingface-cli download zifeng-ai/TrialPanorama-database --local-dir LOCAL_DIR --repo-type dataset
where LOCAL_DIR should be replaced with the target directory you want to save the dataset to.
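A Python equivalent of that command, as a sketch:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="zifeng-ai/TrialPanorama-database",
    repo_type="dataset",
    local_dir="LOCAL_DIR",  # replace with your target directory
)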
Update history
Aug. 4, 2025: updated tables with the full set of studies
Dataset website: https://ryanwangzf.github.io/projects/trialpanorama… See the full description on the dataset page: https://huggingface.co/datasets/zifeng-ai/TrialPanorama-database.
CC0 1.0 https://choosealicense.com/licenses/cc0-1.0/
Experienced
ML training data for the Roman Microlensing Data Challenge 2025 - Experienced tier.
Uploading
CLI:
brew install huggingface-cli
hf auth login
hf upload RGES-PIT/Experienced . --repo-type=dataset
Python:
import os
from huggingface_hub import HfApi

api = HfApi(token=os.getenv("HF_TOKEN"))
api.upload_folder(
    folder_path="/path/to/local/dataset"… See the full description on the dataset page: https://huggingface.co/datasets/RGES-PIT/Experienced.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Summary
Healthy CT data for abdominal organs (liver, pancreas, and kidney) are filtered from public datasets.
Downloading Instructions
1- Install the Hugging Face library:
pip install -U "huggingface_hub[cli]"
2- Download the dataset:
mkdir HealthyCT
cd HealthyCT
huggingface-cli download qicq1c/HealthyCT --repo-type dataset --local-dir . --cache-dir ./cache
[Optional] Resume downloading
In case you had a previous interrupted download… See the full description on the dataset page: https://huggingface.co/datasets/qicq1c/HealthyCT.
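To resume, re-running the same download is usually enough, since huggingface_hub reuses files already completed in the cache; a Python sketch of the same download:
from huggingface_hub import snapshot_download

# Re-running resumes: completed files are reused from ./cache
snapshot_download(
    repo_id="qicq1c/HealthyCT",
    repo_type="dataset",
    local_dir=".",
    cache_dir="./cache",
)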
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Fastmap evaluation suite.
You only need the databases to run fastmap. Download the images if you want to produce colored point clouds. Download the subset of the data you want to your local directory:
huggingface-cli download whc/fastmap_sfm --repo-type dataset --local-dir ./ --include 'databases/tnt_*' 'ground_truths/tnt_*'
or use the Python interface:
from huggingface_hub import hf_hub_download, snapshot_download
snapshot_download(
    repo_id="whc/fastmap_sfm",
    repo_type='dataset'… See the full description on the dataset page: https://huggingface.co/datasets/whc/fastmap_sfm.
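A completed form of that call might look like this; the allow_patterns filter is an assumption carried over from the CLI --include flags above:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="whc/fastmap_sfm",
    repo_type="dataset",
    local_dir="./",
    allow_patterns=["databases/tnt_*", "ground_truths/tnt_*"],  # assumed, mirrors --include
)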
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This is the dataset repository used in the pyiqa toolbox. Please refer to Awesome Image Quality Assessment for details of each dataset. Example command-line script with huggingface-cli:
huggingface-cli download chaofengc/IQA-PyTorch-Datasets live.tgz --local-dir ./datasets --repo-type dataset
cd datasets
tar -xzvf live.tgz
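The same fetch-and-extract flow in Python, as a sketch:
import tarfile
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="chaofengc/IQA-PyTorch-Datasets",
    filename="live.tgz",
    repo_type="dataset",
    local_dir="./datasets",
)
with tarfile.open(path, "r:gz") as tar:
    tar.extractall("./datasets")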
Disclaimer for This Dataset Collection
This collection of datasets is compiled and maintained for academic, research, and educational… See the full description on the dataset page: https://huggingface.co/datasets/chaofengc/IQA-PyTorch-Datasets.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
1X World Model Compression Challenge Dataset
This repository hosts the dataset for the 1X World Model Compression Challenge.
huggingface-cli download 1x-technologies/worldmodel --repo-type dataset --local-dir data
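Or, from Python, a minimal sketch of the same download:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="1x-technologies/worldmodel",
    repo_type="dataset",
    local_dir="data",
)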
Updates Since v1.1
Train/Val v2.0 (~100 hours), replacing v1.1
Test v2.0 dataset for the Compression Challenge
Faces blurred for privacy
New raw video dataset (CC-BY-NC-SA 4.0) at worldmodel_raw_data
Example scripts now split into: cosmos_video_decoder.py —… See the full description on the dataset page: https://huggingface.co/datasets/1x-technologies/world_model_tokenized_data.
Pansharpening-Datasets
This dataset card aims to describe the datasets used in the Pansharpening.
Install
pip install huggingface-hub
Usage
huggingface-cli download --repo-type dataset XavierJiezou/pansharpening-datasets --local-dir data --include PanBench.zip
unzip PanBench.zip -d PanBench
Citation
@Article{cmfnet,
  AUTHOR = {Wang, Shiying and Zou, Xuechao and Li, Kai and Xing, Junliang and… See the full description on the dataset page: https://huggingface.co/datasets/XavierJiezou/pansharpening-datasets.
Other https://choosealicense.com/licenses/other/
COCO 2017 mirror
This is just a mirror of the raw COCO dataset files, for convenience. You have to download it using something like:
pip install huggingface_hub
huggingface-cli download --local-dir coco-2017 pcuenq/coco-2017-mirror
And then unzip the files before use.
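A Python sketch of the same flow, assuming the repo contains the standard COCO zip archives (the exact file names are not listed here):
import zipfile
from pathlib import Path
from huggingface_hub import snapshot_download

local = snapshot_download(repo_id="pcuenq/coco-2017-mirror", repo_type="dataset", local_dir="coco-2017")
for z in Path(local).glob("*.zip"):  # unzip every archive next to itself
    with zipfile.ZipFile(z) as zf:
        zf.extractall(Path(local) / z.stem)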