100+ datasets found
  1. Train, Val, Test (sohan) Dataset

    • universe.roboflow.com
    zip
    Updated Sep 8, 2025
    Cite
    Flying Hippie (2025). Train, Val, Test (sohan) Dataset [Dataset]. https://universe.roboflow.com/flying-hippie/train-val-test-dataset-sohan-qzicd/model/2
    Available download formats: zip
    Dataset updated: Sep 8, 2025
    Dataset authored and provided by: Flying Hippie
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Variables measured: Flashoverdamage Insulator Broken 1oCX Broken Flashover Goodinsulators CQ9Y Bounding Boxes
    Description

    Train, Val, Test Dataset (Sohan)

    ## Overview
    
    Train, Val, Test Dataset (Sohan) is a dataset for object detection tasks - it contains Flashoverdamage Insulator Broken 1oCX Broken Flashover Goodinsulators CQ9Y annotations for 488 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Val V3 Dataset

    • universe.roboflow.com
    zip
    Updated Jun 28, 2025
    Cite
    Anony Mouse (2025). Val V3 Dataset [Dataset]. https://universe.roboflow.com/anony-mouse-vszj2/val-v3-guhiq/model/4
    Available download formats: zip
    Dataset updated: Jun 28, 2025
    Dataset authored and provided by: Anony Mouse
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Variables measured: Head Bounding Boxes
    Description

    Val V3

    ## Overview
    
    Val V3 is a dataset for object detection tasks - it contains Head annotations for 1,043 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. cleaned-quora-dataset-train-test-split

    • huggingface.co
    Updated Feb 7, 2024
    Cite
    fivesixseven (2024). cleaned-quora-dataset-train-test-split [Dataset]. https://huggingface.co/datasets/567-labs/cleaned-quora-dataset-train-test-split
    Explore at: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated: Feb 7, 2024
    Dataset authored and provided by: fivesixseven
    Description

    This is a cleaned version of the Quora dataset that's been configured with a train-test-val split.

    • Train: for training the model
    • Test: for running experiments and comparing different open-source and closed-source models
    • Val: only to be used at the end!

    Colab Notebook to reproduce : https://colab.research.google.com/drive/1dGjGiqwPV1M7JOLfcPEsSh3SC37urItS?usp=sharing

  4. Yolov5 Val Dataset

    • universe.roboflow.com
    zip
    Updated Feb 2, 2023
    Cite
    hammad khan (2023). Yolov5 Val Dataset [Dataset]. https://universe.roboflow.com/hammad-khan-u5n0n/yolov5-val/model/2
    Available download formats: zip
    Dataset updated: Feb 2, 2023
    Dataset authored and provided by: hammad khan
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Variables measured: Enemy Body Bounding Boxes
    Description

    Yolov5 Val

    ## Overview
    
    Yolov5 Val is a dataset for object detection tasks - it contains Enemy Body annotations for 2,565 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  5. bdd100k Yolo-Format Dataset

    • kaggle.com
    zip
    Updated Oct 30, 2025
    Cite
    Ahmad Mostafa (2025). bdd100k Yolo-Format Dataset [Dataset]. https://www.kaggle.com/datasets/a7madmostafa/bdd100k-yolo
    Available download formats: zip (5718281611 bytes)
    Dataset updated: Oct 30, 2025
    Authors: Ahmad Mostafa
    License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    🛣️ BDD100K in YOLO Format

    Ready-to-Use Object Detection Dataset for Autonomous Driving

    📘 Overview

    The BDD100K YOLO Format Dataset is a reformatted version of the original Berkeley DeepDrive (BDD100K) dataset, converted into YOLO-compatible annotations.
    It’s designed for quick integration with YOLOv5, YOLOv8, YOLOv11, and other modern object detection frameworks.

    This dataset provides real-world driving scenes with bounding box annotations for common traffic objects — ideal for research, training, and benchmarking models for autonomous driving and road safety applications.

    📂 Dataset Structure

    bdd100k_yolo/
    │
    ├── train/
    │  ├── images/  # 70,000 images
    │  └── labels/  # Corresponding YOLO .txt annotations
    │
    ├── val/
    │  ├── images/  # 10,000 images
    │  └── labels/
    │
    ├── test/
    │  ├── images/
    │  └── labels/
    │
    └── data.yaml   # Dataset configuration file for YOLO
    

    🏷️ Classes (10)

    | ID | Class Name |
    | -- | ------------- |
    | 0 | person |
    | 1 | rider |
    | 2 | car |
    | 3 | bus |
    | 4 | truck |
    | 5 | bike |
    | 6 | motor |
    | 7 | traffic light |
    | 8 | traffic sign |
    | 9 | train |

    🚀 Usage

    Train your YOLO model directly using the provided data.yaml file.
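The contents of data.yaml are not shown in this listing. As a rough sketch only, the snippet below renders what an Ultralytics-style dataset config consistent with the folder tree and class table above could look like. The key layout (`path`, `train`, `val`, `test`, `names`) follows the usual Ultralytics dataset YAML convention; the helper name `build_data_yaml` and the relative paths are assumptions, not the file shipped with the dataset.

```python
# Class IDs and names copied from the class table in the dataset description.
CLASS_NAMES = [
    "person", "rider", "car", "bus", "truck",
    "bike", "motor", "traffic light", "traffic sign", "train",
]


def build_data_yaml(root="bdd100k_yolo"):
    """Render a YOLO dataset config as YAML text.

    Hypothetical layout inferred from the folder tree; the shipped
    data.yaml may use different paths or keys.
    """
    lines = [
        f"path: {root}",
        "train: train/images",
        "val: val/images",
        "test: test/images",
        "names:",
    ]
    # One "id: name" entry per class, indented under "names:".
    lines += [f"  {i}: {name}" for i, name in enumerate(CLASS_NAMES)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_data_yaml())
```

Comparing this output against the real data.yaml before training is a quick way to confirm that class IDs and split paths match expectations.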

    🧩 YOLOv8 Example

    yolo detect train data=data.yaml model=yolov8n.pt epochs=50 imgsz=640
    

    ✅ Validate

    yolo detect val data=data.yaml model=path/to/best.pt
    

    🔍 Predict

    yolo detect predict model=path/to/best.pt
    
  6. VNLicensePlate_yolov7

    • kaggle.com
    zip
    Updated Oct 7, 2022
    Cite
    Bảo Mai Chí (2022). VNLicensePlate_yolov7 [Dataset]. https://www.kaggle.com/datasets/bomaich/vnlicenseplate/code
    Available download formats: zip (248929934 bytes)
    Dataset updated: Oct 7, 2022
    Authors: Bảo Mai Chí
    Description

    This dataset contains 1,000 images of Vietnamese license plates (both one-line and two-line plates), with text files of xywh annotations for training a YOLOv7 model for license plate detection. The dataset is already split into train/valid/test folders.

  7. VisionThink-General-Val

    • huggingface.co
    Updated Jul 23, 2025
    Cite
    Senqiao Yang (2025). VisionThink-General-Val [Dataset]. https://huggingface.co/datasets/Senqiao/VisionThink-General-Val
    Dataset updated: Jul 23, 2025
    Authors: Senqiao Yang
    License: Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

      Senqiao/VisionThink-General-Val
    

    This is the validation dataset used for our Reasoning VLM on general VQA tasks.

    Paper: VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning. Senqiao Yang, Junyi Li, Xin Lai, Bei Yu, Hengshuang Zhao, Jiaya Jia

      Highlights
    

    Our VisionThink leverages reinforcement learning to autonomously learn whether to… See the full description on the dataset page: https://huggingface.co/datasets/Senqiao/VisionThink-General-Val.

  8. Image-dataset-FER-Test,Train,Val

    • kaggle.com
    zip
    Updated Oct 8, 2024
    Cite
    dolly prajapati 182 (2024). Image-dataset-FER-Test,Train,Val [Dataset]. https://www.kaggle.com/datasets/dollyprajapati182/image-dataset-fer-testtrainval/code
    Available download formats: zip (248085782 bytes)
    Dataset updated: Oct 8, 2024
    Authors: dolly prajapati 182
    License: http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset is a split version of the original Image-dataset found here. The dataset consists of 8 emotion classes: angry, contempt, disgust, fear, happiness, neutral, sadness, and surprise.

    To facilitate model training and evaluation, I have organized the dataset into three subsets:

    • Train: Used for training machine learning models.
    • Test: Used to evaluate model performance after training.
    • Validation: Used during training to tune hyperparameters and prevent overfitting.

    This split allows for more effective usage in tasks such as Facial Emotion Recognition (FER) and other emotion analysis projects.

  9. VAL-Bench

    • huggingface.co
    Cite
    Val Bench, VAL-Bench [Dataset]. https://huggingface.co/datasets/val-bench/VAL-Bench
    Authors: Val Bench
    Description

    VAL-Bench

    A diverse benchmark for systematic analysis of how reliably language models embody human values.

      Abstract
    

    Large language models (LLMs) are increasingly used for tasks where outputs shape human decisions, so it is critical to test whether their responses reflect consistent human values. Existing benchmarks mostly track refusals or predefined safety violations, but these only check rule compliance and do not reveal whether a model upholds a coherent value system… See the full description on the dataset page: https://huggingface.co/datasets/val-bench/VAL-Bench.

  10. text-anonymization-benchmark-val-test

    • huggingface.co
    Updated Mar 22, 2024
    Cite
    Mateusz Dziemian (2024). text-anonymization-benchmark-val-test [Dataset]. https://huggingface.co/datasets/mattmdjaga/text-anonymization-benchmark-val-test
    Explore at: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated: Mar 22, 2024
    Authors: Mateusz Dziemian
    License: MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset card for Text Anonymization Benchmark (TAB) Validation & Test

      Dataset Summary
    

    This is the validation and test split of the Text Anonymisation Benchmark. As the title says, it's a dataset focused on text anonymisation, specifically of European Court documents, which contain labels by multiple annotators.

      Supported Tasks and Leaderboards
    

    [More Information Needed]

      Languages
    

    [More Information Needed]

      Dataset Structure
    
    
    
    
    
      Data… See the full description on the dataset page: https://huggingface.co/datasets/mattmdjaga/text-anonymization-benchmark-val-test.
    
  11. Test Val Dataset

    • universe.roboflow.com
    zip
    Updated Sep 14, 2021
    Cite
    Valorant (2021). Test Val Dataset [Dataset]. https://universe.roboflow.com/valorant-jzbfx/test-val
    Available download formats: zip
    Dataset updated: Sep 14, 2021
    Dataset authored and provided by: Valorant
    License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically
    Variables measured: Full Body Bounding Boxes
    Description

    Test Val

    ## Overview
    
    Test Val is a dataset for object detection tasks - it contains Full Body annotations for 1,103 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
    
  12. LIDAR DTM — Digital Earth Model — Val di Sella 2007

    • data.europa.eu
    Cite
    LIDAR DTM — Digital Earth Model — Val di Sella 2007 [Dataset]. https://data.europa.eu/data/datasets/p_tn-152ad4d1-3263-4ae9-bce9-1265f019a784/
    Description

    Digital Earth Model (1 m x 1 m) derived from the LiDAR survey carried out in 2007 on behalf of the Forest and Fauna Service — PAT in collaboration with the Faculty of Engineering, Department of Informatics and Telecommunications of the University of Trento for the estimation and spatialisation of the main forest parameters. Test area of Val di Sella.

  13. RibonanzaNet-Drop Train, Val, and Test Data

    • kaggle.com
    zip
    Updated Feb 19, 2024
    Cite
    Hamish Blair (2024). RibonanzaNet-Drop Train, Val, and Test Data [Dataset]. https://www.kaggle.com/datasets/hmblair/ribonanzanet-drop-train-val-and-test-data
    Available download formats: zip (402567233 bytes)
    Dataset updated: Feb 19, 2024
    Authors: Hamish Blair
    License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    onemil1_1.nc is the train dataset. onemil1_2.nc is the validation dataset. onemil2.nc, p240.nc, and p390.nc are the test datasets.

    These files are in .nc format; use xarray with Python to interface with them.
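As a minimal sketch of the xarray workflow mentioned above, the snippet below maps each split to the file names listed in the description and opens them with `xarray.open_dataset`. The helper names `split_paths` and `load_split` and the `root` directory are hypothetical; the variables stored inside the files are not described here, so inspect the returned datasets before relying on any field.

```python
from pathlib import Path

# File names taken from the dataset description; the grouping into
# splits follows that description as well.
SPLIT_FILES = {
    "train": ["onemil1_1.nc"],
    "val": ["onemil1_2.nc"],
    "test": ["onemil2.nc", "p240.nc", "p390.nc"],
}


def split_paths(root, split):
    """Return the paths of the files that make up one split."""
    return [Path(root) / name for name in SPLIT_FILES[split]]


def load_split(root, split):
    """Open every file of a split as an xarray Dataset (lazy loading)."""
    import xarray as xr  # imported here so split_paths works without xarray

    return [xr.open_dataset(path) for path in split_paths(root, split)]
```

For example, `load_split("data", "val")` would return a one-element list holding the validation dataset, assuming the files were unpacked into a local `data` folder.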

  14. model-1-weights-log

    • kaggle.com
    zip
    Updated Mar 12, 2022
    Cite
    sabbo26 (2022). model-1-weights-log [Dataset]. https://www.kaggle.com/sabbo26/second-weights
    Available download formats: zip (249424523 bytes)
    Dataset updated: Mar 12, 2022
    Authors: sabbo26
    License: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by sabbo26

    Released under CC0: Public Domain

    Contents

  15. kitrec-val-setb

    • huggingface.co
    Updated Nov 30, 2025
    Cite
    Younggoo lee (2025). kitrec-val-setb [Dataset]. https://huggingface.co/datasets/Younggooo/kitrec-val-setb
    Dataset updated: Nov 30, 2025
    Authors: Younggoo lee
    License: Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    KitREC Validation Dataset - Set B

    Validation dataset for the KitREC (Knowledge-Instruction Transfer for Recommendation) cross-domain recommendation system.

      Dataset Description
    

    This validation dataset is designed for evaluating fine-tuned LLMs on cross-domain recommendation tasks during training. It uses the same users as the test set but allows for validation monitoring.

      Dataset Summary
    

    Attribute Value

    Candidate Set Set B (Random (Fair baseline))… See the full description on the dataset page: https://huggingface.co/datasets/Younggooo/kitrec-val-setb.

  16. Data from: Solar flare forecasting based on magnetogram sequences learning...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Dec 4, 2023
    Cite
    Grim, Luís Fernando Lopes; Sampaio Gradvohl, André Leon (2023). Solar flare forecasting based on magnetogram sequences learning with MViT and data augmentation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10246576
    Dataset updated: Dec 4, 2023
    Dataset provided by: Universidade Estadual de Campinas (UNICAMP)
    Authors: Grim, Luís Fernando Lopes; Sampaio Gradvohl, André Leon
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Source codes and dataset of the research "Solar flare forecasting based on magnetogram sequences learning with MViT and data augmentation".

    Our work employed PyTorch, a framework for training deep learning models with GPU support and automatic back-propagation, to load the MViTv2_S models with Kinetics-400 weights. To simplify the code implementation, eliminating the need for an explicit training loop and automating some hyperparameters, we use the PyTorch Lightning module. The inputs were batches of 10 samples with 16 sequenced images in 3 channels, resized to 224 × 224 pixels and normalized from 0 to 1.

    Most of the papers in our literature survey split the original dataset chronologically. Some authors also apply k-fold cross-validation to emphasize the evaluation of model stability. However, we adopt a hybrid split: we take the first 50,000 samples and apply 5-fold cross-validation between the training and validation sets (known data), with 40,000 samples for training and 10,000 for validation. Thus, we can evaluate performance and stability by analyzing the mean and standard deviation of all trained models on the test set, composed of the last 9,834 samples, preserving the chronological order (simulating unknown data).

    We developed three distinct models to evaluate the impact of oversampling magnetogram sequences through the dataset. The first model, Solar Flare MViT (SF MViT), was trained only with the original data from our base dataset, without oversampling. In the second model, Solar Flare MViT over Train (SF MViT oT), we apply oversampling only to the training data, maintaining the original validation dataset. In the third model, Solar Flare MViT over Train and Validation (SF MViT oTV), we apply oversampling to both the training and validation sets. We also trained a model oversampling the entire dataset, called "SF_MViT_oTV Test", to verify how resampling or adopting a test set with unreal data may bias the results positively.
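The hybrid split described above can be sketched with plain index arithmetic. This is only an illustration, not the authors' code: the function name `hybrid_split` is hypothetical, and the use of contiguous, unshuffled folds is an assumption, since the description does not say how the 50,000 known samples are assigned to folds.

```python
def hybrid_split(n_known=50_000, n_test=9_834, k=5):
    """Build k train/val folds over the first n_known samples and a
    chronological test set from the n_test samples that follow.

    Assumption: folds are contiguous blocks in chronological order.
    """
    fold_size = n_known // k  # 10,000 validation samples per fold
    folds = []
    for i in range(k):
        val_start, val_stop = i * fold_size, (i + 1) * fold_size
        val_idx = list(range(val_start, val_stop))
        # Training indices are all known samples outside the validation block.
        train_idx = list(range(0, val_start)) + list(range(val_stop, n_known))
        folds.append((train_idx, val_idx))
    # The test set is never touched by cross-validation.
    test_idx = list(range(n_known, n_known + n_test))
    return folds, test_idx


if __name__ == "__main__":
    folds, test_idx = hybrid_split()
    for k_id, (train_idx, val_idx) in enumerate(folds):
        print(f"k{k_id}: train={len(train_idx)} val={len(val_idx)}")
    print(f"test={len(test_idx)}")
```

Each of the five models trained this way is then scored on the same held-out test indices, which is what allows a mean and standard deviation to be reported.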
    GitHub version

    The .zip hosted here contains all files from the project, including the checkpoint and output files generated by the codes. We have a clean version hosted on GitHub (https://github.com/lfgrim/SFF_MagSeq_MViTs), without the magnetogram_jpg folder (which can be downloaded directly from https://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531804/dataset_ss2sff.zip) and without the output and checkpoint files. Most code files hosted here also contain comments in Portuguese, which are being updated to English in the GitHub version.

    Folders Structure

    In the root directory of the project, we have two folders:

    • magnetogram_jpg: holds the source images provided by the Space Environment Artificial Intelligence Early Warning Innovation Workshop through the link https://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531804/dataset_ss2sff.zip. It comprises 73,810 samples of high-quality magnetograms captured by HMI/SDO from 2010 May 4 to 2019 January 26. The HMI instrument provides these data (stored in the hmi.sharp_720s dataset), making new samples available every 12 minutes. However, the images in this dataset were collected every 96 minutes. Each image has an associated magnetogram comprising a ready-made snippet of one or more solar active regions (ARs). Note that the magnetograms cropped by SHARP can contain one or more solar ARs classified by the National Oceanic and Atmospheric Administration (NOAA).
    • Seq_Magnetogram: contains the references to the source images with the corresponding labels for the next 24 h and 48 h, in the M24 and M48 sub-folders respectively.

    M24/M48: both present the following sub-folder structure:

    • Seqs16
    • SF_MViT
    • SF_MViT_oT
    • SF_MViT_oTV
    • SF_MViT_oTV_Test

    There are also two files in the root:

    • inst_packages.sh: installs the packages and dependencies needed to run the models.
    • download_MViTS.py: downloads the pre-trained MViTv2_S from PyTorch and stores it in the cache.

    The M24 and M48 folders hold reference text files (flare_Mclass...) linking the images in the magnetogram_jpg folders, or the sequences (Seq16_flare_Mclass...) in the Seqs16 folders, with their respective labels. They also hold "cria_seqs.py", which was responsible for creating the sequences, and "test_pandas.py", which verifies head info and checks the number of samples per label in the text files. All the text files with the prefix "Seq16" inside the Seqs16 folder were created by the "criaseqs.py" code based on the corresponding "flare_Mclass" prefixed text files.

    The Seqs16 folder holds reference text files, each containing a sequence of images that point to the magnetogram_jpg folders.

    All SF_MViT... folders hold the model training code itself (SF_MViT...py) and the corresponding job submission (jobMViT...), temporary input (Seq16_flare...), output (saida_MVIT... and MViT_S...), error (err_MViT...) and checkpoint files (sample-FLARE...ckpt). Executed model training codes generate output, error, and checkpoint files. There is also a folder called "lightning_logs" that stores logs of trained models.

    Naming pattern for the files:

    magnetogram_jpg: follows the format "hmi.sharp_720s...magnetogram.fits.jpg" and Seqs16: follows the format "hmi.sharp_720s...to.", where:

    hmi: is the instrument that captured the image
    sharp_720s: is the database source of SDO/HMI.
    is the identification of SHARP region, and can contain one or more solar ARs classified by the (NOAA).
    is the date-time the instrument captured the image in the format yyyymmdd_hhnnss_TAI (y:year, m:month, d:day, h:hours, n:minutes, s:seconds).
    is the date-time when the sequence starts, and follow the same format of .

    is the date-time when the sequence ends, and follow the same format of . Reference text files in M24 and M48 or inside SF_MViT... folders follows the format "flare_Mclass_.txt", where:

    is Seq16 if refers to a sequence, or void if refers direct to images.

    "24h" or "48h".

    is "TrainVal" or "Test". The refers to the split of Train/Val.

    void or "_over" after the extension (...txt_over): means temporary input reference that was over-sampled by a training model. All SF_MViT...folders:

    Model training codes: "SF_MViT_M+_", where:

    void or "oT" (over Train) or "oTV" (over Train and Val) or "oTV_Test" (over Train, Val and Test);

    "24h" or "48h";

    "oneSplit" for a specific split or "allSplits" if run all splits.

    void is default to run 1 GPU or "2gpu" to run into 2 gpus systems; Job submission files: "jobMViT_", where:

    point the queue in Lovelace environment hosted on CENAPAD-SP (https://www.cenapad.unicamp.br/parque/jobsLovelace) Temporary inputs: "Seq16_flare_Mclass_.txt:

    train or val;

    void or "_over" after the extension (...txt_over): means temporary input reference that was over-sampled by a training model. Outputs: "saida_MViT_Adam_10-7", where:

    k0 to k4, means the correlated split of the output, or void if the output is from all splits. Error files: "err_MViT_Adam_10-7", where:

    k0 to k4, means the correlated split of the error log file, or void if the error file is from all splits. Checkpoint files: "sample-FLARE_MViT_S_10-7-epoch=-valid_loss=-Wloss_k=.ckpt", where:

    epoch number of the checkpoint;

    corresponding valid loss;

    0 to 4.

  17. Data from: Duck Hunt

    • kaggle.com
    zip
    Updated Jul 26, 2025
    Cite
    Hugo Zanini (2025). Duck Hunt [Dataset]. https://www.kaggle.com/datasets/hugozanini1/duck-hunt
    Available download formats: zip (7379197 bytes)
    Dataset updated: Jul 26, 2025
    Authors: Hugo Zanini
    License: MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Duck Hunt Object Detection Dataset

    This dataset contains 1,004 labeled images from the classic NES game "Duck Hunt" (1984), specifically prepared for YOLO (You Only Look Once) object detection training. The dataset includes sprites of the iconic hunting dog and ducks in various states, augmented to provide a balanced and comprehensive training set for computer vision models.

    Perfect for:

    • Object detection model training
    • Computer vision research
    • Retro gaming AI projects
    • YOLO algorithm benchmarking
    • Educational purposes

    🎯 Dataset Statistics

    | Metric | Value |
    | -- | -- |
    | Total Images | 1,004 |
    | Dataset Size | 12 MB |
    | Image Format | PNG |
    | Annotation Format | YOLO (.txt) |
    | Classes | 4 |
    | Train/Val Split | 711/260 (73%/27%) |

    Class Distribution

    | Class ID | Class Name | Count | Description |
    | -- | -- | -- | -- |
    | 0 | dog | 252 | The hunting dog in various poses (jumping, laughing, sniffing, etc.) |
    | 1 | duck_dead | 256 | Dead ducks (both black and red variants) |
    | 2 | duck_shot | 248 | Ducks in the moment of being shot |
    | 3 | duck_flying | 248 | Flying ducks in all directions (left, right, diagonal) |

    📁 Dataset Structure

    yolo_dataset_augmented/
    ├── images/
    │  ├── train/      # 711 training images
    │  └── val/       # 260 validation images
    ├── labels/
    │  ├── train/      # 711 YOLO annotation files
    │  └── val/       # 260 YOLO annotation files
    ├── classes.txt     # Class names mapping
    ├── dataset.yaml     # YOLO configuration file
    └── augmented_dataset_stats.json # Detailed statistics
    

    🔧 Data Augmentation Details

    The original 47 images were enhanced using advanced data augmentation techniques to create a balanced dataset:

    Augmentation Techniques Applied:

    • Geometric Transformations: Rotation (±15°), horizontal/vertical flipping, scaling (0.8-1.2x), translation
    • Color Adjustments: Brightness (0.7-1.3x), contrast (0.8-1.2x), saturation (0.8-1.2x)
    • Quality Variations: Gaussian noise, slight blur for robustness
    • Advanced Techniques: Mosaic augmentation (YOLO-style 4-image combination)

    Augmentation Parameters:

    {
      'rotation_range': (-15, 15),    # Small rotations for game sprites
      'brightness_range': (0.7, 1.3),  # Brightness variations
      'contrast_range': (0.8, 1.2),   # Contrast adjustments
      'saturation_range': (0.8, 1.2),  # Color saturation
      'noise_intensity': 0.02,      # Gaussian noise
      'horizontal_flip_prob': 0.5,    # 50% chance horizontal flip
      'scaling_range': (0.8, 1.2),    # Scale variations
    }
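To make the parameter ranges above concrete, here is a small sketch of how one random augmentation setting could be drawn from them. The function `sample_augmentation` is hypothetical and uniform sampling within each range is an assumption; the listing does not show how the original augmentation script picks its values.

```python
import random

# Ranges copied from the augmentation parameters in the dataset description.
AUG_PARAMS = {
    "rotation_range": (-15, 15),
    "brightness_range": (0.7, 1.3),
    "contrast_range": (0.8, 1.2),
    "saturation_range": (0.8, 1.2),
    "noise_intensity": 0.02,
    "horizontal_flip_prob": 0.5,
    "scaling_range": (0.8, 1.2),
}


def sample_augmentation(rng):
    """Draw one augmentation setting.

    Assumption: each factor is sampled uniformly within its range.
    """
    return {
        "rotation_deg": rng.uniform(*AUG_PARAMS["rotation_range"]),
        "brightness": rng.uniform(*AUG_PARAMS["brightness_range"]),
        "contrast": rng.uniform(*AUG_PARAMS["contrast_range"]),
        "saturation": rng.uniform(*AUG_PARAMS["saturation_range"]),
        "scale": rng.uniform(*AUG_PARAMS["scaling_range"]),
        "horizontal_flip": rng.random() < AUG_PARAMS["horizontal_flip_prob"],
        "noise_intensity": AUG_PARAMS["noise_intensity"],
    }


if __name__ == "__main__":
    print(sample_augmentation(random.Random(0)))
```

Passing a seeded `random.Random` instance keeps the sampled settings reproducible across runs.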
    

    🚀 Usage Examples

    Loading with YOLOv8 (Ultralytics)

    from ultralytics import YOLO
    
    # Load and train
    model = YOLO('yolov8n.pt') # Load pretrained model
    results = model.train(data='dataset.yaml', epochs=100, imgsz=640)
    
    # Validate
    metrics = model.val()
    
    # Predict
    results = model('path/to/test/image.png')
    

    Loading with PyTorch

    import torch
    from torch.utils.data import Dataset, DataLoader
    from PIL import Image
    import os
    
    class DuckHuntDataset(Dataset):
      def __init__(self, images_dir, labels_dir, transform=None):
        self.images_dir = images_dir
        self.labels_dir = labels_dir
        self.transform = transform
        # Keep only PNG images, in a stable order
        self.images = sorted(f for f in os.listdir(images_dir) if f.endswith('.png'))
      
      def __len__(self):
        return len(self.images)
      
      def __getitem__(self, idx):
        img_path = os.path.join(self.images_dir, self.images[idx])
        label_path = os.path.join(self.labels_dir, 
                     self.images[idx].replace('.png', '.txt'))
        
        image = Image.open(img_path).convert('RGB')
        # Load YOLO annotations (one "class_id cx cy w h" line per object)
        with open(label_path, 'r') as f:
          labels = f.readlines()
        
        if self.transform:
          image = self.transform(image)
          
        return image, labels
    
    # Usage
    dataset = DuckHuntDataset('images/train', 'labels/train')
    # Images vary in size and each has a different number of boxes, so
    # return batches as tuples instead of using the default stacking collate.
    dataloader = DataLoader(dataset, batch_size=32, shuffle=True,
                            collate_fn=lambda batch: tuple(zip(*batch)))
    

    YOLO Annotation Format

    Each .txt file contains one line per object: class_id center_x center_y width height

    Example annotation: 0 0.492 0.403 0.212 0.315, where values are normalized (0-1) relative to image dimensions.
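A short sketch of how one annotation line in this format can be converted to pixel coordinates. The helper name `yolo_to_pixel_box` is hypothetical, and the 100 × 100 image size in the usage lines is an arbitrary choice for illustration, not a property of the dataset.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert 'class_id cx cy w h' (normalized) to a pixel-space box.

    Returns (class_id, (x_min, y_min, x_max, y_max)).
    """
    class_id, cx, cy, w, h = line.split()
    # Scale the normalized centre and size by the image dimensions.
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    # Move from centre/size to corner coordinates.
    return int(class_id), (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


if __name__ == "__main__":
    class_id, box = yolo_to_pixel_box("0 0.492 0.403 0.212 0.315", 100, 100)
    print(class_id, box)
```

Because the stored values are normalized, the same label file stays valid if an image is resized, as long as the aspect ratio handling does not crop or pad it.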

    📊 Technical Specifications

    • Image Dimensions: Variable (original sprite sizes preserved)
    • Color Channels: RGB (3 channels)
    • Annotation Precision: Float32 (normalized coordinates)
    • File Naming: Descriptive names indicating class and augmentation type
    • Quality: High-resolution pixel art sprites

    🎮 Dataset Context

    This dataset is based on sprites from the iconic 1984 NES game "Duck Hunt," one of the most recognizable video games in history. The game featured:

    • The Dog: Your hunting companion who retrieves ducks and ...
  18. StarDist Adipocyte Segmentation Training data, Training Notebook and Model

    • data.niaid.nih.gov
    Updated Oct 25, 2022
    Cite
    Sarkis Rita; Naveiras Olaia; Burri Olivier; Weigert Martin; De Leval Laurence (2022). StarDist Adipocyte Segmentation Training data, Training Notebook and Model [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7003908
    Dataset updated: Oct 25, 2022
    Dataset provided by: CHUV; EPFL; UNIL, CHUV; EPFL, UNIL
    Authors: Sarkis Rita; Naveiras Olaia; Burri Olivier; Weigert Martin; De Leval Laurence
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data from H&E human bone marrow whole slide scanner images used in the paper: "MarrowQuant 2.0: a digital pathology workflow assisting bone marrow evaluation in clinical and experimental hematology" (https://doi.org/10.21203/rs.3.rs-1860140/v1)

    292 image patches

    Ground truth was manually annotated using QuPath and split into 263 images for training and 29 for validation.

    Training in StarDist was done on a Windows 10 PC with an RTX 2080 GPU. The requirements file for installing a Python 3.7 environment to run the attached notebooks is provided (stardist-val.txt).

    The StarDist model configuration can be found in the Jupyter notebook Adipocyte Training.ipynb.

    Model validation and metrics can be computed by running the Quality Control.ipynb notebook after finishing the Adipocyte Training notebook.

  19. Data from: Learned ie thesis: transparent boundary conditions for the val-c...

    • service.tib.eu
    • resodate.org
    Updated May 16, 2025
    Cite
    (2025). Learned ie thesis: transparent boundary conditions for the val-c model [Dataset]. https://service.tib.eu/ldmservice/dataset/goe-doi-10-25625-n0s2qj
    Dataset updated: May 16, 2025
    License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains intermediate data products for some experiments in the PhD thesis "Learned infinite elements for helioseismology" (J. Preuß). We study the accuracy and efficiency of learned infinite elements (IEs) for representing a solar atmosphere based on the VAL-C model. A brief description of the provided data is given below:

    • The file "power-spectrum-VALC-meshed.out" contains the reference power spectrum computed with a fully meshed atmosphere. Harmonic degrees up to l=1000 and 7200 equidistant frequencies ranging from 0 mHz to 8.3 mHz were considered.
    • The files "power-spectrum-Nj-VALC.out" contain the power spectra obtained with learned IEs using N=j for j in [0,1,2,3,4].

    A description of how the computed data was obtained and how it should be post-processed further is available at this Gitlab repository.

  20. Val-acc-F03

    • kaggle.com
    zip
    Updated Apr 27, 2020
    Cite
    Rohan Roy (2020). Val-acc-F03 [Dataset]. https://www.kaggle.com/datasets/tharakan684/valaccf03
    Available download formats: zip (273836637 bytes)
    Dataset updated: Apr 27, 2020
    Authors: Rohan Roy
    Description

    Dataset

    This dataset was created by Rohan Roy

    Released under Other (specified in description)

    Contents
