100+ datasets found
1. Image Augmentation Dataset

    • universe.roboflow.com
    zip
    Updated Apr 2, 2024
    Cite
    Data Augmentation (2024). Image Augmentation Dataset [Dataset]. https://universe.roboflow.com/data-augmentation-d7svr/image-augmentation-4ax9o
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 2, 2024
    Dataset authored and provided by
    Data Augmentation
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Fractured Bounding Boxes
    Description

    Image Augmentation

    ## Overview
    
    Image Augmentation is a dataset for object detection tasks - it contains Fractured annotations for 702 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Variable Message Signal annotated images for object detection

    • portalcientifico.universidadeuropea.com
    • zenodo.org
    Updated 2022
    Cite
    De Las Heras De Matías, Gonzalo; Sánchez-Soriano, Javier; Puertas, Enrique (2022). Variable Message Signal annotated images for object detection [Dataset]. https://portalcientifico.universidadeuropea.com/documentos/668fc42eb9e7c03b01bd5abc?lang=en
    Explore at:
    Dataset updated
    2022
    Authors
    De Las Heras De Matías, Gonzalo; Sánchez-Soriano, Javier; Puertas, Enrique
    Description

    If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Sánchez-Soriano, J.; Fernández-Andrés, J. Dataset: Variable Message Signal Annotated Images for Object Detection. Data 2022, 7, 41. https://doi.org/10.3390/data7040041

    This dataset consists of Spanish road images taken from inside a vehicle, together with annotations in XML files in PASCAL VOC format that indicate the location of Variable Message Signals (VMSs) within them. A CSV file is also included with the geographic position, the folder where each image is located, and the sign text in Spanish. The dataset can be used to train supervised computer vision algorithms, such as convolutional neural networks. The process followed to obtain the dataset (image acquisition and labeling) and its specifications are detailed in the paper. The dataset contains 1,216 instances (888 positive and 328 negative) in 1,152 JPG images with a resolution of 1280x720 pixels, split into 576 real images and 576 images created with data augmentation. Its purpose is to support road computer vision research, since no dataset specific to VMSs previously existed.

    The folder structure of the dataset is as follows:

    vms_dataset/
      data.csv
      real_images/
        imgs/
        annotations/
      data-augmentation/
        imgs/
        annotations/

    In which:

    • data.csv: each row contains the following comma-separated fields: image_name, x_min, y_min, x_max, y_max, class_name, lat, long, folder, text.
    • real_images: images extracted directly from the videos.
    • data-augmentation: images created using data augmentation.
    • imgs: image files in .jpg format.
    • annotations: annotation files in .xml format.
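    Given the column layout above, data.csv is straightforward to load with pandas. A minimal sketch, assuming the file has no header row (drop `names=` if it does):

    ```python
    import pandas as pd

    # Column order as documented above; header presence is an assumption.
    cols = ["image_name", "x_min", "y_min", "x_max", "y_max",
            "class_name", "lat", "long", "folder", "text"]
    df = pd.read_csv("vms_dataset/data.csv", names=cols)

    # One image can carry several VMS boxes, so group by image.
    for image_name, rows in df.groupby("image_name"):
        boxes = rows[["x_min", "y_min", "x_max", "y_max"]].to_numpy()
        print(image_name, f"{len(boxes)} box(es)")
    ```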

  3. Yolo tiger and lion labelled detection

    • kaggle.com
    Updated Sep 10, 2024
    Cite
    Junkie75 (2024). Yolo tiger and lion labelled detection [Dataset]. https://www.kaggle.com/datasets/junkie75/yolo-tiger-and-lion-labelled-detection/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Kaggle
    Authors
    Junkie75
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains images of lions and tigers sourced from the Open Images Dataset V6 and labeled specifically for object detection using the YOLO format. The dataset focuses on two classes: lion and tiger, with annotations provided for each image in a YOLO-compatible .txt file format. This dataset is ideal for training machine learning models for wildlife detection and classification tasks, particularly in distinguishing between these two majestic big cats. Key Features:

    Classes: Lion and Tiger
    Annotations: YOLO format, with bounding box coordinates and class labels provided in separate .txt files for each image.
    Source: Images sourced from Open Images Dataset V6, which is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
    Application: Suitable for object detection models like YOLO, SSD, or Faster R-CNN.
    

    Usage:

    The dataset can be used for training, validating, or testing object detection models. Each image is accompanied by a corresponding YOLO annotation file, making it easy to integrate into any YOLO-based pipeline.

    Attribution:

    This dataset is derived from the Open Images Dataset V6, and proper attribution must be given. Please credit the Open Images Dataset when using or sharing this dataset in any format.
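    For working with the labels, note that each row of a YOLO .txt file is `class cx cy w h`, with coordinates normalized to the image size. A minimal parsing sketch (the class-to-name mapping, e.g. 0 = lion and 1 = tiger, is an assumption; check the dataset's class list):

    ```python
    from pathlib import Path
    from PIL import Image

    def load_yolo_boxes(label_path, image_path):
        """Convert normalized YOLO rows into pixel (class, x1, y1, x2, y2) boxes."""
        img_w, img_h = Image.open(image_path).size
        boxes = []
        for line in Path(label_path).read_text().splitlines():
            cls, cx, cy, w, h = line.split()
            cx, cy, w, h = float(cx), float(cy), float(w), float(h)
            boxes.append((int(cls),
                          (cx - w / 2) * img_w, (cy - h / 2) * img_h,
                          (cx + w / 2) * img_w, (cy + h / 2) * img_h))
        return boxes
    ```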

  4. SyntheticIndoorObjectDetectionDataset

    • data.mendeley.com
    Updated Mar 25, 2025
    Cite
    Nafiz Fahad (2025). SyntheticIndoorObjectDetectionDataset [Dataset]. http://doi.org/10.17632/nnph98d3kc.2
    Explore at:
    Dataset updated
    Mar 25, 2025
    Authors
    Nafiz Fahad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset was collected from the MyNursingHome dataset, available at https://data.mendeley.com/datasets/fpctx3svzd/1, and curated to develop a synthetic indoor object detection dataset for autonomous mobile robots, supporting researchers in detecting and classifying objects for computer vision and pattern recognition. From the original dataset containing 25 object categories, we selected six key categories: basket bin (499 images), sofa (499 images), human (499 images), table (500 images), chair (496 images), and door (500 images). Initially, we collected a total of 2,993 images from these categories; however, during the annotation process using Roboflow, we rejected 1 sofa, 10 table, 9 chair, and 12 door images due to quality concerns, such as poor image resolution or difficulty in identifying the object, resulting in a final dataset of 2,961 images.

    To ensure an effective training pipeline, we divided the dataset into 70% training (2,073 images), 20% validation (591 images), and 10% test (297 images). Preprocessing steps included auto-orientation and resizing all images to 640×640 pixels to maintain uniformity. To improve generalization for real-world applications, we applied data augmentation techniques, including horizontal and vertical flipping, 90-degree rotations (clockwise, counter-clockwise, and upside down), random rotations within -15° to +15°, shearing within ±10° horizontally and vertically, and brightness adjustments between -15% and +15%. This augmentation process expanded the dataset to 7,107 images, with 6,219 images for training (88%), 597 for validation (8%), and 297 for testing (4%). This well-annotated, preprocessed, and augmented dataset significantly improves object detection performance in indoor settings.
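    The augmentation recipe above maps almost one-to-one onto open-source libraries. A sketch in Albumentations (an assumed stand-in; the authors used Roboflow's built-in augmentations), with detection boxes kept in sync:

    ```python
    import albumentations as A

    transform = A.Compose(
        [
            A.HorizontalFlip(p=0.5),
            A.VerticalFlip(p=0.5),
            A.RandomRotate90(p=0.5),                    # 90-degree rotations
            A.Rotate(limit=15, p=0.5),                  # random rotation in [-15, +15] degrees
            A.Affine(shear={"x": (-10, 10), "y": (-10, 10)}, p=0.5),
            A.RandomBrightnessContrast(brightness_limit=0.15, p=0.5),
        ],
        # Bounding boxes must be transformed together with the pixels.
        bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
    )
    ```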

  5. A dataset for window and blind states detection

    • figshare.com
    bin
    Updated Aug 5, 2024
    Cite
    Seunghyeon Wang (2024). A dataset for window and blind states detection [Dataset]. http://doi.org/10.6084/m9.figshare.26403004.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 5, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Seunghyeon Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset was constructed for detecting window and blind states. All images were annotated in XML format using LabelImg for object detection tasks. The dataset also includes the results of applying a Faster R-CNN-based model: detected images and loss graphs for both training and validation. Additionally, the raw data with other annotations can be used for applications such as semantic segmentation and image captioning.
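    LabelImg writes PASCAL VOC XML, which Python's standard library parses directly. A minimal sketch (the class names are whatever the annotators used for window and blind states; inspect the files for the actual labels):

    ```python
    import xml.etree.ElementTree as ET

    def read_voc_annotations(xml_path):
        """Return (class_name, (xmin, ymin, xmax, ymax)) pairs from one XML file."""
        root = ET.parse(xml_path).getroot()
        objects = []
        for obj in root.iter("object"):
            name = obj.findtext("name")
            bnd = obj.find("bndbox")
            box = tuple(int(float(bnd.findtext(t)))
                        for t in ("xmin", "ymin", "xmax", "ymax"))
            objects.append((name, box))
        return objects
    ```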

  6. Hard Hat Workers Object Detection Dataset - resize-416x416-reflectEdges

    • public.roboflow.com
    zip
    Updated Sep 30, 2022
    + more versions
    Cite
    Northeastern University - China (2022). Hard Hat Workers Object Detection Dataset - resize-416x416-reflectEdges [Dataset]. https://public.roboflow.com/object-detection/hard-hat-workers/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 30, 2022
    Dataset authored and provided by
    Northeastern University - China
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of Workers
    Description

    Overview

    The Hard Hat dataset is an object detection dataset of workers in workplace settings that require a hard hat. Annotations also include examples of just "person" and "head," for when an individual may be present without a hard hat.

    The original dataset has a 75/25 train-test split.

    Example Image: https://i.imgur.com/7spoIJT.png

    Use Cases

    One could use this dataset to, for example, build a classifier that distinguishes workers who are abiding by the safety code from those who may not be. It is also a good general dataset for practice.

    Using this Dataset

    Use the fork or Download this Dataset button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better. This particular dataset would be very well suited for Roboflow's new advanced Bounding Box Only Augmentations.

    Dataset Versions:

    Image Preprocessing | Image Augmentation | Modify Classes

    • v1 (resize-416x416-reflect): generated with the original 75/25 train-test split | No augmentations
    • v2 (raw_75-25_trainTestSplit): generated with the original 75/25 train-test split | These are the raw, original images
    • v3 (v3): generated with the original 75/25 train-test split | Modify Classes used to drop person class | Preprocessing and Augmentation applied
    • v5 (raw_HeadHelmetClasses): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop person class
    • v8 (raw_HelmetClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop head and person classes
    • v9 (raw_PersonClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop head and helmet classes
    • v10 (raw_AllClasses): generated with a 70/20/10 train/valid/test split | These are the raw, original images
    • v11 (augmented3x-AllClasses-FastModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied | 3x image generation | Trained with Roboflow's Fast Model
    • v12 (augmented3x-HeadHelmetClasses-FastModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied, Modify Classes used to drop person class | 3x image generation | Trained with Roboflow's Fast Model
    • v13 (augmented3x-HeadHelmetClasses-AccurateModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied, Modify Classes used to drop person class | 3x image generation | Trained with Roboflow's Accurate Model
    • v14 (raw_HeadClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop person class, and remap/relabel helmet class to head

    Choosing Between Computer Vision Model Sizes | Roboflow Train

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

    Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.


  7. Hat Data Augmentation Dataset

    • universe.roboflow.com
    zip
    Updated May 28, 2023
    Cite
    data augmentation (2023). Hat Data Augmentation Dataset [Dataset]. https://universe.roboflow.com/data-augmentation-a0ako/hat-data-augmentation
    Explore at:
    Available download formats: zip
    Dataset updated
    May 28, 2023
    Dataset authored and provided by
    data augmentation
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Hat Person Bounding Boxes
    Description

    Hat Data Augmentation

    ## Overview
    
    Hat Data Augmentation is a dataset for object detection tasks - it contains Hat Person annotations for 3,213 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  8. ROAD OBSTACLES.zip Road Obstacles for Training DL Models

    • figshare.com
    zip
    Updated Nov 26, 2024
    Cite
    pison mutabarura; Nicasio Maguu Muchuka; Davies Rene Segera (2024). ROAD OBSTACLES.zip Road Obstacles for Training DL Models [Dataset]. http://doi.org/10.6084/m9.figshare.27909219.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    figshare
    Authors
    pison mutabarura; Nicasio Maguu Muchuka; Davies Rene Segera
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Augmented custom dataset with images sourced from online sources and camera captures. The dataset was used to train YOLO models for road obstacle detection, specifically on African roads.

  9. Traffic Road Object Detection Polish 12k

    • kaggle.com
    Updated Aug 9, 2024
    Cite
    Mikołaj Kołek (2024). Traffic Road Object Detection Polish 12k [Dataset]. https://www.kaggle.com/datasets/mikoajkoek/traffic-road-object-detection-polish-12k
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mikołaj Kołek
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains annotated images of Polish roads, specifically curated for object detection tasks. The data was collected using a car camera on roads in Poland, primarily in Kraków. The images capture a diverse range of scenarios, including different road types and various lighting conditions (day and night).

    Classes:

    • Car (Vehicles without a trailer)
    • Different-Traffic-Sign (Other traffic signs than warning and prohibition signs, mostly information and order signs)
    • Green-Traffic-Light (Green traffic lights for cars only; green lights for pedestrians are not annotated)
    • Motorcycle
    • Pedestrian (People and cyclists)
    • Pedestrian-Crossing (Pedestrian crossings)
    • Prohibition-Sign (All prohibition signs)
    • Red-Traffic-Light (Red traffic lights for cars only; lights for pedestrians are not annotated)
    • Speed-Limit-Sign (Speed limit signs)
    • Truck (Vehicles with a trailer)
    • Warning-Sign (Warning signs)

    Annotation Process:

    Annotations were carried out using Roboflow. A total of 2,000 images were manually labeled, while an additional 9,000 images were generated through data augmentation. The augmentation techniques applied were crop, saturation, brightness, and exposure adjustments.

    Image Statistics Before Data Augmentation:

    Approximately:

    • 400 cars per 100 photos
    • 30 different-traffic-signs per 100 photos
    • 80 red-traffic-lights per 100 photos
    • 70 pedestrians per 100 photos
    • 50 warning signs per 100 photos
    • 50 pedestrian-crossings per 100 photos
    • 40 green-traffic-lights per 100 photos
    • 40 prohibition signs per 100 photos
    • 40 trucks per 100 photos
    • 20 speed-limit-signs per 100 photos
    • 2 motorcycles per 100 photos

    The photos were taken on both normal roads and highways, under various conditions, including day and night. All photos were initially 1920x1080 pixels. After cropping, some images may be slightly smaller. No preprocessing steps were applied to the photos.

    Annotations are provided in YOLO format.

    Image Statistics Before Data Augmentation:

    | Set | Photos | Car | Different-Traffic-Sign | Red-Traffic-Light | Pedestrian | Warning-Sign | Pedestrian-Crossing | Green-Traffic-Light | Prohibition-Sign | Truck | Speed-Limit-Sign | Motorcycle |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Test Set | 166 | 687 | 547 | 163 | 137 | 79 | 82 | 52 | 48 | 66 | 22 | 4 |
    | Train Set | 1178 | 4766 | 3370 | 805 | 812 | 544 | 476 | 402 | 396 | 409 | 230 | 38 |
    | Validation Set | 327 | 1343 | 945 | 232 | 228 | 163 | 112 | 87 | 112 | 137 | 59 | 10 |

    Image Statistics After Data Augmentation:

    | Set | Photos | Car | Different-Traffic-Sign | Red-Traffic-Light | Pedestrian | Warning-Sign | Pedestrian-Crossing | Green-Traffic-Light | Prohibition-Sign | Truck | Speed-Limit-Sign | Motorcycle |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Test Set | 996 | 4122 | 3282 | 978 | 822 | 474 | 492 | 312 | 288 | 396 | 132 | 24 |
    | Train Set | 7068 | 28596 | 20220 | 4830 | 4872 | 3264 | 2856 | 2412 | 2376 | 2454 | 1380 | 228 |
    | Validation Set | 1962 | 8058 | 5670 | 1392 | 1368 | 978 | 672 | 522 | 672 | 822 | 354 | 60 |
  10. Training dataset for object detection - Penguins from UAV

    • data.aad.gov.au
    • data.gov.au
    Updated Feb 21, 2023
    Cite
    BELYAEV, OLEG (2023). Training dataset for object detection - Penguins from UAV [Dataset]. http://doi.org/10.26179/s10z-da41
    Explore at:
    Dataset updated
    Feb 21, 2023
    Dataset provided by
    Australian Antarctic Division (https://www.antarctica.gov.au/)
    Australian Antarctic Data Centre
    Authors
    BELYAEV, OLEG
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 8, 2021
    Description

    On February 8, 2021, Deception Island Chinstrap penguin colonies were photographed during the PiMetAn Project XXXIV Spanish Antarctic campaign using unmanned aerial vehicles (UAV) at a height of 30m. From the obtained imagery, a training dataset for penguin detection from aerial perspective was generated.

    The penguin species is the Chinstrap penguin (Pygoscelis antarcticus).

    The dataset consists of three folders: "train", containing 531 images, intended for model training; "valid", containing 50 images, intended for model validation; and "test", containing 25 images, intended for model testing. In each of the three folders, an additional .csv file is located, containing labels (x,y positions and class names for every penguin in the images), annotated in Tensorflow Object Detection format.

    There is only one annotation class: Penguin.

    All 606 images are 224x224 px in size, and 96 dpi.
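    A minimal sketch for reading one of the per-folder CSV label files with pandas; the file name and column names follow the common Roboflow TensorFlow Object Detection CSV export (filename, width, height, class, xmin, ymin, xmax, ymax) and are assumptions; check the actual header:

    ```python
    import pandas as pd

    df = pd.read_csv("train/_annotations.csv")  # assumed file name

    # Every row is one penguin bounding box; 'Penguin' is the only class.
    print(df["class"].value_counts())
    print(df[["filename", "xmin", "ymin", "xmax", "ymax"]].head())
    ```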

    The following augmentation was applied to create 3 versions of each source image:

    • Random shear between -18° and +18° horizontally and between -11° and +11° vertically

    This dataset was annotated and exported via www.roboflow.com

    The model Faster R-CNN with a ResNet-101 backbone was used to perform the object detection tasks. Training and evaluation were performed using Google's TensorFlow 2.0 machine learning platform.

  11. MangoImageBD: An Extensive Image Dataset of Common and Popular Mango...

    • data.mendeley.com
    Updated Dec 10, 2024
    + more versions
    Cite
    Md Hasanul Ferdaus (2024). MangoImageBD: An Extensive Image Dataset of Common and Popular Mango Varieties in Bangladesh for Identification and Classification [Dataset]. http://doi.org/10.17632/hp2cdckpdr.2
    Explore at:
    Dataset updated
    Dec 10, 2024
    Authors
    Md Hasanul Ferdaus
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    Type of data: 504 x 1120 px mango images. Data format: JPEG. Contents of the dataset: Images (original, processed, and augmented) of common and popular varieties of mangoes in Bangladesh.

    Number of classes: Fifteen (15) common and popular varieties of mangoes in Bangladesh - (1) Amrapali, (2) Ashshina Classic, (3) Ashshina Zhinuk, (4) Banana Mango, (5) Bari-4, (6) Bari-11, (7) Fazli Classic, (8) Fazli Shurmai, (9) Gourmoti, (10) Harivanga, (11) Himsagor, (12) Katimon, (13) Langra, (14) Rupali, and (15) Shada.

    Number of images: Total number of images in the dataset: 28,515. (1) Total original (raw) images of mango cultivars (MangoOriginal) = 5,703, (2) Total processed images with a blend of both real and virtual backgrounds (MangoRealVirtual) = 5,703, and (3) Total augmented images (MangoAugmented)= 17,109.

    Distribution of instances: (1) Original (raw) images in each class of the mango cultivars (MangoOriginal): Amrapali = 135, Ashshina Classic = 571, Ashshina Zhinuk = 1,286, Banana Mango = 83, Bari-4 = 74, Bari-11 = 1,244, Fazli Classic = 171, Fazli Shurmai = 247, Gourmoti = 630, Harivanga = 265, Himsagor = 106, Katimon = 424, Langra = 120, Rupali = 184, and Shada = 163. (2) Processed images with a blend of both real and virtual backgrounds for each class of the mango cultivars (MangoRealVirtual): Amrapali = 135, Ashshina Classic = 571, Ashshina Zhinuk = 1,286, Banana Mango = 83, Bari-4 = 74, Bari-11 = 1,244, Fazli Classic = 171, Fazli Shurmai = 247, Gourmoti = 630, Harivanga = 265, Himsagor = 106, Katimon = 424, Langra = 120, Rupali = 184, and Shada = 163. (3) Augmented images for each class of the mango cultivars (MangoAugmented): Amrapali = 405, Ashshina Classic = 1,713, Ashshina Zhinuk = 3,858, Banana Mango = 249, Bari-4 = 222, Bari-11 = 3,732, Fazli Classic = 513, Fazli Shurmai = 741, Gourmoti = 1,890, Harivanga = 795, Himsagor = 318, Katimon = 1,272, Langra = 360, Rupali = 552, and Shada = 489.

    Dataset size: Total size of the dataset = 1.35 GB and the compressed ZIP file size = 1.16 GB.

    Data acquisition process: Images of various mango varieties are captured through high-definition smartphone cameras focusing from different angles.

    Data source location: Local wholesale and retail fruit markets located in six geographically distributed districts of Bangladesh, namely Chapai Nawabganj, Dhaka, Panchagarh, Rajshahi, Rangpur, and Satkhira which are renowned for diverse mango cultivation and availability.

    Where applicable: Training and evaluating machine learning and deep learning models to identify and classify mango varieties in Bangladesh which can be useful in smart horticulture, precision farming, supply chain automation, ecology and ecosystem health monitoring, and biodiversity and conservation efforts.

  12. Image Augmentation And Annotation Dataset

    • universe.roboflow.com
    zip
    Updated Jun 24, 2022
    Cite
    Vico (2022). Image Augmentation And Annotation Dataset [Dataset]. https://universe.roboflow.com/vico/image-augmentation-and-annotation/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 24, 2022
    Dataset authored and provided by
    Vico
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    Image Augmentation And Annotation

    ## Overview
    
    Image Augmentation And Annotation is a dataset for object detection tasks - it contains Objects annotations for 431 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  13. Data_Sheet_1_Inside out: transforming images of lab-grown plants for machine...

    • frontiersin.figshare.com
    pdf
    Updated Jul 6, 2023
    Cite
    Alexander E. Krosney; Parsa Sotoodeh; Christopher J. Henry; Michael A. Beck; Christopher P. Bidinosti (2023). Data_Sheet_1_Inside out: transforming images of lab-grown plants for machine learning applications in agriculture.pdf [Dataset]. http://doi.org/10.3389/frai.2023.1200977.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jul 6, 2023
    Dataset provided by
    Frontiers
    Authors
    Alexander E. Krosney; Parsa Sotoodeh; Christopher J. Henry; Michael A. Beck; Christopher P. Bidinosti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Machine learning tasks often require a significant amount of training data for the resultant network to perform suitably for a given problem in any domain. In agriculture, dataset sizes are further limited by phenotypical differences between two plants of the same genotype, often as a result of different growing conditions. Synthetically-augmented datasets have shown promise in improving existing models when real data is not available.

    Methods: In this paper, we employ a contrastive unpaired translation (CUT) generative adversarial network (GAN) and simple image processing techniques to translate indoor plant images to appear as field images. While we train our network to translate an image containing only a single plant, we show that our method is easily extendable to produce multiple-plant field images.

    Results: Furthermore, we use our synthetic multi-plant images to train several YOLOv5 nano object detection models to perform the task of plant detection and measure the accuracy of the model on real field data images.

    Discussion: The inclusion of training data generated by the CUT-GAN leads to better plant detection performance compared to a network trained solely on real data.

  14. ELEVATER Dataset

    • paperswithcode.com
    • library.toponeai.link
    Cite
    Chunyuan Li; Haotian Liu; Liunian Harold Li; Pengchuan Zhang; Jyoti Aneja; Jianwei Yang; Ping Jin; Houdong Hu; Zicheng Liu; Yong Jae Lee; Jianfeng Gao, ELEVATER Dataset [Dataset]. https://paperswithcode.com/dataset/elevater
    Explore at:
    Authors
    Chunyuan Li; Haotian Liu; Liunian Harold Li; Pengchuan Zhang; Jyoti Aneja; Jianwei Yang; Ping Jin; Houdong Hu; Zicheng Liu; Yong Jae Lee; Jianfeng Gao
    Description

    The ELEVATER benchmark is a collection of resources for training, evaluating, and analyzing language-image models on image classification and object detection. ELEVATER consists of:

    • Benchmark: a benchmark suite that consists of 20 image classification datasets and 35 object detection datasets, augmented with external knowledge.
    • Toolkit: an automatic hyper-parameter tuning toolkit; strong language-augmented, efficient model adaptation methods.
    • Baseline: pre-trained language-free and language-augmented visual models.
    • Knowledge: a platform to study the benefit of external knowledge for vision problems.
    • Evaluation Metrics: sample-efficiency (zero-, few-, and full-shot) and parameter-efficiency.
    • Leaderboard: a public leaderboard to track performance on the benchmark.

    The ultimate goal of ELEVATER is to drive research in the development of language-image models to tackle core computer vision problems in the wild.

  15. SmallFishBD: A Comprehensive Image Dataset of Common Small Fish Varieties in...

    • data.mendeley.com
    Updated Nov 28, 2024
    + more versions
    Cite
    Md Hasanul Ferdaus (2024). SmallFishBD: A Comprehensive Image Dataset of Common Small Fish Varieties in Bangladesh for Species Identification and Classification [Dataset]. http://doi.org/10.17632/8jvxtvz52x.2
    Explore at:
    Dataset updated
    Nov 28, 2024
    Authors
    Md Hasanul Ferdaus
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    Type of data: 320x320 px fish images.

    Data format: JPEG.

    Contents of the dataset: Varieties of small fishes in Bangladesh.

    Number of classes: Ten small fish varieties - (1) Bele, (2) Nama Chanda, (3) Chela, (4) Guchi, (5) Kachki, (6) Mola, (7) Kata Phasa, (8) Pabda, (9) Puti, and (10) Tengra.

    Number of images: (A) Total images in the original dataset (SmallFishBD) = 1,700. (B) Total images in the augmented dataset (Augmented SmallFishBD) = 20,400.

    Distribution of instances: (A) Images in each fish category of the original dataset (SmallFishBD): Bele = 205, Nama Chanda = 110, Chela = 190, Guchi = 164, Kachki = 247, Mola = 179, Kata Phasa = 129, Pabda = 125, Puti = 218, Tengra = 133. (B) Images in each fish category of the augmented dataset (Augmented SmallFishBD): Bele = 2,460, Nama Chanda = 1,320, Chela = 2,280, Guchi = 1,968, Kachki = 2,964, Mola = 2,148, Kata Phasa = 1,548, Pabda = 1,500, Puti = 2,616, Tengra = 1,596.

    Dataset size: (A) Total size of the original dataset (SmallFishBD) = 36.2 MB and the ZIP compressed size = 28.4 MB. (B) Total size of the augmented dataset (Augmented SmallFishBD) = 617 MB and the ZIP compressed size = 527 MB.

    Data acquisition process: Images of various small fish categories are captured through high-definition smartphone cameras focusing from different angles.

    Data source location: Local wholesale fish markets located in different areas of Dhaka, Bangladesh.

    Where applicable: Training and evaluating machine learning and deep learning models to identify and classify small fish species in Bangladesh which can be useful in aquaculture development, fisheries management and sustainable fishing, ecology and ecosystem health monitoring, and biodiversity and conservation efforts.

  16. Visual Search Technology Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 24, 2025
    Cite
    Data Insights Market (2025). Visual Search Technology Report [Dataset]. https://www.datainsightsmarket.com/reports/visual-search-technology-1983000
    Explore at:
    Available download formats: pdf, doc, ppt
    Dataset updated
    May 24, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The visual search technology market is experiencing robust growth, driven by the increasing adoption of e-commerce, the proliferation of smartphones with advanced camera capabilities, and the rising demand for enhanced user experiences in online shopping and information retrieval. The market, estimated at $15 billion in 2025, is projected to witness a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching an estimated $75 billion by 2033. This expansion is fueled by several key trends, including advancements in the artificial intelligence (AI) and machine learning (ML) algorithms that power visual search engines, improvements in image recognition accuracy, and the integration of visual search into applications beyond e-commerce, such as social media and augmented reality (AR) experiences. Companies like Google, Amazon, and Microsoft are heavily investing in this technology, driving further innovation and market penetration. However, challenges remain, including the need for improved data privacy measures, addressing biases in algorithms, and overcoming the limitations of handling complex visual queries.

    The segmentation of the visual search technology market reveals a diverse landscape. The market is categorized by technology type (image recognition, object detection, etc.), application (e-commerce, social media, healthcare, etc.), and deployment mode (cloud, on-premise, etc.). Leading players like Microsoft, Google, and Amazon leverage their extensive data resources and technological prowess to dominate the market. Smaller, specialized companies, including Clarifai, Syte, and others, are focusing on niche applications and are contributing to market innovation. Geographic growth is expected to be broadly distributed, with North America and Europe leading initially, followed by rapid expansion in Asia-Pacific and other emerging markets as internet penetration and smartphone adoption increase. The competitive landscape is dynamic, with established tech giants and innovative startups vying for market share through product differentiation, strategic partnerships, and acquisitions.

  17. Elephant - Thermal Images

    • kaggle.com
    Updated Jan 29, 2025
    Cite
    shijo john (2025). Elephant - Thermal Images [Dataset]. https://www.kaggle.com/datasets/shijo96john/elephant-thermal-images/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 29, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    shijo john
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Objective: The goal is to detect elephants in thermal images using the YOLOv8 (You Only Look Once version 8) deep learning model. This detection system can have applications in wildlife monitoring, preventing human-elephant conflict, and conservation efforts.

    Fine-Tuning with YOLOv8n.pt

    1. Model Selection:

    ○ YOLOv8n (n stands for nano) was chosen due to its lightweight architecture, making it ideal for deployment in resource-constrained environments such as drones or edge devices.

    2. Fine-Tuning Approach:

    ○ Pre-trained Weights: Fine-tuning was performed using the pre-trained YOLOv8n model (yolov8n.pt), leveraging its robust feature extraction capabilities.
    ○ Transfer Learning: The pre-trained weights were adapted to the specific task of detecting elephants in thermal imagery by training on the custom dataset.

    3. Training Process:

    ○ Epochs: Trained for a sufficient number of epochs to achieve convergence (e.g., 50–100 epochs depending on dataset size).
    ○ Learning Rate: Adjusted using a warm-up strategy, starting with a smaller rate and increasing gradually.
    ○ Optimizer: Utilized the AdamW optimizer for faster convergence.
    ○ Data Augmentation: Techniques like flipping, scaling, rotation, and noise injection were applied to improve the model's generalization.

    4. Performance Metrics:

    ○ mAP (mean Average Precision): Evaluated at IoU thresholds of 0.5 and 0.5:0.95.
    ○ Precision and Recall: Analyzed to ensure minimal false positives and high detection rates.
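    A minimal sketch of this recipe with the Ultralytics API; the epoch count, learning rate, and dataset YAML name are illustrative placeholders, not the author's exact settings:

    ```python
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")            # pre-trained nano weights
    model.train(
        data="elephant-thermal.yaml",     # hypothetical dataset config (paths + class names)
        epochs=100,
        imgsz=640,
        optimizer="AdamW",
        lr0=1e-3,                         # initial learning rate
        warmup_epochs=3,                  # gradual warm-up of the learning rate
    )
    metrics = model.val()                 # reports mAP50 and mAP50-95
    ```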

    Dataset Sourced from Roboflow

    1. Source and Accessibility:

    ○ The dataset was sourced from Roboflow, a platform providing pre-annotated thermal images of elephants.

    2. Dataset Composition:

    ○ Includes thermal images capturing elephants in diverse postures, distances, and environmental conditions.
    ○ Balanced dataset with positive (elephant present) and negative (no elephant) samples.

    3. Annotations:

    ○ Bounding box annotations for elephant detection.
    ○ Annotation format: YOLO-compatible (text files with class ID and bounding box coordinates).

    4. Training/Validation Split:

    ○ Split into 80% training, 10% validation, and 10% test sets.

    Dataset Analysis and Resolution

    1. Resolution:

    ○ Images in the dataset have a resolution of 640x640 pixels, resized from their original dimensions to ensure compatibility with YOLOv8 input requirements.

    2. Class Distribution:

    ○ A single class for elephants (Class ID: 0).
    ○ Total samples: ~5000 images (example count; adjust based on the actual dataset).

    3. Challenges Identified:

    ○ Low Contrast: Thermal images often have lower contrast, which can make object detection challenging.
    ○ Noise: Some images contained thermal noise, which was mitigated during pre-processing.
    ○ Small Object Detection: Instances of elephants far from the camera required additional focus during training.

    4. Pre-processing:

    ○ Normalized pixel values between 0 and 1.
    ○ Noise reduction applied using Gaussian filtering.
    ○ Images augmented with techniques such as brightness adjustment and random cropping.

    5. Validation Metrics:

    ○ Distribution across training, validation, and test sets ensured minimal class imbalance.
    ○ Random sampling ensured diversity in angles, environments, and distances.

  18. Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach...

    • data.mendeley.com
    Updated Mar 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Md. Mehedi Hasan (2025). Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach to Infant Facial Emotion Recognition [Dataset]. http://doi.org/10.17632/hy969mrx9p.1
    Explore at:
    Dataset updated
    Mar 10, 2025
    Authors
    Md. Mehedi Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is a meticulously curated dataset designed for infant facial emotion recognition, featuring four primary emotional expressions: Angry, Cry, Laugh, and Normal. The dataset aims to facilitate research in machine learning, deep learning, affective computing, and human-computer interaction by providing a large collection of labeled infant facial images.

    Primary Data (1,600 Images):

    • Angry: 400
    • Cry: 400
    • Laugh: 400
    • Normal: 400

    Data Augmentation & Expanded Dataset (26,143 Images): To enhance the dataset's robustness and expand the dataset, 20 augmentation techniques (including HorizontalFlip, VerticalFlip, Rotate, ShiftScaleRotate, BrightnessContrast, GaussNoise, GaussianBlur, Sharpen, HueSaturationValue, CLAHE, GridDistortion, ElasticTransform, GammaCorrection, MotionBlur, ColorJitter, Emboss, Equalize, Posterize, FogEffect, and RainEffect) were applied randomly. This resulted in a significantly larger dataset with:

    • Angry: 5,781
    • Cry: 6,930
    • Laugh: 6,870
    • Normal: 6,562
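    The technique names above match Albumentations transforms; a sketch of randomly applying one of them per image (the parameters and exact sampling scheme are assumptions; the authors only state that the 20 techniques were applied randomly):

    ```python
    import albumentations as A

    augment = A.Compose([
        A.OneOf([
            A.HorizontalFlip(), A.VerticalFlip(), A.Rotate(limit=30),
            A.ShiftScaleRotate(), A.RandomBrightnessContrast(), A.GaussNoise(),
            A.GaussianBlur(), A.Sharpen(), A.HueSaturationValue(), A.CLAHE(),
            A.GridDistortion(), A.ElasticTransform(),
            A.RandomGamma(),      # "GammaCorrection"
            A.MotionBlur(), A.ColorJitter(), A.Emboss(), A.Equalize(),
            A.Posterize(),
            A.RandomFog(),        # "FogEffect"
            A.RandomRain(),       # "RainEffect"
        ], p=1.0),
    ])
    # augmented = augment(image=image)["image"]
    ```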

    Data Collection & Ethical Considerations: The dataset was collected under strict ethical guidelines to ensure compliance with privacy and data protection laws. Key ethical considerations include:

    1. Ethical Approval: The study was reviewed and approved by the Institutional Review Board (IRB) of Daffodil International University under Reference No: REC-FSIT-2024-11-10.
    2. Informed Parental Consent: Written consent was obtained from parents before capturing and utilizing infant facial images for research purposes.
    3. Privacy Protection: No personally identifiable information (PII) is included in the dataset, and images are strictly used for research in AI-driven emotion recognition.

    Data Collection Locations & Geographical Diversity: To ensure diversity in infant facial expressions, data collection was conducted across multiple locations in Bangladesh, covering healthcare centers and educational institutions:

    1. 250-bed District Sadar Hospital, Sherpur (Latitude: 25.019405 & Longitude: 90.013733)
    2. Upazila Health Complex, Baraigram, Natore (Latitude: 24.3083 & Longitude: 89.1700)
    3. Char Bhabna Community Clinic, Sherpur (Latitude: 25.0188 & Longitude: 90.0175)
    4. Jamiatul Amin Mohammad Al-Islamia Cadet Madrasa, Khagan, Dhaka (Latitude: 23.872856 & Longitude: 90.318947)

    Face Detection Methodology: To extract the facial regions efficiently, RetinaNet—a deep learning-based object detection model—was employed. The use of RetinaNet ensures precise facial cropping while minimizing background noise and occlusions.

    Potential Applications:

    1. Affective Computing: Understanding infant emotions for smart healthcare and early childhood development.
    2. Computer Vision: Training deep learning models for automated infant facial expression recognition.
    3. Pediatric & Mental Health Research: Assisting in early autism screening and emotion-aware AI for child psychology.
    4. Human-Computer Interaction (HCI): Designing AI-powered assistive technologies for infants.

  19. Dental OPG XRAY Dataset

    • data.mendeley.com
    Updated Aug 27, 2024
    + more versions
    Cite
    Rubaba Binte Rahman (2024). Dental OPG XRAY Dataset [Dataset]. http://doi.org/10.17632/c4hhrkxytw.4
    Explore at:
    Dataset updated
    Aug 27, 2024
    Authors
    Rubaba Binte Rahman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes dental OPG X-rays collected from three different dental clinics. This dataset can be used for tasks like object detection, image analysis, disease classification, and segmentation. It has two folders: the object detection dataset folder and the classification dataset folder. The object detection folder contains 232 original and 604 augmented images and labels. The classification folder contains six distinct files for each class. The images are in JPG format, and the labels are in JSON format. The augmented data is split into training, validation, and testing sets in an 80:10:10 ratio.

    Dataset collection: • Source: Prescription Point Ltd, Lab Aid Specialized Hospital, Ibn Sina Diagnostic and Imaging Center. • Capture Method: Using android phone camera. • Anonymization: All data were rigorously anonymized to maintain confidentiality and privacy. • Informed Consent: All patients provided their consent in accordance with the dental ethical principles.

    Dataset composition: • Total Participants: 232 Male and female patients aged 10 years or older.

    Variables: • Healthy Teeth: 223 • Caries: 119 • Impacted Teeth: 87 • Broken Down Crown/ Root: 52 • Infection: 23 • Fractured Teeth: 13

  20. Urban_Tree_Detection_Dataset

    • kaggle.com
    Updated Dec 2, 2024
    Cite
    Mcii34 (2024). Urban_Tree_Detection_Dataset [Dataset]. https://www.kaggle.com/datasets/mcii34/urbantree-subset-public
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 2, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mcii34
    Description

    The tree_detection_dataset is a subset of the original tree classification dataset, accessible at this link. This dataset includes only the original images, each with a resolution of 1200x1600. A further refined subset, yolo11, includes bounding box annotations specifically for object detection. All augmentations and processing were performed using Roboflow.

    Dataset Configuration

    The dataset comprises 2,716 images, divided as follows:

    • Training Set: 87% (2,376 images)
    • Validation Set: 8% (227 images)
    • Test Set: 5% (113 images)

    To ensure uniformity in aspect ratio, all images have been resized to 640x640 pixels using a "fit" resizing approach, which may introduce black edges. Additionally, auto-orientation has been applied for consistency.

    Data Preprocessing and Augmentation

    Several augmentation techniques have been applied to enhance model robustness and generalization. Each training example generates three augmented outputs, expanding the diversity of the dataset. The augmentations include:

    • Flip: Horizontal flipping.
    • Rotation:
      • 90° rotations: Clockwise, counterclockwise, and upside-down.
      • Minor random rotations: Between -3° and +3° for slight variations.
    • Grayscale: Applied to 12% of images to simulate different lighting conditions.
    • Blur: Up to 1.5 pixels to mimic focus variations.
    • Noise: Added to up to 0.1% of pixels to introduce subtle distortions.
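    For reference, the described "fit" resize (scale to 640x640 while preserving aspect ratio, padding the remainder with black edges) can be reproduced with Pillow; a minimal sketch:

    ```python
    from PIL import Image, ImageOps

    def fit_resize(path, size=(640, 640)):
        """Scale the image to fit inside `size`, padding with black edges."""
        return ImageOps.pad(Image.open(path), size, color=(0, 0, 0))
    ```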