6 datasets found
  1. Sentinel-2 KappaZeta Cloud and Cloud Shadow Masks

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    pdf, zip
    Updated Jul 18, 2024
    Cite
    Marharyta Domnich; Kaupo Voormansik; Olga Wold; Fariha Harun; Indrek Sünter; Heido Trofimov; Anton Kostiukhin; Mihkel Järveoja (2024). Sentinel-2 KappaZeta Cloud and Cloud Shadow Masks [Dataset]. http://doi.org/10.5281/zenodo.5095024
    Explore at:
    Available download formats: zip, pdf
    Dataset updated
    Jul 18, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marharyta Domnich; Kaupo Voormansik; Olga Wold; Fariha Harun; Indrek Sünter; Heido Trofimov; Anton Kostiukhin; Mihkel Järveoja
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General information

    The dataset consists of 4403 labelled subscenes from 155 Sentinel-2 (S2) Level-1C (L1C) products distributed over the Northern European terrestrial area. Each S2 product was oversampled to 10 m resolution and divided into 512 x 512 pixel subscenes. Six L1C S2 products were labelled in full; for the other 149 S2 products, the ~10 most challenging subscenes per product were selected for labelling. In total the dataset contains 4403 labelled Sentinel-2 subscenes, each 512 x 512 pixels at 10 m resolution. The dataset comprises around 30 S2 products per month from April to August and 3 S2 products per month for September and October. Each selected L1C S2 product contains different cloud types, such as cumulus, stratus, or cirrus, spread over various geographical locations in Northern Europe.

    The pixel-wise classification map consists of the following categories (a short usage sketch follows the list):

    • 0 – MISSING: missing or invalid pixels;
    • 1 – CLEAR: pixels without clouds or cloud shadows;
    • 2 – CLOUD SHADOW: pixels with cloud shadows;
    • 3 – SEMI TRANSPARENT CLOUD: pixels with thin clouds through which the land is visible; includes cirrus clouds at the high cloud level (5-15 km);
    • 4 – CLOUD: pixels with opaque cloud; includes stratus and cumulus clouds at the low cloud level (cloud base from 0-0.2 km up to 2 km);
    • 5 – UNDEFINED: pixels that the labeller could not confidently assign to any class.
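
    As a rough usage sketch (not part of the dataset itself; the actual file format and reader are documented in the README and are not assumed here), the class indices above can be used to summarise a label mask once it has been loaded as a 2-D integer array, for example in Python:

    import numpy as np

    # Class indices as defined in the dataset description.
    CLASSES = {
        0: "MISSING",
        1: "CLEAR",
        2: "CLOUD_SHADOW",
        3: "SEMI_TRANSPARENT_CLOUD",
        4: "CLOUD",
        5: "UNDEFINED",
    }

    def class_distribution(mask):
        """Return the fraction of pixels per class for one 512 x 512 label mask."""
        values, counts = np.unique(mask, return_counts=True)
        return {CLASSES.get(int(v), "unknown_{}".format(int(v))): c / mask.size
                for v, c in zip(values, counts)}

    # Synthetic example only; replace with a mask loaded as described in the README.
    mask = np.random.randint(0, 6, size=(512, 512))
    print(class_distribution(mask))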

    The dataset was labelled using the Computer Vision Annotation Tool (CVAT) and Segments.ai. Because Segments.ai supports an active-learning workflow, the labelling was performed semi-automatically.

    The following dataset limitations must be considered: the data covers only terrestrial regions and does not include water areas; winter conditions are not represented; the dataset represents summer conditions, so September and October contain only test products used for validation. The current subscenes do not carry georeferencing; we are working towards including it in the next version.

    More details about the dataset structure can be found in the README.

    Contributions and Acknowledgements

    The data were annotated by Fariha Harun and Olga Wold. Data verification and software development were performed by Indrek Sünter, Heido Trofimov, Anton Kostiukhin, Marharyta Domnich, Mihkel Järveoja, and Olga Wold. The methodology was developed by Kaupo Voormansik, Indrek Sünter, and Marharyta Domnich.
    We would like to thank the Segments.ai team for prompt and individual customer support. We are grateful to the European Space Agency for reviews and suggestions. We would also like to thank Prof. Gholamreza Anbarjafari for feedback and directions.
    The project was funded by the European Space Agency, Contract No. 4000132124/20/I-DT.

  2. VINEyard Piacenza Image Collections - VINEPICs

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 3, 2023
    Cite
    Poni, Stefano (2023). VINEyard Piacenza Image Collections - VINEPICs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7866441
    Explore at:
    Dataset updated
    Jul 3, 2023
    Dataset provided by
    Gatti, Matteo
    Bertoglio, Riccardo
    Poni, Stefano
    Matteucci, Matteo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Piacenza
    Description

    For a detailed description of this dataset, based on the Datasheets for Datasets framework (Gebru, Timnit, et al. "Datasheets for datasets." Communications of the ACM 64.12 (2021): 86-92), see the VINEPICs_datasheet.md file.

    For what purpose was the dataset created? VINEPICs was developed specifically for the purpose of detecting grape bunches in RGB images and facilitating tasks such as object detection, semantic segmentation, and instance segmentation. The detection of grape bunches serves as the initial phase in an analysis pipeline designed for vine plant phenotyping. The dataset encompasses a wide range of lighting conditions, camera orientations, plant defoliation levels, species variations, and cultivation methods. Consequently, this dataset presents an opportunity to explore the influence of each source of variability on grape bunch detection.

    What do the instances that comprise the dataset represent? The dataset consists of RGB images showing various vine plants. Specifically, the images represent three different Vitis vinifera varieties:

    • Red Globe, a table grape
    • Cabernet Sauvignon, a red wine grape
    • Ortrugo, a white wine grape

    These images have been collected over different years and dates at the vineyard facility of Università Cattolica del Sacro Cuore in Piacenza, Italy. You can find the images stored in the "data/images" directory, organized into subdirectories based on the starting time of data collection, indicating the day (and, if available, the approximate time in minutes). Images collected in 2022 are named using timestamps with nanosecond precision.

    Is there a label or target associated with each instance? Each image has undergone manual annotation using the Computer Vision Annotation Tool (CVAT) (https://github.com/opencv/cvat). Grape bunches have been meticulously outlined with polygon annotations. These annotations belong to a single class, "bunch," and have been saved in a JSON file using the COCO Object Detection format, including segmentation masks (https://cocodataset.org/#format-data).
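
    As an illustrative sketch (the file path below is a placeholder, not the dataset's actual file name), COCO-format polygon annotations of this kind can be read with pycocotools:

    from pycocotools.coco import COCO  # pip install pycocotools

    # Placeholder path; point this at the COCO JSON shipped with VINEPICs.
    coco = COCO("annotations/instances.json")

    # The description states a single annotation class, "bunch".
    cat_ids = coco.getCatIds(catNms=["bunch"])
    img_ids = coco.getImgIds(catIds=cat_ids)

    first_img = coco.loadImgs(img_ids[0])[0]
    ann_ids = coco.getAnnIds(imgIds=first_img["id"], catIds=cat_ids)
    anns = coco.loadAnns(ann_ids)
    print("{}: {} bunch annotations".format(first_img["file_name"], len(anns)))

    # Polygon segmentations can be rasterised to binary masks when needed.
    masks = [coco.annToMask(a) for a in anns]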

    What mechanisms or procedures were used to collect the data? The data was collected using a D435 Intel Realsense camera, which was mounted on a four-wheeled skid-steering robot. The robot was teleoperated during the data collection process. The data was recorded by streaming the camera's feed into rosbag format. Specifically, the camera was connected via a USB 3.0 interface to a PC running Ubuntu 18.04 and ROS Melodic.
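
    As a hedged sketch of working with such recordings (the bag file name and camera topic below are assumptions, not taken from the dataset), RGB frames can be extracted from a rosbag in a ROS Melodic environment roughly as follows:

    import cv2
    import rosbag
    from cv_bridge import CvBridge

    bridge = CvBridge()
    with rosbag.Bag("vinepics_recording.bag") as bag:  # placeholder file name
        # Topic name is an assumption; list the available topics with `rosbag info` first.
        for topic, msg, t in bag.read_messages(topics=["/camera/color/image_raw"]):
            frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
            cv2.imwrite("frame_{}.png".format(t.to_nsec()), frame)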

  3. Garrulus Field-D Semantic Segmentation Dataset

    • zenodo.org
    bin
    Updated May 21, 2025
    Cite
    Mohammad Wasil; Ahmad Drak; Brennan Penfold; Ludovico Scarton; Maximilian Johenneken; Alexander Asteroth; Sebastian Houben (2025). Garrulus Field-D Semantic Segmentation Dataset [Dataset]. http://doi.org/10.5281/zenodo.15480886
    Explore at:
    Available download formats: bin
    Dataset updated
    May 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mohammad Wasil; Ahmad Drak; Brennan Penfold; Ludovico Scarton; Maximilian Johenneken; Alexander Asteroth; Sebastian Houben
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Garrulus Field-D dataset represents a 0.3-hectare post-harvest area located in the Arnsberg Forest, Germany. Data was acquired using an Unmanned Aerial Vehicle (UAV), and the area was reconstructed into a geo-referenced RGB orthomosaic with a spatial resolution of approximately 10 cm per pixel.

    Object annotations were created using the Computer Vision Annotation Tool (CVAT), covering four classes:

    • Coarse Woody Debris (CWD)

    • Tree Stumps (STUMP)

    • Vegetation

    • MISCELLANEOUS (MISC) — used for ground sampling point markers

    This particular dataset contains the pre-processed tensor files (for both training and testing), which were generated from the RGB orthomosaic using our custom tool, the Garrulus Dataset Library (GDL).
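
    As a cautious sketch only (the exact layout of the tensor files is defined by the GDL tool and is not assumed here; the file name below is a placeholder, and PyTorch serialisation is only a guess at the "bin" format), a first inspection could look like:

    import torch

    sample = torch.load("field_d_train.pt", map_location="cpu")  # placeholder name

    # Inspect whatever structure the file actually contains before assuming keys.
    if isinstance(sample, dict):
        for key, value in sample.items():
            print(key, getattr(value, "shape", type(value)))
    else:
        print(type(sample))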

    Please see our GitHub repository for the code used to pre-process this dataset: https://github.com/garrulus-project/sam_peft/

    Please note that the original orthomosaic files (.tif) will be made available in a separate publication.

    This dataset is published alongside our paper, which was accepted at the ICRA 2025 Workshop on Novel Approaches for Precision Agriculture and Forestry with Autonomous Robots.

    📘 If you use this dataset in your work, please cite our paper:
    Parameter-Efficient Fine-Tuning of Vision Foundation Model for Forest Floor Segmentation from UAV Imagery
    Accepted at the ICRA 2025 Workshop; available on arXiv: https://arxiv.org/abs/2505.08932

    @misc{wasil2025peftsam,
    title = {{Parameter-Efficient Fine-Tuning of Vision Foundation Model for Forest Floor Segmentation from UAV Imagery}},
    author = {Mohammad Wasil and Ahmad Drak and Brennan Penfold and Ludovico Scarton and Maximilian Johenneken and Alexander Asteroth and Sebastian Houben},
    year = {2025},
    eprint = {2505.08932},
    archivePrefix = {arXiv},
    primaryClass = {cs.RO},
    url = {https://arxiv.org/abs/2505.08932},
    note = {Accepted to the Novel Approaches for Precision Agriculture and Forestry with Autonomous Robots, IEEE ICRA Workshop 2025}
    }

  4. Rooftop Drainage Outlets and Ventilations Dataset

    • zenodo.org
    zip
    Updated Apr 1, 2025
    Cite
    Lukas Arzoumanidis*; Julius Knechtel*; Gizem Sen*; Weilian Li; Youness Dehbi (2025). Rooftop Drainage Outlets and Ventilations Dataset [Dataset]. http://doi.org/10.5281/zenodo.14040571
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lukas Arzoumanidis*; Julius Knechtel*; Gizem Sen*; Weilian Li; Youness Dehbi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Authors marked with an asterisk (*) have contributed equally to this publication.

    We annotated a dataset for the detection of drainage outlets and ventilations on flat rooftops. The underlying high-resolution aerial images are orthophotos with a ground sampling distance of 7.5 cm, provided by the Office for Land Management and Geoinformation of the City of Bonn, Germany. The dataset was created through manual annotation using the Computer Vision Annotation Tool (CVAT) and comprises 740 image pairs. Each pair consists of a rooftop image and a corresponding annotated mask indicating the drainage outlets and ventilations. Since rooftops vary in size, we aimed to create image pairs that capture a single rooftop per image without overlaps or cutoffs. Consequently, the dimensions of each image pair differ. The dataset is split randomly into 80% for training, 10% for validation, and 10% for testing.

    We provide the dataset in the Common Objects in Context (COCO) format for object detection tasks. In addition to the COCO-formatted dataset, we provide it in its original, pairwise format to support various machine learning tasks, such as semantic segmentation and panoptic segmentation, as well as to accommodate different data-loading requirements for diverse deep learning models.

    If your object detection approach requires the 'category_id' to start from 0 instead of 1, please refer to the following guide: https://github.com/obss/sahi/discussions/336
    For conversion to a completely different dataset format, such as YOLO, please see the repository: https://github.com/ultralytics/JSON2YOLO
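
    If a 0-based 'category_id' is needed, a minimal remapping sketch along the lines of the guide above (file names are placeholders) is:

    import json

    # Placeholder paths; shift COCO category ids from 1-based to 0-based.
    with open("annotations_coco.json") as f:
        coco = json.load(f)

    for cat in coco["categories"]:
        cat["id"] -= 1
    for ann in coco["annotations"]:
        ann["category_id"] -= 1

    with open("annotations_coco_zero_based.json", "w") as f:
        json.dump(coco, f)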

  5. BlueberryDCM: A Canopy Image Dataset for Detection, Counting, and Maturity...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Oct 28, 2024
    Cite
    Yuzhen Lu (2024). BlueberryDCM: A Canopy Image Dataset for Detection, Counting, and Maturity Assessment of Blueberries [Dataset]. http://doi.org/10.5281/zenodo.14002517
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 28, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yuzhen Lu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    Oct 28, 2024
    Description

    The BlueberryDCM dataset consists of 140 RGB images of blueberry canopies captured at varied spatial scales. All the images were acquired with smartphones under natural field light conditions in different orchards during the 2022 season, with 134 images from Mississippi and 6 images from Michigan. A total of 17,955 bounding box annotations were created manually in the VGG Image Annotator (VIA) (v2.0.12) for blueberry instances of two fruit maturity classes, "Blue" and "Unblue", representing ripe and unripe fruit, respectively. In addition, each maturity class has two sub-categories in the annotation, "visible" and "occluded", indicating whether the fruit is fully visible in the canopy or partially occluded. The original annotation format exported from VIA is .json. Annotation files derived in two other formats, .xml (Pascal VOC format) and .txt (YOLO format with normalized xywh, where 0, 1, 2, and 3 denote the four categories "Unblue_visible", "Unblue_occluded", "Blue_visible", and "Blue_occluded" blueberries, respectively), are provided for compatibility with a wide range of object detectors. Hence, the dataset contains both the raw images (.jpg) and three corresponding annotation files (.json, .xml, and .txt) with the same file names, totaling about 107 MB in file size.
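
    As a small sketch of consuming the YOLO-format labels (the file name and image size in the example are placeholders), each .txt line can be converted back to pixel-space boxes as follows:

    from pathlib import Path

    # Class ids as stated in the dataset description.
    CLASS_NAMES = ["Unblue_visible", "Unblue_occluded", "Blue_visible", "Blue_occluded"]

    def read_yolo_labels(txt_path, img_width, img_height):
        """Parse one YOLO label file (normalized xywh) into pixel-space boxes."""
        boxes = []
        for line in Path(txt_path).read_text().splitlines():
            cls, xc, yc, w, h = line.split()
            w, h = float(w) * img_width, float(h) * img_height
            x_min = float(xc) * img_width - w / 2
            y_min = float(yc) * img_height - h / 2
            boxes.append((CLASS_NAMES[int(cls)], x_min, y_min, w, h))
        return boxes

    # Example call (placeholder file name and image size):
    # print(read_yolo_labels("canopy_001.txt", 4032, 3024))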

    The dataset was used in a study (see below) evaluating YOLOv8 and YOLOv9 models for blueberry detection, counting, and maturity assessment. YOLOv8l achieved a detection accuracy of 93% mAP@50, with an error of about 10 blueberries in fruit counting and an error of 3.6% in estimating the "Blue" fruit percentage. Software programs for the modeling work are publicly available at: https://github.com/vicdxxx/BlueberryDetectionAndCounting. In addition, the blueberry dataset was also used as a preliminary database for developing an iOS-based mobile application, described in Deng, B., Lu, Y., WanderWeide, J., 2024. Development and preliminary evaluation of a deep learning-based fruit counting mobile application for highbush blueberries. 2024 ASABE Annual International Meeting, 2401022.
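
    As a generic sketch of running a YOLOv8 detector for counting (the weights file and image path are placeholders; the trained models from the study are described in the linked repository, and this is not the authors' exact pipeline):

    from ultralytics import YOLO  # pip install ultralytics

    model = YOLO("yolov8l.pt")            # placeholder weights
    results = model("canopy_image.jpg")   # placeholder image
    boxes = results[0].boxes
    print("Detected {} blueberries".format(len(boxes)))

    # Per-class counts; names come from whatever classes the loaded model defines.
    for cls_id in boxes.cls.unique():
        count = int((boxes.cls == cls_id).sum())
        print(model.names[int(cls_id)], count)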

    Details about the dataset curation and statistics, as well as the modeling experiments, are described in the journal article: Deng, B., Lu, Y., 2024. Detection, Counting, and Maturity Assessment of Blueberries in Canopy Images using YOLOv8 and YOLOv9. Smart Agricultural Technology. https://doi.org/10.1016/j.atech.2024.100620. If you use the dataset in published research, please consider citing the dataset or the journal article. We hope you find the dataset useful.

  6. Tiny Towns Scorer dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Dec 13, 2022
    Cite
    Alex Owens; Daniel Schoenbach; Payton Klemens (2022). Tiny Towns Scorer dataset [Dataset]. http://doi.org/10.5281/zenodo.7429657
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Dec 13, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alex Owens; Daniel Schoenbach; Payton Klemens
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the dataset and model used for Tiny Towns Scorer, a computer vision project completed as part of CS 4664: Data-Centric Computing Capstone at Virginia Tech. The goal of the project was to calculate player scores in the board game Tiny Towns.

    The dataset consists of 226 images and associated annotations, intended for object detection. The images are photographs of players' game boards over the course of a game of Tiny Towns, as well as photos of individual game pieces taken after the game. Photos were taken using hand-held smartphones. Images are in JPG and PNG formats. The annotations are provided in TFRecord 1.0 and CVAT for Images 1.1 formats.
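
    As a brief sketch for inspecting the TFRecord annotations (the file name is a placeholder and no particular feature schema is assumed):

    import tensorflow as tf

    dataset = tf.data.TFRecordDataset("tiny_towns_train.tfrecord")  # placeholder

    # Print the feature keys of the first record instead of assuming a schema.
    for raw_record in dataset.take(1):
        example = tf.train.Example()
        example.ParseFromString(raw_record.numpy())
        print(sorted(example.features.feature.keys()))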

    The weights for the trained RetinaNet portion of the model are also provided.
