30 datasets found
  1. COCO 2017

    • kaggle.com
    zip
    Updated Nov 14, 2024
    Cite
    Nikdintel (2024). COCO 2017 [Dataset]. https://www.kaggle.com/datasets/snikhilrao/coco-2017
    Explore at:
    zip (26884588931 bytes)
    Dataset updated
    Nov 14, 2024
    Authors
    Nikdintel
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    📌 What's Included:

    • Training Set: 118K images with annotations for detection, segmentation, and keypoints.
    • Validation Set: 5K images with full annotations for validation.
    • Testing Set: Images are divided into two splits—dev and challenge—replacing the four splits (dev, standard, reserve, challenge) used in previous years.
    • Stuff Annotations: Available for 40K images in the training set and 5K validation images, enabling semantic segmentation research.
    • Unlabeled Data: A set of 120K images with no annotations, mirroring the class distribution of the labeled data. This is ideal for exploring semi-supervised learning techniques.

    🔍 Key Changes in COCO 2017:

    • The train/val split was updated based on community feedback, now featuring 118K/5K images instead of the previous 83K/41K split.
    • While the annotations for detection and keypoints are consistent with previous years, additional stuff annotations were introduced in 2017.
    • Unlabeled data is now available for semi-supervised learning tasks, opening new avenues for experimentation.

    📂 Dataset Structure:

    • train2017: Images and annotations
    • val2017: Images and annotations
    • test2017: Images (no annotations provided)
    • unlabeled2017: Unlabeled images

    This dataset can be used for a variety of computer vision tasks, including object detection, instance segmentation, keypoint detection, semantic segmentation, and image captioning. Whether you're working on supervised or semi-supervised learning, this resource is designed to meet your needs.
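    For quick programmatic access, the annotations can be loaded with the standard pycocotools API. A minimal sketch follows; the annotation path is an assumption and may differ inside this particular Kaggle zip.

    from pycocotools.coco import COCO

    # Hypothetical path -- adjust to wherever the zip extracts the annotations.
    coco = COCO("annotations/instances_val2017.json")

    # List the category names, then fetch the annotations of one image.
    cats = coco.loadCats(coco.getCatIds())
    print([c["name"] for c in cats])

    img_id = coco.getImgIds()[0]
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
    print(f"image {img_id}: {len(anns)} annotations")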

  2. SIIM Covid19 512x512 png 1 category (COCO Format)

    • kaggle.com
    zip
    Updated Jul 4, 2021
    Cite
    jellybeanz (2021). SIIM Covid19 512x512 png 1 category (COCO Format) [Dataset]. https://www.kaggle.com/nyanswanaung/siim-covid19-512x512-png-1-category-coco-format
    Explore at:
    zip (1673596631 bytes)
    Dataset updated
    Jul 4, 2021
    Authors
    jellybeanz
    License

    CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    COCO Format dataset for SIIM Covid19 Object Detection Challenge

    The challenge data are provided in DICOM format; they were converted to NumPy arrays, resized to 512x512, and saved as PNG. The original CSV file was also converted to a COCO-format JSON file, producing train.json and val.json.

    The challenge dataset has four classes: Negative for Pneumonia, Typical Appearance, Indeterminate Appearance, and Atypical Appearance. However, the bounding boxes in the (image-level) competition data are given the label 'opacity' for all images, so there is just a single class (class-0) in the COCO annotations.

    I have modeled the problem using just one class, and renamed the label 'opacity' to 'Covid_Abnormality' for convenience in the dataset.

    train.json contains annotations for indices 1 to 5000 (5000 total) and val.json for indices 5001 to 7852 (2852 total).

    Code for converting the 512x512 PNGs to COCO format.

    The new_annotations folder contains annotation JSON files whose image names drop the '_image' suffix and use the .jpg extension.

    Example,

    dataset/annotations/train.json --> "000a312787f2_image.png"

    new_annotations/train.json --> "000a312787f2.jpg"
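    A minimal sketch of that renaming step, assuming the paths shown above; only the file_name entries change:

    import json

    with open("dataset/annotations/train.json") as f:
        coco = json.load(f)

    # "000a312787f2_image.png" -> "000a312787f2.jpg"
    for img in coco["images"]:
        img["file_name"] = img["file_name"].replace("_image", "").replace(".png", ".jpg")

    with open("new_annotations/train.json", "w") as f:
        json.dump(coco, f)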

    Acknowledgements

    Challenge link --> https://www.kaggle.com/c/siim-covid19-detection
    Resized 512x512 png dataset link --> https://www.kaggle.com/sreevishnudamodaran/siim-covid19-512-images-and-metadata

  3. Moon Challenge Dataset

    • universe.roboflow.com
    zip
    Updated Nov 16, 2022
    Cite
    COCO to YOLO (2022). Moon Challenge Dataset [Dataset]. https://universe.roboflow.com/coco-to-yolo-sybgr/moon-challenge/model/1
    Explore at:
    zip
    Dataset updated
    Nov 16, 2022
    Dataset authored and provided by
    COCO to YOLO
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Geomorphological Features Bounding Boxes
    Description

    Moon Challenge

    ## Overview
    
    Moon Challenge is a dataset for object detection tasks - it contains Geomorphological Features annotations for 4,973 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
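    As a sketch, the dataset can likely be pulled with the Roboflow Python package; the workspace and project slugs below are inferred from the dataset URL, and the API key is your own:
    
    from roboflow import Roboflow
    
    rf = Roboflow(api_key="YOUR_API_KEY")
    # Slugs inferred from the dataset URL above.
    project = rf.workspace("coco-to-yolo-sybgr").project("moon-challenge")
    dataset = project.version(1).download("coco")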
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. COCO 2017 TFRecords

    • kaggle.com
    zip
    Updated Aug 13, 2020
    Cite
    Karthikeyan Vijayan (2020). COCO 2017 TFRecords [Dataset]. https://www.kaggle.com/datasets/karthikeyanvijayan/coco-2017-tfrecords/code
    Explore at:
    zip (20202948610 bytes)
    Dataset updated
    Aug 13, 2020
    Authors
    Karthikeyan Vijayan
    Description

    COCO (Common Objects in COntext) is a popular dataset in computer vision. It contains annotations for several tasks: object detection, segmentation, keypoint detection, stuff segmentation, panoptic segmentation, DensePose, and image captioning. For more details, visit the COCO dataset website.

    Tensor Processing Unit (TPU) hardware accelerators are very fast; the challenge is often to feed them data fast enough to keep them busy. Google Cloud Storage (GCS) can sustain very high throughput, but as with all cloud storage systems, initiating a connection costs some network back-and-forth, so storing the data as thousands of individual files is not ideal. This dataset repackages COCO with its object detection annotations into a smaller number of files, so you can use the power of tf.data.Dataset to read from multiple files in parallel.

    TFRecord file format

    TensorFlow's preferred file format for storing data is the protobuf-based TFRecord format. Other serialization formats would work too, but you can load a dataset from TFRecord files directly by writing:

    filenames = tf.io.gfile.glob(FILENAME_PATTERN)
    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(...)

    For more details, see https://codelabs.developers.google.com/codelabs/keras-flowers-data/

    You can use the following code in your Kaggle notebook to get the Google Cloud Storage (GCS) path of any public Kaggle dataset.

    from kaggle_datasets import KaggleDatasets
    GCS_PATH = KaggleDatasets().get_gcs_path()

    View the notebook COCO Object Detection dataset in TFRecord to see how TFRecord files are created from the original COCO dataset.
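    As a hedged sketch of the full read path (the feature keys and file pattern below are illustrative assumptions -- check the creation notebook above for the actual schema used when the TFRecords were written):

    import tensorflow as tf

    # Hypothetical feature spec: the real key names and shapes depend on how
    # the TFRecords were serialized (see the creation notebook).
    feature_spec = {
        "image": tf.io.FixedLenFeature([], tf.string),  # encoded image bytes
        "bboxes": tf.io.VarLenFeature(tf.float32),      # flattened box coordinates
        "labels": tf.io.VarLenFeature(tf.int64),        # class ids
    }

    def parse_example(serialized):
        ex = tf.io.parse_single_example(serialized, feature_spec)
        image = tf.io.decode_jpeg(ex["image"], channels=3)
        bboxes = tf.reshape(tf.sparse.to_dense(ex["bboxes"]), [-1, 4])
        labels = tf.sparse.to_dense(ex["labels"])
        return image, bboxes, labels

    filenames = tf.io.gfile.glob(GCS_PATH + "/*.tfrec*")  # hypothetical pattern
    dataset = (tf.data.TFRecordDataset(filenames, num_parallel_reads=tf.data.AUTOTUNE)
               .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE))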

  5. Trojan Detection Software Challenge - object-detection-jul2022-train

    • catalog.data.gov
    • data.nist.gov
    Updated Mar 14, 2025
    Cite
    National Institute of Standards and Technology (2025). Trojan Detection Software Challenge - object-detection-jul2022-train [Dataset]. https://catalog.data.gov/dataset/trojan-detection-software-challenge-object-detection-jul2022-train
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    Round 10 Train Dataset

    This is the training data used to create and evaluate trojan detection software solutions. This data, generated at NIST, consists of object detection AIs trained on the COCO dataset. A known percentage of these trained AI models have been poisoned with a known trigger that induces incorrect behavior. The data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. The dataset consists of 144 AI models using a small set of model architectures. Half (50%) of the models have been poisoned with an embedded trigger that causes misclassification of the input when the trigger is present.

  6. CADOT Dataset

    • kaggle.com
    zip
    Updated May 28, 2025
    Cite
    Tú Hoàng (2025). CADOT Dataset [Dataset]. https://www.kaggle.com/datasets/grizmo/cadot-dataset
    Explore at:
    zip (206172834 bytes)
    Dataset updated
    May 28, 2025
    Authors
    Tú Hoàng
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The CADOT dataset is introduced as part of the Grand Challenge at IEEE ICIP 2025, aiming to push forward the development of advanced object detection techniques in remote sensing imagery, particularly focused on dense urban environments. The competition is organized by LabCom IRISER, in collaboration with IGN (Institut national de l'information géographique et forestière), and encourages the use of AI-based data augmentation to enhance model robustness.

    Competition Context

    The challenge calls for the detection of small objects in high-resolution optical satellite imagery, which is inherently complex due to occlusions, diverse object types, and varied urban layouts. Participants are expected to develop detection pipelines that are not only accurate but also robust under real-world remote sensing constraints.

    Dataset Description

    The CADOT dataset comprises high-resolution aerial images captured over a dense urban area in the Île-de-France region, France. Each image is carefully annotated with 14 object categories including buildings, roads, vehicles, trees, and various other urban components. The imagery comes from IGN and reflects a realistic and challenging setting for object detection models due to factors like shadows, perspective distortion, and dense object arrangements.

    • Data is derived from multi-view aerial imagery
    • Images are orthorectified to remove perspective distortion
    • Labeling is provided in geospatially aligned formats
    • Annotations are polygonal or bounding-box-based (depending on release phase)

    Data Reformatting

    To facilitate easier use of the dataset in machine learning workflows, I have reformatted the original data into the following versions:

    • Images in .jpg and .png format (cropped and full-frame)
    • Annotations converted to COCO JSON and YOLO format
    • Train/val/test splits based on geographic segmentation
    • A preview subset for quick experimentation

    License and Original Source

    For full licensing terms and official documentation, please refer to the official challenge page: 🔗 https://cadot.onrender.com/

  7. Evaluation of object detection models for hand region detection using the...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Sven Koitka; Aydin Demircioglu; Moon S. Kim; Christoph M. Friedrich; Felix Nensa (2023). Evaluation of object detection models for hand region detection using the Faster-RCNN InceptionResNetV2 pre-trained model. [Dataset]. http://doi.org/10.1371/journal.pone.0207496.t001
    Explore at:
    xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sven Koitka; Aydin Demircioglu; Moon S. Kim; Christoph M. Friedrich; Felix Nensa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results are stated as mean and standard deviation of ten different training set splits. The evaluation is performed on the held-out set of 89 images.

  8. Toloka Visual Question Answering Dataset

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Oct 10, 2023
    Cite
    Ustalov, Dmitry (2023). Toloka Visual Question Answering Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7057740
    Explore at:
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    Toloka (https://www.toloka.ai/)
    Authors
    Ustalov, Dmitry
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Our dataset consists of images associated with textual questions. One entry (instance) in the dataset is a question-image pair labeled with the ground-truth coordinates of a bounding box containing the visual answer to the given question. The images were obtained from a CC BY-licensed subset of the Microsoft Common Objects in Context dataset, MS COCO. All data labeling was performed on the Toloka crowdsourcing platform, https://toloka.ai/.

    Our dataset has 45,199 instances split among three subsets: train (38,990 instances), public test (1,705 instances), and private test (4,504 instances). The entire train dataset was available to everyone from the start of the challenge. The public test dataset was available from the evaluation phase of the competition, but without any ground-truth labels. After the end of the competition, the public and private sets were released.

    The datasets are provided as files in comma-separated values (CSV) format containing the following columns.

        Column      Type      Description
        image       string    URL of an image on a public content delivery network
        width       integer   image width
        height      integer   image height
        left        integer   bounding box coordinate: left
        top         integer   bounding box coordinate: top
        right       integer   bounding box coordinate: right
        bottom      integer   bounding box coordinate: bottom
        question    string    question in English

    This upload also contains a ZIP file with the images from MS COCO.
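    A minimal sketch of reading one instance and cropping its visual answer; the CSV filename is an assumption, and the columns are as in the table above:

    import urllib.request

    import pandas as pd
    from PIL import Image

    df = pd.read_csv("train.csv")  # hypothetical filename for the train subset
    row = df.iloc[0]

    # Fetch the image from its CDN URL and crop the ground-truth answer box.
    urllib.request.urlretrieve(row["image"], "example.jpg")
    img = Image.open("example.jpg")
    answer = img.crop((row["left"], row["top"], row["right"], row["bottom"]))
    print(row["question"], answer.size)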

  9. Coco Peat Growth Medium Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 2, 2025
    Cite
    Data Insights Market (2025). Coco Peat Growth Medium Report [Dataset]. https://www.datainsightsmarket.com/reports/coco-peat-growth-medium-279246
    Explore at:
    pdf, ppt, doc
    Dataset updated
    Jul 2, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global coco peat growth medium market is experiencing robust growth, driven by the increasing popularity of hydroponics and soilless cultivation techniques in both commercial and home gardening. The shift towards sustainable and eco-friendly agricultural practices further fuels market expansion, as coco peat offers a renewable and biodegradable alternative to traditional peat moss. An assumed Compound Annual Growth Rate (CAGR) of 7% over the period 2025-2033 indicates a significant upward trajectory. This growth is being propelled by several factors, including the rising demand for high-quality agricultural produce, the growing awareness of the environmental benefits of coco peat, and technological advancements in coco peat processing that enhance its quality and consistency. Major market players are focusing on product innovation and strategic partnerships to expand their market share. The increasing adoption of coco peat in various applications, such as horticulture, landscaping, and mushroom cultivation, also contributes to the market's overall expansion. Furthermore, government initiatives promoting sustainable agriculture and increasing investments in research and development are bolstering the market's growth potential.

    Despite this positive outlook, the market faces some challenges. Fluctuations in raw material prices, primarily coconut husks, can impact production costs and profitability. Regional variations in climate and weather patterns can affect the quality and availability of coco peat. Competition from other growth media, such as perlite and vermiculite, presents another challenge for coco peat manufacturers. However, the unique advantages of coco peat, such as its excellent water retention, aeration properties, and sustainable nature, are expected to maintain its competitive edge and continue to drive market growth in the coming years. The market segmentation, while not explicitly provided, is likely to include variations based on the grade of coco peat (e.g., fine, medium, coarse), application type (horticulture, landscaping), and packaging. The geographical distribution likely shows strong growth in regions with large agricultural sectors and a growing awareness of sustainable farming practices.

  10. Evaluation on central points of ROIs annotated by both a radiology expert...

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Sven Koitka; Aydin Demircioglu; Moon S. Kim; Christoph M. Friedrich; Felix Nensa (2023). Evaluation on central points of ROIs annotated by both a radiology expert and a non-expert. [Dataset]. http://doi.org/10.1371/journal.pone.0207496.t002
    Explore at:
    xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sven Koitka; Aydin Demircioglu; Moon S. Kim; Christoph M. Friedrich; Felix Nensa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Training was performed on the full set of 240 images, annotated by the non-expert, and evaluated on the held-out set of 89 images. Results are stated as mean and standard deviation of 10 runs.

  11. Aerial Water Buoys Dataset

    • zenodo.org
    • data.europa.eu
    zip
    Updated Mar 24, 2023
    Cite
    Antreas Anastasiou; Rafael Makrigiorgis; Panayiotis Kolios (2023). Aerial Water Buoys Dataset [Dataset]. http://doi.org/10.5281/zenodo.7755621
    Explore at:
    zip
    Dataset updated
    Mar 24, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Antreas Anastasiou; Rafael Makrigiorgis; Panayiotis Kolios
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aerial Water Buoys Dataset:

    Over the past few years, a plethora of advancements in Unmanned Aerial Vehicle (UAV) technologies have made advanced UAV-based search-and-rescue operations possible, with transformative impact on the outcome of critical life-saving missions. This dataset aims to help with the challenging task of multi-castaway tracking and following using a single UAV. Because capturing footage of people in the sea is difficult and raises data-protection issues, we captured a dataset of buoys in order to conduct experiments on multi-castaway tracking and following. A paper on the technical details and experiments of multi-castaway tracking and following, using this dataset, will be published soon.

    The dataset consists of top-view images of buoys captured at various altitudes along the coasts of Larnaca and Protaras in Cyprus. Different altitudes were used to challenge object detectors to find smaller objects, since a UAV tracking multiple targets must fly at a higher altitude. A single class, labeled 'buoy', is annotated on all images. Additionally, all annotations were converted into VOC and COCO formats for training in numerous frameworks. The dataset consists of the following images and detection objects (buoys):

    Subset      Images  Buoys
    Training    10814   14811
    Validation  1350    1865
    Testing     1352    1827

    It is advised to further enhance the dataset by probabilistically applying random augmentations to each image before adding it to a training batch. Possible transformations include geometric ones (rotations, translations, horizontal-axis mirroring, cropping, and zooming) as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing). A sketch of such a pipeline follows.
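    The sketch below uses albumentations, one of several libraries that transform bounding boxes together with the image; the specific transforms and probabilities are illustrative, not prescribed by the dataset authors:

    import albumentations as A

    transform = A.Compose(
        [
            # Geometric transformations
            A.HorizontalFlip(p=0.5),
            A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.2, rotate_limit=15, p=0.5),
            # Image manipulations
            A.RandomBrightnessContrast(p=0.3),
            A.Blur(blur_limit=3, p=0.2),
        ],
        bbox_params=A.BboxParams(format="coco", label_fields=["labels"]),
    )

    # image: HxWx3 numpy array; bboxes: COCO-style [x, y, w, h]; labels: ["buoy", ...]
    augmented = transform(image=image, bboxes=bboxes, labels=labels)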

    **NOTE** If you use this dataset in your research/publication please cite us using the following

    Antreas Anastasiou, Rafael Makrigiorgis, & Panayiotis Kolios. (2022). Aerial Water Buoys Dataset (1.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7288444

  12. Trojan Detection Software Challenge - object-detection-aug2022-holdout

    • data.nist.gov
    Updated Jul 24, 2022
    Cite
    Michael Majurski (2022). Trojan Detection Software Challenge - object-detection-aug2022-holdout [Dataset]. http://doi.org/10.18434/mds2-3927
    Explore at:
    Dataset updated
    Jul 24, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Michael Majurski
    License

    https://www.nist.gov/open/license

    Description

    Round 10 Holdout Dataset

    This is the holdout data used to evaluate trojan detection software solutions. This data, generated at NIST, consists of object detection AIs trained on the COCO dataset. A known percentage of these trained AI models have been poisoned with a known trigger that induces incorrect behavior. The data will be used to evaluate software solutions for detecting which trained AI models have been poisoned via embedded triggers. The dataset consists of 144 AI models using a small set of model architectures. Half (50%) of the models have been poisoned with an embedded trigger that causes misclassification of the input when the trigger is present.

  13. MItosis DOmain Generalization Challenge (MICCAI-MIDOG 2021) Training Data

    • zenodo.org
    • datasetcatalog.nlm.nih.gov
    json, tiff
    Updated Apr 1, 2021
    Cite
    Marc Aubreville; Christof A. Bertram; Nikolas Stathonikos; Mitko Veta; Taryn Donovan; Natalie ter Hoeve; Francesco Ciompi; Christian Marzahl; Frauke Wilm; Katharina Breininger; Andreas Maier; Robert Klopfleisch (2021). MItosis DOmain Generalization Challenge (MICCAI-MIDOG 2021) Training Data [Dataset]. http://doi.org/10.5281/zenodo.4643381
    Explore at:
    tiff, json
    Dataset updated
    Apr 1, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marc Aubreville; Christof A. Bertram; Nikolas Stathonikos; Mitko Veta; Taryn Donovan; Natalie ter Hoeve; Francesco Ciompi; Christian Marzahl; Frauke Wilm; Katharina Breininger; Andreas Maier; Robert Klopfleisch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present the training dataset of the MICCAI-MIDOG 2021 challenge. The task of the challenge is the generalization of the detection of mitotic figures to multiple microscopy whole slide image scanners.

    The data set consists of 200 cases of human breast cancer. Each image was cropped from a whole-slide image, with the region for cropping selected by a pathologist according to current guidelines. All images come from the same lab (UMC Utrecht) and thus have similar staining intensity, i.e. all visible differences in representation can be attributed to the different digital representations produced by the acquisition devices.

    • Images 001.tiff to 050.tiff were acquired with a Hamamatsu XR scanner.
    • Images 051.tiff to 100.tiff were acquired with a Hamamatsu S360 scanner.
    • Images 101.tiff to 150.tiff were acquired with an Aperio CS2 scanner.
    • Images 151.tiff to 200.tiff were acquired with a Leica GT450 scanner.

    The cases of all scanners represent a similar distribution of tumor grades. The complete collection of cases represents consecutive cases from the archive that were qualified according to inclusion criteria.

    The file MIDOG.json contains all annotations in MS COCO format. Please note that we included not only mitotic figure annotations but also annotations where experts disagreed or that qualify as hard examples for machine learning.
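    Because the annotations are standard MS COCO JSON, they can be inspected with the Python standard library alone; a minimal sketch:

    import json
    from collections import Counter

    with open("MIDOG.json") as f:
        coco = json.load(f)

    # Count annotations per category, e.g. mitotic figures vs. hard examples.
    names = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    for cat_id, n in counts.most_common():
        print(names[cat_id], n)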

    The challenge description can be found at http://doi.org/10.5281/zenodo.4573978.

  14. Nata De Coco Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Jul 15, 2025
    Cite
    Market Report Analytics (2025). Nata De Coco Report [Dataset]. https://www.marketreportanalytics.com/reports/nata-de-coco-265324
    Explore at:
    pdf, ppt, doc
    Dataset updated
    Jul 15, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global nata de coco market is experiencing robust growth, driven by increasing consumer demand for healthy and functional food ingredients. The market, estimated at $500 million in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 7% from 2025 to 2033, reaching an estimated value of approximately $900 million by 2033. This growth is fueled by several key factors. Firstly, the rising popularity of vegan and vegetarian diets is boosting demand for nata de coco as a plant-based alternative to dairy products. Its use as a thickening agent and texture enhancer in desserts, beverages, and other food items is further driving market expansion. Secondly, increasing awareness of the health benefits associated with nata de coco, such as its high fiber content and potential probiotic properties, is contributing to its consumption. Finally, the growing food and beverage industry and the increasing innovation in product applications are contributing to the overall market expansion. Key players like Happy Alliance, Schmecken Agro Food Products, Hainan Yeguo Foods, Siva Foods, Shireli Manufacturing, and HTK Foods are actively participating in market expansion through product diversification, strategic partnerships and geographical expansion.

    However, the market faces certain restraints. Fluctuations in raw material prices (coconut water) can impact production costs and profitability. Furthermore, the shelf life of nata de coco can be a challenge, requiring efficient supply chain management and preservation techniques. Despite these challenges, the overall market outlook remains positive, driven by the aforementioned growth drivers and the potential for further innovation in product development and application across various sectors. The market segmentation will likely be driven by product type (e.g., canned, pouches), application (e.g., desserts, beverages) and geographic location, with Asia-Pacific expected to maintain a leading market share due to its high consumption of coconut-based products.

  15. TIGER training dataset (ROI-level annotations of WSIROIS subset)

    • data.europa.eu
    unknown
    Updated Feb 7, 2022
    Cite
    Zenodo (2022). TIGER training dataset (ROI-level annotations of WSIROIS subset) [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-6014422?locale=de
    Explore at:
    unknown (549)
    Dataset updated
    Feb 7, 2022
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    Description

    This dataset contains data and ROI-level annotations of the WSIROIS subset of the TIGER training dataset, released in conjunction with the TIGER challenge. Note that the WSIROIS dataset with whole-slide image-level annotations can be downloaded via the Data section of the TIGER challenge, together with the two additional subsets released with the challenge, namely WSIBULK and WSITILS.

    The data is derived from digital pathology images of Her2-positive (Her2+) and Triple Negative (TNBC) breast cancer whole-slide images, together with manual annotations, and comes from multiple sources: a subset of Her2+ and TNBC cases provided by the Radboud University Medical Center (RUMC) (Nijmegen, Netherlands); a subset of Her2+ and TNBC cases provided by the Jules Bordet Institute (JB) (Brussels, Belgium); and a third subset of TNBC cases only, derived from the TCGA-BRCA archive obtained from the Genomic Data Commons Data Portal.

    This ROI-level annotation release of WSIROIS uses a format fully compatible with the segmentation and detection pipelines used in the computer vision community: regions of interest and manual annotations are released in PNG format, and cell locations as bounding boxes in COCO format. In this way, we hope to make TIGER accessible to people who do not have experience with whole-slide images but still want to participate and contribute to this project.

    In this set, we release regions of interest from n=195 whole-slide images of breast cancer, both (core-needle) biopsies and surgical resections, with regions of interest (ROI) selected and manually annotated. All data (both images and manual annotations) are released at 0.5 um/px magnification. This dataset contains images and annotations from multiple sources:

    • TCGA: regions of interest cropped from n=151 WSIs of TNBC cases from the TCGA-BRCA archive (the original slides can also be downloaded from the GDC Data Portal). Annotations are extracted and adapted from the publicly available BCSS and NuCLS datasets.
    • RUMC: regions of interest cropped from n=26 WSIs of TNBC and Her2+ cases from Radboud University Medical Center (Netherlands). Annotations were made by a panel of board-certified breast pathologists.
    • JB: regions of interest cropped from n=18 WSIs of TNBC and Her2+ cases from the Jules Bordet Institute (Belgium). Annotations were made by a panel of board-certified breast pathologists.

    In this dataset, we release ROI-level annotations of both tissue compartments and cells. ROI images are released in PNG format; cell annotations are released as bounding boxes in the standard COCO format for object detection; tissue compartment annotations are released as PNG images containing pixel-wise class labels. In each image file, the coordinates of the region of interest in the WSI are encoded in the filename as imagefilename_[x1,y1,x2,y2].png, where (x1,y1) is the top-left corner and (x2,y2) the bottom-right corner of the ROI. Check the Data section of the TIGER challenge for additional information about this dataset.
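    A small sketch of recovering the WSI coordinates from a ROI filename; the bracket pattern is as described above, and the example filename is made up:

    import re

    def roi_coords(filename):
        """Parse imagefilename_[x1,y1,x2,y2].png into corner coordinates."""
        x1, y1, x2, y2 = map(int, re.search(r"\[(\d+),(\d+),(\d+),(\d+)\]", filename).groups())
        return (x1, y1), (x2, y2)

    top_left, bottom_right = roi_coords("case123_[1024,2048,3072,4096].png")  # hypothetical name
    print(top_left, bottom_right)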

  16. Style Transfer for Object Detection in Art

    • kaggle.com
    zip
    Updated Mar 11, 2021
    Cite
    David Kadish (2021). Style Transfer for Object Detection in Art [Dataset]. https://www.kaggle.com/davidkadish/style-transfer-for-object-detection-in-art
    Explore at:
    zip (3762347804 bytes)
    Dataset updated
    Mar 11, 2021
    Authors
    David Kadish
    Description

    Context

    Despite recent advances in object detection using deep neural networks, these networks still struggle to identify objects in art images such as paintings and drawings. This challenge, known as the cross-depiction problem, stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects - specifically people - in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which was then tested on the existing People-Art test dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.

    Content

    • 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth: trained object detection network (Faster R-CNN with a ResNet152 backbone pretrained on ImageNet) for use with PyTorch
    • PeopleArt-Coco.tar.xz: People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt)
    • style-coco.tar.xz: stylized COCO dataset containing only the person category; used to train the network above

    Code

    The code is available on github at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art
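    A hedged sketch of loading the checkpoint with torchvision: the constructor arguments must match the authors' training code linked above, num_classes=2 (person + background) is an assumption, the backbone-builder signature varies across torchvision versions, and the file may store a full model object rather than a state dict.

    import torch
    from torchvision.models.detection import FasterRCNN
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

    # Rebuild the architecture described above: Faster R-CNN, ResNet152 backbone.
    backbone = resnet_fpn_backbone("resnet152", pretrained=False)
    model = FasterRCNN(backbone, num_classes=2)  # assumption: person + background

    state = torch.load("2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth",
                       map_location="cpu")
    model.load_state_dict(state)  # adjust if the file stores a full model instead
    model.eval()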

    Citing

    If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):

    D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.

  17. SubPipe: A Submarine Pipeline Inspection Dataset for Segmentation and...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 5, 2024
    Cite
    Álvarez-Tuñón, Olaya; Ribeiro Marnet, Luiza; Antal, László; Aubard, Martin; Costa, Maria; Brodskiy, Yury (2024). SubPipe: A Submarine Pipeline Inspection Dataset for Segmentation and Visual-inertial Localization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10053564
    Explore at:
    Dataset updated
    Jul 5, 2024
    Dataset provided by
    RWTH Aachen University
    EIVA a/s
    OceanScan Marine Systems & Technology
    Aarhus University
    Authors
    Álvarez-Tuñón, Olaya; Ribeiro Marnet, Luiza; Antal, László; Aubard, Martin; Costa, Maria; Brodskiy, Yury
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This paper presents SubPipe, an underwater dataset for SLAM, object detection, and image segmentation. SubPipe has been recorded using a lightweight autonomous underwater vehicle (LAUV), operated by OceanScan MST and carrying a sensor suite including two cameras, a side-scan sonar, and an inertial navigation system, among other sensors. The AUV was deployed in a pipeline inspection environment with a submarine pipe partially covered by sand. The AUV's pose ground truth is estimated from the navigation sensors. The side-scan sonar and RGB images include object detection and segmentation annotations, respectively. State-of-the-art segmentation, object detection, and SLAM methods are benchmarked on SubPipe to demonstrate the dataset's challenges and opportunities for leveraging computer vision algorithms. To the authors' knowledge, this is the first annotated underwater dataset providing a real pipeline inspection scenario. The dataset and experiments are publicly available online.

    On Zenodo we provide three versions of SubPipe: the full version (SubPipe.zip, ~80 GB unzipped) and two subsamples (SubPipeMini.zip, ~12 GB unzipped; SubPipeMini2.zip, ~16 GB unzipped). Both subsamples contain only parts of the entire dataset. SubPipeMini is a subset containing the semantic segmentation data, along with camera imagery of the underwater pipeline. SubPipeMini2, on the other hand, focuses mainly on underwater side-scan sonar images of the seabed, including ground-truth object detection bounding boxes for the pipeline.

    For (re-)using/publishing SubPipe, please include the following copyright text:

    SubPipe is a public dataset of a submarine outfall pipeline, property of Oceanscan-MST. This dataset was acquired with a Light Autonomous Underwater Vehicle by Oceanscan-MST, within the scope of Challenge Camp 1 of the H2020 REMARO project.

    More information about OceanScan-MST can be found at this link.

    Cam0 — GoPro Hero 10

    Camera parameters:

    • Resolution: 1520×2704
    • fx = 1612.36, fy = 1622.56
    • cx = 1365.43, cy = 741.27
    • k1, k2, p1, p2 = [−0.247, 0.0869, −0.006, 0.001]
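    These intrinsics drop straight into the standard OpenCV pinhole model; a minimal sketch (the frame filename is hypothetical):

    import cv2
    import numpy as np

    # Camera matrix and distortion coefficients from the Cam0 parameters above.
    K = np.array([[1612.36, 0.0, 1365.43],
                  [0.0, 1622.56, 741.27],
                  [0.0, 0.0, 1.0]])
    dist = np.array([-0.247, 0.0869, -0.006, 0.001])  # k1, k2, p1, p2

    img = cv2.imread("cam0_frame.png")
    undistorted = cv2.undistort(img, K, dist)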

    Side-scan Sonars

    Each sonar image was created after 20 "pings" (after every 20 new lines), which corresponds to approximately 1 image per second.

    For the object detection annotations, we provide both COCO and YOLO formats. A single COCO annotation file is provided per chunk and per frequency (low frequency vs. high frequency), whereas YOLO annotations are provided per SSS image file.

    Metadata about the side-scan sonar images contained in this dataset:

    Images for object detection:

    • Low Frequency (LF): 5000 images, 2500 × 500 px
    • High Frequency (HF): 5030 images, 5000 × 500 px
    • Total number of images: 10030

    Annotations:

    • Low Frequency: 3163 annotations
    • High Frequency: 3172 annotations
    • Total number of annotations: 6335

  18. Odeuropa Dataset of Smell-Related Objects

    • data.europa.eu
    unknown
    Updated Jul 3, 2025
    Cite
    Zenodo (2025). Odeuropa Dataset of Smell-Related Objects [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-6366362?locale=lt
    Explore at:
    unknown (816875)
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Odeuropa Dataset of Olfactory Objects

    This dataset is released as part of the Odeuropa project. The annotations are identical to the training set of the ICPR2022-ODOR Challenge. It contains bounding-box annotations for smell-active objects in historical artworks gathered from various digital collections. The smell-active objects annotated in the dataset either carry smells themselves or hint at the presence of smells. The dataset provides 15484 bounding boxes on 2116 artworks in 87 object categories. An additional CSV file contains further image-level metadata such as artist, collection, or year of creation.

    How to use

    Due to licensing issues, we cannot provide the images directly; instead we provide a collection of links and a download script. To get the images, just run the download_imgs.py script, which loads the images using the links from the metadata.csv file. The downloaded images can then be found in the images subfolder; their overall size is c. 200MB. The bounding-box annotations can be found in annotations.json and follow the COCO JSON format (the definition is available here). The mapping between the images array of annotations.json and the metadata.csv file can be accomplished via the file_name attribute of the elements of the images array and the unique File Name column of metadata.csv, respectively. Additional image-level metadata is available in metadata.csv.
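    A minimal sketch of that mapping, using the filenames described above:

    import json

    import pandas as pd

    with open("annotations.json") as f:
        coco = json.load(f)

    # Join the image-level metadata onto the COCO images array via the file name.
    images = pd.DataFrame(coco["images"])
    meta = pd.read_csv("metadata.csv")
    joined = images.merge(meta, left_on="file_name", right_on="File Name")
    print(joined.columns.tolist())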

  19. OpenLORIS-Object Dataset

    • lifelong-robotic-vision.github.io
    Updated May 2, 2019
    Cite
    Qi She (2019). OpenLORIS-Object Dataset [Dataset]. https://lifelong-robotic-vision.github.io
    Explore at:
    Dataset updated
    May 2, 2019
    Dataset provided by
    Tsinghua University
    City University of Hong Kong
    Authors
    Qi She
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The recent breakthroughs in computer vision have benefited from the availability of large representative datasets (e.g. ImageNet and COCO) for training. Yet robotic vision poses unique challenges for applying visual algorithms developed on these standard datasets, because those algorithms implicitly assume a non-varying distribution over a fixed set of tasks. Fully retraining models each time a new task becomes available is infeasible due to computational, storage and sometimes privacy issues, while naive incremental strategies have been shown to suffer from catastrophic forgetting. It is crucial for robots to operate continuously under open-set and detrimental conditions with adaptive visual perceptual systems, where lifelong learning is a fundamental capability. However, very few datasets and benchmarks are available to evaluate and compare emerging techniques. To fill this gap, we provide a new lifelong robotic vision dataset ("OpenLORIS-Object") collected via RGB-D cameras. The dataset embeds the challenges faced by a robot in real-life applications and provides new benchmarks for validating lifelong object recognition algorithms. It can support object classification, detection and segmentation.

    The first version of OpenLORIS-Object is a collection of 121 instances covering 40 categories of daily-necessity objects under 20 scenes. For each instance, a 17-to-25-second video (at 30 fps) was recorded with a depth camera, delivering around 500 to 750 frames (260 to 600 distinguishable object views were manually picked and provided in the dataset). Four environmental factors, each with three levels of change, are considered explicitly: illumination variation during recording, occlusion percentage of the objects, object pixel size in each frame, and scene clutter. Note that object size and camera-object distance are combined into the pixel-size factor because, in real-world data collected from mobile robots, it is hard to disentangle their individual effects, although their joint effect on the objects' actual pixel sizes in the frames can be identified roughly. The different recorded views of the objects are treated as a further variable. The three difficulty levels defined for each factor are shown in Table II of the dataset paper (12 levels in total across the environmental factors); levels 1, 2, and 3 are ranked with increasing difficulty.

  20. TreeAI Global Initiative - Advancing tree species identification from aerial...

    • zenodo.org
    Updated Aug 8, 2025
    Cite
    Mirela Beloiu Schwenke; Zhongyu Xia; Iaroslava Novoselova; Arthur Gessler; Teja Kattenborn; Clemens Mosig; Stefano Puliti; Lars Waser; Nataliia Rehush; Yan Cheng; Liang Xinliang; Verena C. Griess; Martin Mokroš (2025). TreeAI Global Initiative - Advancing tree species identification from aerial images with deep learning [Dataset]. http://doi.org/10.5281/zenodo.15351054
    Explore at:
    Dataset updated
    Aug 8, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mirela Beloiu Schwenke; Zhongyu Xia; Iaroslava Novoselova; Arthur Gessler; Teja Kattenborn; Clemens Mosig; Stefano Puliti; Lars Waser; Nataliia Rehush; Yan Cheng; Liang Xinliang; Verena C. Griess; Martin Mokroš
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Time period covered
    May 8, 2025
    Description

    TreeAI - Advancing Tree Species Identification from Aerial Images with Deep Learning

    Data Structure for the TreeAI Database Used in the TreeAI4Species Competition

    The dataset is organized into two distinct challenges: Object Detection and Semantic Segmentation. Below is a more detailed description of the data for each challenge:

    Object detection

    • The data are in the COCO format; each folder contains training and validation subfolders with images and labels carrying the tree species ID.
    • Tree species: 61 tree species (classes).
    • Training: Images (.png) and Labels (.txt)
    • Validation: Images (.png) and Labels (.txt)
    • Images: RGB bands, 8-bit. Further details (spatial resolution, labels, etc.) are given in Table 1.
    • Labels: prepared for object detection tasks. The number of classes varies per dataset, e.g. dataset 12_RGB_all_L has 53 classes, but species IDs are standardized across most datasets (except for 0_RGB_fL). The Latin name of the species is given for each class ID in the file classDatasetName.xlsx.
    • Species class: the Excel file "classDatasetName.xlsx" contains the columns Species_ID (Sp_ID), Labels (number of labels for training and validation), and Species_Class (Latin name of the species); see the loading sketch after this list.
    • Masked images: the dataset with partial labels was masked, i.e. a buffer of 30 pixels (1.5 m) was created around each label and the image was masked based on these buffers. The masked images are stored in the images_masked folder within the training and validation subsets, e.g. 34_RGB_ObjDet_640_pL_b\train\images_masked.
    • Additional filters to clean up the data:
      • Labels at the edge: only images with labels at the edge were removed.
      • Valid labels: images whose labels lie completely within the image were retained.
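    A small sketch of loading the species mapping; column names are as listed above, the sheet layout is an assumption, and reading .xlsx requires openpyxl:

    import pandas as pd

    classes = pd.read_excel("classDatasetName.xlsx")  # per-dataset mapping file
    # Map numeric species IDs to Latin names.
    id_to_species = dict(zip(classes["Sp_ID"], classes["Species_Class"]))
    print(len(id_to_species), "species classes")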

    Object detection dataset

    Table 1. Description of the datasets for object detection included in the TreeAI database. Res. = spatial resolution.

    a) Fully labeled images (i.e. the image has all the trees delineated and each polygon has species information)

    b) Partially labeled images (i.e. the image has only some trees delineated, and each polygon has species information)

    No.  Dataset name           Res. (cm)  Training images  Validation images  Training labels  Validation labels  Fully labeled  Partially labeled
    1    12_RGB_ObjDet_640_fL   5          1061             303                53910            14323              x
    2    0_RGB_fL               3          422              84                 51500            11137              x
    3    34_RGB_ObjDet_640_pLa  5          946              271                4249             1214                              x
    4    34_RGB_ObjDet_640_pLb  5          354              101                1887             581                               x
    5    5_RGB_S_320_pL         10         8889             2688               19561            5915                              x

    Semantic segmentation dataset

    • Each folder contains training and validation subfolders with images and corresponding segmentation masks, where each pixel is assigned to a specific class.
    • Tree species: 61 tree species (classes).
    • Training: Images (.png) and Labels (.png)
    • Validation: Images (.png) and Labels (.png)
    • Images: RGB bands, 8-bit, 5 cm spatial resolution. Further details are given in Table 2.
    • Labels: prepared for the semantic segmentation task. The number of classes varies per dataset, e.g. dataset
