License: https://choosealicense.com/licenses/cc/
Dataset Card for paper-parts
**The original COCO dataset is stored at dataset.tar.gz**
Dataset Summary
paper-parts
Supported Tasks and Leaderboards
object-detection: The dataset can be used to train a model for Object Detection.
Languages
English
Dataset Structure
Data Instances
A data point comprises an image and its object annotations. { 'image_id': 15, 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB… See the full description on the dataset page: https://huggingface.co/datasets/Francesco/paper-parts.
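A minimal loading sketch, assuming the standard Hugging Face `datasets` API; the repository ID comes from the page URL above, and the split name is an assumption:

```python
from datasets import load_dataset

# Repository ID taken from the dataset page URL; the split name is an assumption.
ds = load_dataset("Francesco/paper-parts", split="train")

example = ds[0]
print(example["image_id"], example["image"].size)
```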
This dataset is part of the paper "Zero-shot Instance Segmentation (ZSI)" (paper link: ZSI).
The dataset consists of the following items:
1. MS COCO 2014 (training and validation images only)
2. MS COCO 2014 annotations (the special split annotations described in the paper)
I uploaded the dataset to run the implementation of this paper. The GitHub code of the paper: ZSI
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Coco Rock Paper Scissor is a dataset for object detection tasks - it contains Hands annotations for 806 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
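A download sketch using the Roboflow Python package; the API key, workspace, and project identifiers below are placeholders, not values taken from this dataset's page:

```python
from roboflow import Roboflow

# All identifiers here are placeholders -- substitute your own API key and
# the workspace/project slugs shown on the dataset's Roboflow page.
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("coco-rock-paper-scissor")
dataset = project.version(1).download("coco")  # download in COCO format
print(dataset.location)
```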
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
A collection of 3 referring expression datasets based on images in the COCO dataset. A referring expression is a piece of text that describes a unique object in an image. These datasets are collected by asking human raters to disambiguate objects delineated by bounding boxes in the COCO dataset.
RefCoco and RefCoco+ are from Kazemzadeh et al. 2014. RefCoco+ expressions are strictly appearance-based descriptions, which the authors enforced by preventing raters from using location-based descriptions (e.g., "person to the right" is not a valid description for RefCoco+). RefCocoG is from Mao et al. 2016 and has richer descriptions of objects than RefCoco due to differences in the annotation process. In particular, RefCoco was collected in an interactive game-based setting, while RefCocoG was collected in a non-interactive setting. On average, RefCocoG has 8.4 words per expression while RefCoco has 3.5 words.
Each dataset has different split allocations that are typically all reported in papers. The "testA" and "testB" sets in RefCoco and RefCoco+ contain only people and only non-people respectively. Images are partitioned into the various splits. In the "google" split, objects, not images, are partitioned between the train and non-train splits. This means that the same image can appear in both the train and validation split, but the objects being referred to in the image will be different between the two sets. In contrast, the "unc" and "umd" splits partition images between the train, validation, and test split. In RefCocoG, the "google" split does not have a canonical test set, and the validation set is typically reported in papers as "val*".
Stats for each dataset and split ("refs" is the number of referring expressions, and "images" is the number of images):
| dataset | partition | split | refs | images |
|---|---|---|---|---|
| refcoco | google | train | 40000 | 19213 |
| refcoco | google | val | 5000 | 4559 |
| refcoco | google | test | 5000 | 4527 |
| refcoco | unc | train | 42404 | 16994 |
| refcoco | unc | val | 3811 | 1500 |
| refcoco | unc | testA | 1975 | 750 |
| refcoco | unc | testB | 1810 | 750 |
| refcoco+ | unc | train | 42278 | 16992 |
| refcoco+ | unc | val | 3805 | 1500 |
| refcoco+ | unc | testA | 1975 | 750 |
| refcoco+ | unc | testB | 1798 | 750 |
| refcocog | google | train | 44822 | 24698 |
| refcocog | google | val | 5000 | 4650 |
| refcocog | umd | train | 42226 | 21899 |
| refcocog | umd | val | 2573 | 1300 |
| refcocog | umd | test | 5023 | 2600 |
To use this dataset:

```python
import tensorflow_datasets as tfds

ds = tfds.load('ref_coco', split='train')
for ex in ds.take(4):
    print(ex)
```
See the guide for more information on tensorflow_datasets.
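The partitions in the table above correspond to builder configs. A minimal sketch for loading one of them explicitly; the config name `refcoco_unc` is assumed from the TFDS catalog naming, and the feature structure is inspected rather than assumed:

```python
import tensorflow_datasets as tfds

# Load the UNC partition of RefCoco and its people-only test split ('testA',
# per the split table above). Config name assumed from the TFDS catalog.
ds = tfds.load('ref_coco/refcoco_unc', split='testA')

for ex in ds.take(1):
    # Print the feature keys rather than assuming their exact names.
    print(sorted(ex.keys()))
```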
<img src="https://storage.googleapis.com/tfds-data/visualization/fig/ref_coco-refcoco_unc-1.1.0.png" alt="Visualization" width="500px">
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
MTTN (read: mutton) is a dataset aimed at text-to-text generation, with a focus on diffusion-model prompts. A model trained on this data will be able to fill gaps and create natural prompts that can be used to generate images directly.
MTTN is derived from several popular text datasets, including MS-COCO and Flickr.
The increased interest in diffusion models has opened up opportunities for advancements in generative text modeling. These models can produce impressive images when given a well-crafted prompt, but creating a powerful or meaningful prompt can be hit-or-miss. To address this, we have created a large-scale dataset that is derived and synthesized from real prompts and indexed with popular image-text datasets such as MS-COCO and Flickr. We have also implemented stages that gradually reduce context and increase complexity, which will further enhance the output due to the complex annotations created. The dataset, called MTTN, includes over 2.4 million sentences divided into 5 stages, resulting in a total of over 12 million pairs, and a vocabulary of over 300,000 unique words, providing ample variation. The original 2.4 million pairs are designed to reflect the way language is used on the internet globally, making the dataset more robust for any model trained on it.
To form MTTN, the data was cleaned of any trailing ASCII values of special characters, after which emojis were removed. Finally, the dataset was stripped step by step until only the subjects and objects of the sentences remained.
The MTTN paper can be accessed from here; see also GitHub and Papers With Code.
All the subsets are available in JSON format and can be loaded in Python as follows:

```python
import pandas as pd

df = pd.read_json('downloaded_json_file_path', orient='split', compression='infer')
```
@misc{https://doi.org/10.48550/arxiv.2301.10172,
  doi = {10.48550/ARXIV.2301.10172},
  url = {https://arxiv.org/abs/2301.10172},
  author = {Ghosh, Archan and Ghosh, Debgandhar and Maji, Madhurima and Chanda, Suchinta and Goswami, Kalporup},
  keywords = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences},
  title = {MTTN: Multi-Pair Text to Text Narratives for Prompt Generation},
  publisher = {arXiv},
  year = {2023},
  copyright = {Creative Commons Attribution Share Alike 4.0 International}
}
This dataset contains preprocessed annotations for the IP102 Insect Pest Recognition Dataset converted to COCO format, making it ready for object detection models like DETR, Faster R-CNN, YOLO, and other modern detectors.
IP102 is a large-scale benchmark dataset for insect pest recognition containing:
- 75,222 images of insect pests
- 102 categories of agricultural pests
- Images collected from real agricultural scenarios
This dataset provides:
- train_annotations.json - Training set annotations in COCO format
- val_annotations.json - Validation set annotations in COCO format
- test_annotations.json (optional) - Test set annotations
Annotations follow the standard COCO Object Detection format:
```json
{
"images": [
{
"id": 1,
"file_name": "image_001.jpg",
"width": 640,
"height": 480
}
],
"annotations": [
{
"id": 1,
"image_id": 1,
"category_id": 5,
"bbox": [x, y, width, height],
"area": 12345,
"iscrowd": 0
}
],
"categories": [
{
"id": 1,
"name": "rice_leaf_roller",
"supercategory": "insect"
}
]
}
```
```python
import json
from pycocotools.coco import COCO

# Load annotations directly from JSON
with open('/kaggle/input/ip102-coco-annotations/train_annotations.json') as f:
    coco_data = json.load(f)

# Or use the COCO API
coco = COCO('/kaggle/input/ip102-coco-annotations/train_annotations.json')

print(f"Number of images: {len(coco_data['images'])}")
print(f"Number of annotations: {len(coco_data['annotations'])}")
print(f"Number of categories: {len(coco_data['categories'])}")
```
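A short sketch of browsing individual annotations with the COCO API; it reuses the same annotation file path as above, and category names come from the annotation file itself:

```python
from pycocotools.coco import COCO

coco = COCO('/kaggle/input/ip102-coco-annotations/train_annotations.json')

# Pick the first image and print its annotations with category names.
img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=[img_id])
for ann in coco.loadAnns(ann_ids):
    cat = coco.loadCats([ann['category_id']])[0]
    x, y, w, h = ann['bbox']
    print(f"{cat['name']}: bbox=({x:.0f}, {y:.0f}, {w:.0f}, {h:.0f})")
```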
If you use this dataset, please cite the original IP102 paper:
@article{wu2019ip102,
title={IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition},
author={Wu, Xiaoping and Zhan, Chi and Lai, Yu-Kun and Cheng, Ming-Ming and Yang, Jufeng},
journal={CVPR},
year={2019}
}
Original dataset by Wu et al. (CVPR 2019). This is a format conversion for easier integration with modern detection frameworks.
Ready to train your insect detection model! 🐛🔍
Tags: object detection, computer vision, agriculture, coco format, insect recognition, pest detection, deep learning, detr, dataset, annotation
License: CC BY-NC-SA 4.0 (same as original IP102)
Database: Open Database, Contents: © Original Authors
COCO 2014 DensePose Relabeling with Body Parts
This dataset is formatted for Ultralytics YOLO and is ready for training. Important: update the paths in the YAML file inside the dataset folder.
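A minimal training sketch with the Ultralytics API, assuming the dataset's YAML has already been edited as noted above; the YAML path and the choice of base model are placeholders:

```python
from ultralytics import YOLO

# Base model and YAML path are placeholders -- point them at your local copies.
model = YOLO("yolov8n.pt")
model.train(data="path/to/dataset/data.yaml", epochs=100, imgsz=640)
```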
Demo
Here is what inference looks like:
Based on:
GitHub Repository Paper
Classes:
{ 1: "Person", 2: "Torso", 3: "Hand", 4: "Foot", 5: "Upper Leg", 6: "Lower Leg", 7: "Upper Arm", 8: "Lower Arm", 9: "Head" }… See the full description on the dataset page: https://huggingface.co/datasets/Xuban/coco_body_part.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
ComCo & SimCo Datasets
🔗 GitHub Project Page | 📄 arXiv Paper
Overview
This repository contains two datasets, ComCo and SimCo, designed for evaluating multi-object representation in Vision-Language Models (VLMs). These datasets provide controlled environments for analyzing model biases, object recognition, and compositionality in multi-object scenarios.
ComCo: Composed of real-world objects derived from the COCO dataset. SimCo: Contains simple geometric shapes in… See the full description on the dataset page: https://huggingface.co/datasets/clip-oscope/simco-comco.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description from the SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery GitHub repository (* the "Note" was added by the Roboflow team).
This is a single-class dataset consisting of tiles of satellite imagery labeled with potential 'targets'. Labelers were instructed to draw boxes around anything they suspected may be a paraglider wing, missing in a remote area of Nevada. Volunteers were shown examples of similar objects already in the environment for comparison. The missing wing, as it was found after 3 weeks, is shown below.
<img src="https://michaeltpublic.s3.amazonaws.com/images/anomaly_small.jpg" alt="anomaly">
The dataset contains the following:
| Set | Images | Annotations |
|---|---|---|
| Train | 1808 | 3048 |
| Validate | 490 | 747 |
| Test | 254 | 411 |
| Total | 2552 | 4206 |
The data is in the COCO format and is directly compatible with Faster R-CNN as implemented in Facebook's Detectron2 (see the registration sketch after the download steps below).
Download the data here: sarnet.zip
Or follow these steps
```bash
# download the dataset
wget https://michaeltpublic.s3.amazonaws.com/sarnet.zip

# extract the files
unzip sarnet.zip
```
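Once extracted, the COCO-format annotations can be registered with Detectron2. A minimal sketch, with hypothetical file and folder names (adjust them to the actual layout of sarnet.zip):

```python
from detectron2.data.datasets import register_coco_instances

# Paths below are assumptions about the archive layout, not verified names.
register_coco_instances(
    "sarnet_train",                    # dataset name to use in the Detectron2 config
    {},                                # no extra metadata
    "sarnet/annotations/train.json",   # assumed COCO-format annotation file
    "sarnet/images/train",             # assumed image folder
)
```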
**Note:** with Roboflow, you can download the data here (original, raw images, with annotations): https://universe.roboflow.com/roboflow-public/sarnet-search-and-rescue/ (download v1, original_raw-images). Download the dataset in COCO JSON format, or another format of your choice, and import it into Roboflow after unzipping the folder to get started on your project.
Get started with a Faster R-CNN model pretrained on SaRNet: SaRNet_Demo.ipynb
Source code for the paper is located here: SaRNet_train_test.ipynb
@misc{thoreau2021sarnet,
title={SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery},
author={Michael Thoreau and Frazer Wilson},
year={2021},
eprint={2107.12469},
archivePrefix={arXiv},
primaryClass={eess.IV}
}
The source data was generously provided by Planet Labs, Airbus Defence and Space, and Maxar Technologies.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
TF-ID arXiv papers dataset
This is the dataset for finetuning TF-ID models. It contains about 4,600 images (academic paper pages) with bounding boxes of tables and figures in COCO format. The papers are selected from Hugging Face Daily Papers, covering mostly AI/ML/DL-related topics. You can use this dataset to reproduce all TF-ID models. All bounding boxes were annotated manually by Yifei Hu.
Project Repo
github.com/ai8hyf/TF-ID
Variants
Unzip the… See the full description on the dataset page: https://huggingface.co/datasets/yifeihu/TF-ID-arxiv-papers.
Digital image forensics has gained a lot of attention as it is becoming easier for anyone to make forged images. Several areas are affected by image manipulation: a doctored image can increase the credibility of fake news, and impostors can use morphed images to pretend to be someone else.
It has become critically important to be able to recognize the manipulations an image has undergone. To do this, the first requirement is reliable and controlled datasets representing the most characteristic cases encountered. The purpose of this work is to lay the foundations of a body of tests allowing both the qualification of automatic methods for authentication and manipulation detection and the training of these methods.
This dataset contains about 25,000 object-removal forgeries, available under the inpainting directory. Each object removal is accompanied by two binary masks: one under the probe_mask subdirectory indicates the location of the forgery, and one under the inpaint_mask subdirectory is the mask used by the inpainting algorithm.
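A minimal sketch of pairing a forgery with its probe mask; the file names below are hypothetical and only illustrate the directory structure described above:

```python
from pathlib import Path
from PIL import Image

# Hypothetical file names -- only the inpainting/ and probe_mask/ layout
# comes from the dataset description above.
forgery_path = Path("inpainting/img/example_forgery.tif")
mask_path = Path("inpainting/probe_mask/example_forgery.tif")

forgery = Image.open(forgery_path)
mask = Image.open(mask_path).convert("L")  # binary mask marking the forged region
print(forgery.size, mask.size)
```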
If you use this dataset for your research, please refer to the original paper :
@INPROCEEDINGS{DEFACTODataset,
  author = {Gaël Mahfoudi and Badr Tajini and Florent Retraint and Fr{\'e}d{\'e}ric Morain-Nicolier and Jean Luc Dugelay and Marc Pic},
  title = {{DEFACTO:} Image and Face Manipulation Dataset},
  booktitle = {27th European Signal Processing Conference (EUSIPCO 2019)},
  address = {A Coruña, Spain},
  days = 1,
  month = sep,
  year = 2019
}
and to the MSCOCO dataset
The DEFACTO Consortium does not own the copyright of those images. Please refer to the MSCOCO terms of use for all images based on their Dataset.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This repository stores the COCO version of the DWTAL (Deformable Wireframe) dataset described in the paper at https://arxiv.org/abs/2504.20682. DWTAL-s is a smaller-scale version in which the deformation table is simpler, while DWTAL-l is a larger-scale version in which the deformation table is more complex. The code from the paper is open source at https://github.com/justliulong/OGHFYOLO.
Digital image forensics has gained a lot of attention as it is becoming easier for anyone to make forged images. Several areas are affected by image manipulation: a doctored image can increase the credibility of fake news, and impostors can use morphed images to pretend to be someone else.
It has become critically important to be able to recognize the manipulations an image has undergone. To do this, the first requirement is reliable and controlled datasets representing the most characteristic cases encountered. The purpose of this work is to lay the foundations of a body of tests allowing both the qualification of automatic methods for authentication and manipulation detection and the training of these methods.
This dataset contains about 40,000 face-morphing and 40,000 face-swapping forgeries, available under the face morphing directory. Each face morphing and swapping is accompanied by two binary masks: one under the probe_mask subdirectory indicates the location of the forgery, and one under the donor_mask subdirectory indicates the location of the source. The external image can be found in the JSON file under the graph subdirectory.
If you use this dataset for your research, please refer to the original paper :
@INPROCEEDINGS{DEFACTODataset,
  author = {Gaël Mahfoudi and Badr Tajini and Florent Retraint and Fr{\'e}d{\'e}ric Morain-Nicolier and Jean Luc Dugelay and Marc Pic},
  title = {{DEFACTO:} Image and Face Manipulation Dataset},
  booktitle = {27th European Signal Processing Conference (EUSIPCO 2019)},
  address = {A Coruña, Spain},
  days = 1,
  month = sep,
  year = 2019
}
and to the MSCOCO dataset
The DEFACTO Consortium does not own the copyright of those images. This dataset contains images of persons gathered from IMDb. If any of these images belongs to you and you wish it to be removed, contact us at defacto.dataset@gmail.com.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
FMIYC (Find Me If You Can) Dataset
Dataset Description
Paper: FindMeIfYouCan: Bringing Open Set metrics to near, far and farther Out-of-Distribution Object detection The FMIYC (Find Me If You Can) dataset is designed for Out-Of-Distribution (OOD) Object Detection tasks. It comprises images and annotations derived and adapted from the COCO (Common Objects in Context) and OpenImages datasets. The FMIYC dataset curates these sources into new evaluation splits categorized as… See the full description on the dataset page: https://huggingface.co/datasets/CEAai/FindMeIfYouCan.
Digital image forensics has gained a lot of attention as it is becoming easier for anyone to make forged images. Several areas are affected by image manipulation: a doctored image can increase the credibility of fake news, and impostors can use morphed images to pretend to be someone else.
It has become critically important to be able to recognize the manipulations an image has undergone. To do this, the first requirement is reliable and controlled datasets representing the most characteristic cases encountered. The purpose of this work is to lay the foundations of a body of tests allowing both the qualification of automatic methods for authentication and manipulation detection and the training of these methods.
This dataset contains about 105,000 splicing forgeries, available under the splicing directory. Each splicing is accompanied by two binary masks: one under the probe_mask subdirectory indicates the location of the forgery, and one under the donor_mask subdirectory indicates the location of the source. The external image can be found in the JSON file under the graph subdirectory.
If you use this dataset for your research, please refer to the original paper :
@INPROCEEDINGS{DEFACTODataset,
  author = {Gaël Mahfoudi and Badr Tajini and Florent Retraint and Fr{\'e}d{\'e}ric Morain-Nicolier and Jean Luc Dugelay and Marc Pic},
  title = {{DEFACTO:} Image and Face Manipulation Dataset},
  booktitle = {27th European Signal Processing Conference (EUSIPCO 2019)},
  address = {A Coruña, Spain},
  days = 1,
  month = sep,
  year = 2019
}
and to the MSCOCO dataset
The DEFACTO Consortium does not own the copyright of those images. Please refer to the MSCOCO terms of use for all images based on their Dataset.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
We have published a MOT expansion to the Brackish Dataset with 9 new sequences, a framework for generating synthetic training sequences, and ground truth annotations in MOTChallenge format. The dataset will be presented at the Scandinavian Conference on Image Processing in Levi, Finland, in April 2023. You can read more about the dataset in the paper BrackishMOT: The Brackish Multi-Object Tracking Dataset
The dataset was updated on August 25, 2020, to fix a range of false-negative annotations. Approximately 14,000 new annotations have been added. Choose Version 1 if you want the old version of the dataset used in the CVPRW paper.
This is the first publicly available European underwater image dataset with bounding box annotations of fish, crabs, and other marine organisms. It has been recorded in Limfjorden, which is a brackish strait that runs through Aalborg in the northern part of Denmark.
The camera setup used for capturing the data consists of three cameras and three LED lights mounted permanently on a concrete pillar of the Limfjords bridge. However, only data from a single camera has so far been annotated and published; more will be added during 2019.
The setup is located 9 m below the surface, and a single LED light has been turned on during all the recordings, which explains some slightly odd behaviors of the animals, such as the schooling of the sticklebacks directly in front of the camera.
The videos contain two objects that have been used for other research purposes, however, they have not been annotated:
<img src="https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3385658%2F049ce4a001752ad74f29909fed4e6f85%2Fbuoy.jpg?generation=1561548298721139&alt=media" alt="buoy">
<img src="https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3385658%2Fdf66ac12dc0281c7ef6f230c183f8cc0%2Ftest_pattern.jpg?generation=1561549026868633&alt=media" alt="test pattern">
For more information about the camera setup and dataset see our paper Detection of Marine Animals in a New Underwater Dataset with Varying Visibility from the workshop on Automated Analysis of Marine Video for Environmental Monitoring CVPR2019.
89 videos are provided with annotations in the AAU Bounding Box, YOLO Darknet, and MS COCO formats. Fish are annotated in 6 different coarse categories
fish
small_fish
crab
shrimp
jellyfish
starfish
The videos are separated into folders based on their predominant labeled occurrence, but this does not mean that only that label is present in the videos.
NOTICE: Only the first 200 frames in 2019-03-19_17-07-53to2019-03-19_17-08-34_1.avi and first 100 frames in 2019-03-19_18-01-56to2019-03-19_18-02-13_1.avi are annotated. All other videos are fully annotated.
The data is split into training, validation, and test sets following an 80/10/10 split. The splits are provided in text files (train.txt, valid.txt, test.txt), each containing the filenames of the frames in that split.
Scripts are provided to convert the videos into frames (see the sketch below). The frames are extracted using ffmpeg, and the width and height of the frames are halved. Scripts for converting from AAU Bounding Box to MS COCO, from VIAME CSV to MS COCO, and from AAU Bounding Box to YOLO are also provided.
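A rough equivalent of the frame-extraction step, assuming ffmpeg is installed; this is a sketch, not the dataset's own script, and the output layout is an assumption:

```python
import subprocess
from pathlib import Path

# Video file name taken from the annotation notice above; output layout is an assumption.
video = Path("videos/2019-03-19_17-07-53to2019-03-19_17-08-34_1.avi")
out_dir = Path("frames") / video.stem
out_dir.mkdir(parents=True, exist_ok=True)

subprocess.run([
    "ffmpeg", "-i", str(video),
    "-vf", "scale=iw/2:ih/2",       # halve the frame width and height
    str(out_dir / "%06d.png"),
], check=True)
```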
Code and scripts used for analysis of the data can be found at our bitbucket.
Please cite the following paper if you find the dataset useful:
@InProceedings{pedersen2019brackish,
title={Detection of Marine Animals in a New Underwater Dataset with Varying Visibility},
author={Pedersen, Malte and Haurum, Joakim Bruslund and Gade, Rikke and Moeslund, Thomas B. and Madsen, Niels},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}
}