100+ datasets found

FID value comparison between the real dataset of 1000 real images and the...
plos.figshare.com
xls
Updated Jun 14, 2023
Cite
Vajira Thambawita; Pegah Salehi; Sajad Amouei Sheshkal; Steven A. Hicks; Hugo L. Hammer; Sravanthi Parasa; Thomas de Lange; Pål Halvorsen; Michael A. Riegler (2023). FID value comparison between the real dataset of 1000 real images and the synthetic datasets of 1000 synthetic images generated from different GAN architectures which are modified to generate four channels outputs. [Dataset]. http://doi.org/10.1371/journal.pone.0267976.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0267976.t004
Dataset updated
Jun 14, 2023
Dataset provided by
PLOS ONE
Authors
Vajira Thambawita; Pegah Salehi; Sajad Amouei Sheshkal; Steven A. Hicks; Hugo L. Hammer; Sravanthi Parasa; Thomas de Lange; Pål Halvorsen; Michael A. Riegler
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
FID value comparison between the real dataset of 1000 real images and the synthetic datasets of 1000 synthetic images generated from different GAN architectures which are modified to generate four channels outputs.
LOLv2-synthetic Dataset
paperswithcode.com
Updated Nov 2, 2023
Cite
(2023). LOLv2-synthetic Dataset [Dataset]. https://paperswithcode.com/dataset/lolv2-synthetic
Dataset updated
Nov 2, 2023
Description
To make synthetic images match the properties of real dark photography, we analyze the illumination distribution of low-light images. We collect 270 low-light images from the public MEF [42], NPE [6], LIME [8], DICM [43], VV, and Fusion [44] datasets, transform the images into YCbCr, and calculate the histogram of the Y channel. We also collect 1000 raw images from RAISE [45] as normal-light images and calculate the histogram of their Y channel in YCbCr.

Raw images contain more information than the converted results. For raw images, all operations used to generate pixel values are performed in one step on the base data, making the result more accurate. The 1000 raw images in RAISE [45] are used to synthesize low-light images. We use the interface provided by Adobe Lightroom and try different parameter settings so that the histogram of the Y channel fits that of the low-light images. The final parameter configuration can be found in the supplementary material. The illumination distribution of the synthetic images matches that of the low-light images. Finally, we resize these raw images to 400 × 600 and convert them to Portable Network Graphics (PNG) format.
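The Y-channel histogram comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes 8-bit RGB pixels and the full-range BT.601 luma weights commonly used for YCbCr, and the function names `luma` and `y_histogram` are hypothetical.

```python
def luma(r, g, b):
    """Y of YCbCr for 8-bit RGB, assuming full-range BT.601 weights (an assumption)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def y_histogram(pixels, bins=256):
    """Return a normalized histogram of Y values over an iterable of (r, g, b) pixels."""
    counts = [0] * bins
    total = 0
    for r, g, b in pixels:
        y = luma(r, g, b)
        index = min(bins - 1, int(y * bins / 256))
        counts[index] += 1
        total += 1
    return [c / total for c in counts] if total else counts


# Toy data: a dark image and a normally lit image, each given as a flat pixel list.
dark = [(10, 12, 8)] * 3 + [(40, 35, 30)]
bright = [(200, 190, 180)] * 4
dark_hist = y_histogram(dark, bins=8)
bright_hist = y_histogram(bright, bins=8)
```

With real data, the two histograms would be computed over all pixels of the low-light and synthesized images and then compared, for example bin by bin.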
ArtiFact Dataset
paperswithcode.com
Updated May 5, 2025
Cite
Md Awsafur Rahman; Bishmoy Paul; Najibul Haque Sarker; Zaber Ibn Abdul Hakim; Shaikh Anowarul Fattah (2025). ArtiFact Dataset [Dataset]. https://paperswithcode.com/dataset/artifact
Dataset updated
May 5, 2025
Authors
Md Awsafur Rahman; Bishmoy Paul; Najibul Haque Sarker; Zaber Ibn Abdul Hakim; Shaikh Anowarul Fattah
Description
The ArtiFact dataset is a large-scale image dataset that aims to include a diverse collection of real and synthetic images from multiple categories, including Human/Human Faces, Animal/Animal Faces, Places, Vehicles, Art, and many other real-life objects. The dataset comprises 8 sources that were carefully chosen to ensure diversity and includes images synthesized from 25 distinct methods, including 13 GANs, 7 Diffusion, and 5 other miscellaneous generators. The dataset contains 2,496,738 images, comprising 964,989 real images and 1,531,749 fake images.

To ensure diversity across different sources, the real images of the dataset are randomly sampled from source datasets containing numerous categories, whereas synthetic images are generated within the same categories as the real images. Captions and image masks from the COCO dataset are utilized to generate images for text2image and inpainting generators, while normally distributed noise with different random seeds is used for noise2image generators. The dataset is further processed to reflect real-world scenarios by applying random cropping, downscaling, and JPEG compression, in accordance with the IEEE VIP Cup 2022 standards.

The ArtiFact dataset is intended to serve as a benchmark for evaluating the performance of synthetic image detectors under real-world conditions. It includes a broad spectrum of diversity in terms of generators used and syntheticity, providing a challenging dataset for image detection tasks.

Total number of images: 2,496,738
Number of real images: 964,989
Number of fake images: 1,531,749
Number of generators used for fake images: 25 (including 13 GANs, 7 Diffusion, and 5 miscellaneous generators)
Number of sources used for real images: 8
Categories included in the dataset: Human/Human Faces, Animal/Animal Faces, Places, Vehicles, Art, and other real-life objects
Image Resolution: 200 x 200
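The counts above can be cross-checked with a few lines of arithmetic, which also gives the class balance a detector would see. This is only a sanity-check sketch using the figures quoted in the description; the variable names are illustrative.

```python
# Figures quoted in the dataset description above.
real_images = 964_989
fake_images = 1_531_749
generators = {"GAN": 13, "Diffusion": 7, "Other": 5}

total_images = real_images + fake_images
total_generators = sum(generators.values())
fake_fraction = fake_images / total_images

print(total_images)       # total number of images
print(total_generators)   # total number of generators
print(fake_fraction)      # share of fake images in the dataset
```

Because fake images outnumber real ones, a training setup might weight or resample the two classes; that choice is left to the user.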
Unimelb Corridor Synthetic dataset
figshare.unimelb.edu.au
png
Updated May 30, 2023
Cite
Debaditya Acharya; KOUROSH KHOSHELHAM; STEPHAN WINTER (2023). Unimelb Corridor Synthetic dataset [Dataset]. http://doi.org/10.26188/5dd8b8085b191
Available download formats: png
Unique identifier
https://doi.org/10.26188/5dd8b8085b191
Dataset updated
May 30, 2023
Dataset provided by
The University of Melbourne
Authors
Debaditya Acharya; KOUROSH KHOSHELHAM; STEPHAN WINTER
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data-set is supplementary material related to the generation of synthetic images of a corridor in the University of Melbourne, Australia, from a building information model (BIM). The data-set was generated to check the ability of deep learning algorithms to learn the task of indoor localisation from synthetic images when tested on real images.

The following naming convention is used for the data-sets. The brackets show the number of images in each data-set.

REAL DATA
Real -----------------> Real images (949 images)
Gradmag-Real ---------> Gradmag of real data (949 images)

SYNTHETIC DATA
Syn-Car --------------> Cartoonish images (2500 images)
Syn-pho-real ---------> Synthetic photo-realistic images (2500 images)
Syn-pho-real-tex -----> Synthetic photo-realistic textured (2500 images)
Syn-Edge -------------> Edge render images (2500 images)
Gradmag-Syn-Car ------> Gradmag of Cartoonish images (2500 images)

Each folder contains the images and their respective ground-truth poses in the following format: [ImageName X Y Z w p q r].

To generate the synthetic data-set, we define a trajectory in the 3D indoor model. The points in the trajectory serve as the ground truth poses of the synthetic images. The height of the trajectory was kept in the range of 1.5–1.8 m from the floor, which is the usual height of holding a camera in hand. Artificial point light sources were placed to illuminate the corridor (except for Edge render images). The length of the trajectory was approximately 30 m. A virtual camera was moved along the trajectory to render four different sets of synthetic images in Blender*. The intrinsic parameters of the virtual camera were kept identical to the real camera (VGA resolution, focal length of 3.5 mm, no distortion modeled). We have rendered images along the trajectory at 0.05 m intervals and ± 10° tilt.

The main difference between the cartoonish (Syn-car) and photo-realistic images (Syn-pho-real) is the model of rendering. Photo-realistic rendering is a physics-based model that traces the path of light rays in the scene, which is similar to the real world, whereas the cartoonish rendering roughly traces the path of light rays. The photorealistic textured images (Syn-pho-real-tex) were rendered by adding repeating synthetic textures to the 3D indoor model, such as the textures of brick, carpet and wooden ceiling. The realism of the photo-realistic rendering comes at the cost of rendering times. However, the rendering times of the photo-realistic data-sets were considerably reduced with the help of a GPU. Note that the naming convention used for the data-sets (e.g. Cartoonish) is according to Blender terminology.

An additional data-set (Gradmag-Syn-car) was derived from the cartoonish images by taking the edge gradient magnitude of the images and suppressing weak edges below a threshold. The edge rendered images (Syn-edge) were generated by rendering only the edges of the 3D indoor model, without taking into account the lighting conditions. This data-set is similar to the Gradmag-Syn-car data-set; however, it does not contain the effect of illumination of the scene, such as reflections and shadows.

*Blender is an open-source 3D computer graphics software and finds its applications in video games, animated films, simulation and visual art. For more information please visit: http://www.blender.org

Please cite the papers if you use the data-set:
1) Acharya, D., Khoshelham, K., and Winter, S., 2019. BIM-PoseNet: Indoor camera localisation using a 3D indoor model and deep learning from synthetic images. ISPRS Journal of Photogrammetry and Remote Sensing. 150: 245-258.
2) Acharya, D., Singha Roy, S., Khoshelham, K. and Winter, S. 2019. Modelling uncertainty of single image indoor localisation using a 3D model and deep learning. In ISPRS Annals of Photogrammetry, Remote Sensing & Spatial Information Sciences, IV-2/W5, pages 247-254.
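The ground-truth pose records in the format [ImageName X Y Z w p q r] could be read with a sketch like the one below. It assumes the fields are whitespace-separated and that w, p, q, r are the components of an orientation quaternion; neither detail is confirmed by the description, and the `Pose` class and `parse_pose_line` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    image_name: str
    position: tuple      # (X, Y, Z)
    orientation: tuple   # (w, p, q, r), assumed here to be a quaternion


def parse_pose_line(line):
    """Parse one 'ImageName X Y Z w p q r' record; fields assumed whitespace-separated."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    values = [float(v) for v in fields[1:]]
    return Pose(fields[0], tuple(values[:3]), tuple(values[3:]))


# Hypothetical record; the file name and numbers are made up for illustration.
pose = parse_pose_line("img_0001.png 1.25 0.40 1.60 1.0 0.0 0.0 0.0")
```

If the actual files use a different delimiter, only the `split()` call needs to change.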
Supplemental Synthetic Images (outdated)
figshare.com
zip
Updated May 7, 2021
Cite
Duke Bass Connections Deep Learning for Rare Energy Infrastructure 2020-2021 (2021). Supplemental Synthetic Images (outdated) [Dataset]. http://doi.org/10.6084/m9.figshare.13546643.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.13546643.v2
Dataset updated
May 7, 2021
Dataset provided by
figshare
Authors
Duke Bass Connections Deep Learning for Rare Energy Infrastructure 2020-2021
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview
This is a set of synthetic overhead imagery of wind turbines that was created with CityEngine. There are corresponding labels that provide the class, x and y coordinates, and height and width (YOLOv3 format) of the ground truth bounding boxes for each wind turbine in the images. These labels are named similarly to the images (e.g. image.png will have the label titled image.txt).

Use
This dataset is meant as supplementation to training an object detection model on overhead images of wind turbines. It can be added to the training set of an object detection model to potentially improve performance when using the model on real overhead images of wind turbines.

Why
This dataset was created to examine the utility of adding synthetic imagery to the training set of an object detection model to improve performance on rare objects. Since wind turbines are both very rare in number and sparse, this makes acquiring data very costly. This synthetic imagery is meant to solve this issue by automating the generation of new training data. The use of synthetic imagery can also be applied to the issue of cross-domain testing, where the model lacks training data on a particular region and consequently struggles when used on that region.

Method
The process for creating the dataset involved selecting background images from NAIP imagery available on Earth OnDemand. These images were randomly selected from these geographies: forest, farmland, grasslands, water, urban/suburban, mountains, and deserts. No consideration was put into whether the background images would seem realistic. This is because we wanted to see if this would help the model become better at detecting wind turbines regardless of their context (which would help when using the model on novel geographies). Then, a script was used to select these at random and uniformly generate 3D models of large wind turbines over the image and then position the virtual camera to save four 608x608 pixel images. This process was repeated with the same random seed, but with no background image and the wind turbines colored as black. Next, these black and white images were converted into ground truth labels by grouping the black pixels in the images.
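The last step, turning a black-and-white render into bounding-box labels by grouping black pixels, can be illustrated as below. The description does not say how the grouping was done, so this sketch substitutes a plain 4-connected flood fill; the function name `mask_to_yolo` and the toy mask are hypothetical, and a real 608x608 image would first be loaded and thresholded with an image library.

```python
from collections import deque


def mask_to_yolo(mask, class_id=0):
    """Group 4-connected foreground pixels and return YOLO rows
    (class, x_center, y_center, width, height), normalized to [0, 1]."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    labels = []
    for row in range(height):
        for col in range(width):
            if not mask[row][col] or seen[row][col]:
                continue
            # Breadth-first flood fill over one connected blob.
            queue = deque([(row, col)])
            seen[row][col] = True
            top, bottom, left, right = row, row, col, col
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            box_w, box_h = right - left + 1, bottom - top + 1
            labels.append((class_id,
                           (left + box_w / 2) / width,
                           (top + box_h / 2) / height,
                           box_w / width,
                           box_h / height))
    return labels


# Tiny 4x8 example mask: 1 marks a "black" turbine pixel, 0 is background.
mask = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
]
labels = mask_to_yolo(mask)
```

Each returned tuple corresponds to one line of a YOLO-style label file. Touching or overlapping turbines would merge into a single box under this simple grouping.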
TiCaM: Synthetic Images Dataset
datasetninja.com
Updated May 23, 2021
Cite
Jigyasa Katrolia; Jason Raphael Rambach; Bruno Mirbach (2021). TiCaM: Synthetic Images Dataset [Dataset]. https://datasetninja.com/ticam-synthetic-images
Dataset updated
May 23, 2021
Dataset provided by
Dataset Ninja
Authors
Jigyasa Katrolia; Jason Raphael Rambach; Bruno Mirbach
License
https://spdx.org/licenses/
Description
TiCaM Synthetic Images: A Time-of-Flight In-Car Cabin Monitoring Dataset is a time-of-flight dataset of car in-cabin images providing a means to test extensive car cabin monitoring systems based on deep learning methods. The authors provide a synthetic image dataset of car cabin images similar to the real dataset, leveraging advanced simulation software's capability to generate abundant data with little effort. This can be used to test domain adaptation between synthetic and real data for select classes. For both datasets the authors provide ground truth annotations for 2D and 3D object detection, as well as for instance segmentation.
ActiveHuman Part 2
zenodo.org
data.niaid.nih.gov
Updated Apr 24, 2025
Cite
Charalampos Georgiadis (2025). ActiveHuman Part 2 [Dataset]. http://doi.org/10.5281/zenodo.8361114
Unique identifier
https://doi.org/10.5281/zenodo.8361114
Dataset updated
Apr 24, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Charalampos Georgiadis
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is Part 2/2 of the ActiveHuman dataset! Part 1 can be found here.
Dataset Description
ActiveHuman was generated using Unity's Perception package.
It consists of 175428 RGB images and their semantic segmentation counterparts taken at different environments, lighting conditions, camera distances and angles. In total, the dataset contains images for 8 environments, 33 humans, 4 lighting conditions, 7 camera distances (1m-4m) and 36 camera angles (0-360 at 10-degree intervals).
The dataset does not include images at every single combination of available camera distances and angles, since for some values the camera would collide with another object or go outside the confines of an environment. As a result, some combinations of camera distances and angles do not exist in the dataset.
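The gap between the full parameter grid and the actual image count can be made concrete with a short calculation. This sketch assumes one RGB image per combination of environment, human, lighting condition, camera distance and camera angle, which the description implies but does not state outright; the variable names are illustrative.

```python
import math

# Counts quoted in the dataset description above.
factors = {
    "environments": 8,
    "humans": 33,
    "lighting_conditions": 4,
    "camera_distances": 7,
    "camera_angles": 36,
}
images_in_dataset = 175_428

# Upper bound if every combination had been rendered once.
max_combinations = math.prod(factors.values())
coverage = images_in_dataset / max_combinations

print(max_combinations, coverage)
```

Under that assumption, roughly two thirds of the possible combinations are present, which is consistent with the note that some camera placements were skipped.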
Alongside each image, 2D Bounding Box, 3D Bounding Box and Keypoint ground truth annotations are also generated via the use of Labelers and are stored as a JSON-based dataset. These Labelers are scripts that are responsible for capturing ground truth annotations for each captured image or frame. Keypoint annotations follow the COCO format defined by the COCO keypoint annotation template offered in the perception package.

Folder configuration
The dataset consists of 3 folders:
JSON Data: Contains all the generated JSON files.
RGB Images: Contains the generated RGB images.
Semantic Segmentation Images: Contains the generated semantic segmentation images.

Essential Terminology
Annotation: Recorded data describing a single capture.
Capture: One completed rendering process of a Unity sensor which stored the rendered result to data files (e.g. PNG, JPG, etc.).
Ego: Object or person to which a collection of sensors is attached (e.g., if a drone has a camera attached to it, the drone would be the ego and the camera would be the sensor).
Ego coordinate system: Coordinates with respect to the ego.
Global coordinate system: Coordinates with respect to the global origin in Unity.
Sensor: Device that captures the dataset (in this instance the sensor is a camera).
Sensor coordinate system: Coordinates with respect to the sensor.
Sequence: Time-ordered series of captures. This is very useful for video capture where the time-order relationship of two captures is vital.
UUID: Universally Unique Identifier. It is a unique hexadecimal identifier that can represent an individual instance of a capture, ego, sensor, annotation, labeled object or keypoint, or keypoint template.

Dataset Data
The dataset includes 4 types of JSON annotation files:
annotation_definitions.json: Contains annotation definitions for all of the active Labelers of the simulation stored in an array. Each entry consists of a collection of key-value pairs which describe a particular type of annotation and contain information about that specific annotation describing how its data should be mapped back to labels or objects in the scene. Each entry contains the following key-value pairs:
id: Integer identifier of the annotation's definition.
name: Annotation name (e.g., keypoints, bounding box, bounding box 3D, semantic segmentation).
description: Description of the annotation's specifications.
format: Format of the file containing the annotation specifications (e.g., json, PNG).
spec: Format-specific specifications for the annotation values generated by each Labeler.

Most Labelers generate different annotation specifications in the spec key-value pair:
BoundingBox2DLabeler/BoundingBox3DLabeler:
label_id: Integer identifier of a label.
label_name: String identifier of a label.
KeypointLabeler:
template_id: Keypoint template UUID.
template_name: Name of the keypoint template.
key_points: Array containing all the joints defined by the keypoint template. This array includes the key-value pairs:
label: Joint label.
index: Joint index.
color: RGBA values of the keypoint.
color_code: Hex color code of the keypoint.
skeleton: Array containing all the skeleton connections defined by the keypoint template. Each skeleton connection defines a connection between two different joints. This array includes the key-value pairs:
label1: Label of the first joint.
label2: Label of the second joint.
joint1: Index of the first joint.
joint2: Index of the second joint.
color: RGBA values of the connection.
color_code: Hex color code of the connection.
SemanticSegmentationLabeler:
label_name: String identifier of a label.
pixel_value: RGBA values of the label.
color_code: Hex color code of the label.
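A short sketch of how the annotation definitions described above might be indexed for later lookup. The sample data is made up and only mirrors the listed keys; it assumes the array of definition entries has already been extracted from annotation_definitions.json and that `spec` is a list of per-label entries, which is an assumption about the layout. The name `index_definitions` is hypothetical.

```python
import json

# Hypothetical excerpt that follows the keys listed above; values are made up.
sample = json.loads("""
[
  {"id": 0, "name": "bounding box", "description": "2D boxes", "format": "json",
   "spec": [{"label_id": 1, "label_name": "human"}]},
  {"id": 1, "name": "semantic segmentation", "description": "per-pixel labels",
   "format": "PNG",
   "spec": [{"label_name": "human", "color_code": "#FF0000"}]}
]
""")


def index_definitions(definitions):
    """Map each annotation definition id to its name and list of label names."""
    index = {}
    for entry in definitions:
        label_names = [item["label_name"]
                       for item in entry.get("spec", [])
                       if "label_name" in item]
        index[entry["id"]] = {"name": entry["name"], "labels": label_names}
    return index


definitions_by_id = index_definitions(sample)
```

The resulting dictionary lets the integer `annotation_definition` stored with each capture be resolved back to a human-readable annotation type.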

captures_xyz.json: Each of these files contains an array of ground truth annotations generated by each active Labeler for each capture separately, as well as extra metadata that describe the state of each active sensor that is present in the scene. Each array entry contains the following key-value pairs:
id: UUID of the capture.
sequence_id: UUID of the sequence.
step: Index of the capture within a sequence.
timestamp: Timestamp (in ms) since the beginning of a sequence.
sensor: Properties of the sensor. This entry contains a collection with the following key-value pairs:
sensor_id: Sensor UUID.
ego_id: Ego UUID.
modality: Modality of the sensor (e.g., camera, radar).
translation: 3D vector that describes the sensor's position (in meters) with respect to the global coordinate system.
rotation: Quaternion variable that describes the sensor's orientation with respect to the ego coordinate system.
camera_intrinsic: Matrix containing (if it exists) the camera's intrinsic calibration.
projection: Projection type used by the camera (e.g., orthographic, perspective).
ego: Attributes of the ego. This entry contains a collection with the following key-value pairs:
ego_id: Ego UUID.
translation: 3D vector that describes the ego's position (in meters) with respect to the global coordinate system.
rotation: Quaternion variable containing the ego's orientation.
velocity: 3D vector containing the ego's velocity (in meters per second).
acceleration: 3D vector containing the ego's acceleration (in meters per second squared).
format: Format of the file captured by the sensor (e.g., PNG, JPG).
annotations: Key-value pair collections, one for each active Labeler. These key-value pairs are as follows:
id: Annotation UUID.
annotation_definition: Integer identifier of the annotation's definition.
filename: Name of the file generated by the Labeler. This entry is only present for Labelers that generate an image.
values: List of key-value pairs containing annotation data for the current Labeler.

Each Labeler generates different annotation specifications in the values key-value pair:
BoundingBox2DLabeler:
label_id: Integer identifier of a label.
label_name: String identifier of a label.
instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values.
x: Position of the 2D bounding box on the X axis.
y: Position of the 2D bounding box on the Y axis.
width: Width of the 2D bounding box.
height: Height of the 2D bounding box.
BoundingBox3DLabeler:
label_id: Integer identifier of a label.
label_name: String identifier of a label.
instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values.
translation: 3D vector containing the location of the center of the 3D bounding box with respect to the sensor coordinate system (in meters).
size: 3D
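The capture entries described above could be walked as in the following sketch, which pulls the 2D bounding boxes out of one capture. The sample entry is invented and only follows the key names listed in the description; the helper name `boxes_for_capture` and the choice of definition id 0 for 2D boxes are assumptions.

```python
def boxes_for_capture(capture, box_definition_id):
    """Collect (label_name, x, y, width, height) tuples from one capture entry."""
    boxes = []
    for annotation in capture.get("annotations", []):
        if annotation.get("annotation_definition") != box_definition_id:
            continue
        for value in annotation.get("values", []):
            boxes.append((value["label_name"], value["x"], value["y"],
                          value["width"], value["height"]))
    return boxes


# Hypothetical capture entry using the keys described above; all values are made up.
capture = {
    "id": "capture-uuid",
    "sequence_id": "sequence-uuid",
    "step": 0,
    "timestamp": 0.0,
    "annotations": [
        {"id": "annotation-uuid", "annotation_definition": 0,
         "values": [{"label_id": 1, "label_name": "human",
                     "instance_id": "instance-uuid",
                     "x": 120.0, "y": 64.0, "width": 50.0, "height": 140.0}]},
        {"id": "other-uuid", "annotation_definition": 1,
         "filename": "segmentation_0.png", "values": []},
    ],
}
boxes = boxes_for_capture(capture, box_definition_id=0)
```

In practice the definition id would be looked up by name in annotation_definitions.json rather than hard-coded.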
GNHK-Synthetic-OCR-Dataset
huggingface.co
Cite
Shreyansh Sharma, GNHK-Synthetic-OCR-Dataset [Dataset]. https://huggingface.co/datasets/shreyansh1347/GNHK-Synthetic-OCR-Dataset
Available format: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Authors
Shreyansh Sharma
Description
GNHK Synthetic OCR Dataset

Overview

Welcome to the GNHK Synthetic OCR Dataset repository. Here I have generated synthetic data using GNHK Dataset, and Open Source LLMs like Mixtral. The dataset contains queries on the images and their answers.

What's Inside?

Dataset Folder: Contains the images and, for each image, a JSON file that carries the OCR information of that image.

Parquet File: For easy handling and analysis… See the full description on the dataset page: https://huggingface.co/datasets/shreyansh1347/GNHK-Synthetic-OCR-Dataset.
Synthetic Image Dataset
universe.roboflow.com
zip
Updated Aug 18, 2023
Cite
Mealworms (2023). Synthetic Image Dataset [Dataset]. https://universe.roboflow.com/mealworms/synthetic-image-fqswf
Available download formats: zip
Dataset updated
Aug 18, 2023
Dataset authored and provided by
Mealworms
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Variables measured
Larvae Polygons
Description
Synthetic Image

## Overview
Synthetic Image is a dataset for instance segmentation tasks - it contains Larvae annotations for 218 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [BY-NC-SA 4.0 license](https://creativecommons.org/licenses/BY-NC-SA 4.0).
Data from: Domain-adaptive Data Synthesis for Large-scale Supermarket...
data.niaid.nih.gov
zenodo.org
Updated Apr 5, 2024
Cite
Strohmayer, Julian (2024). Domain-adaptive Data Synthesis for Large-scale Supermarket Product Recognition [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7750241
Dataset updated
Apr 5, 2024
Dataset provided by
Kampel, Martin
Strohmayer, Julian
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Domain-Adaptive Data Synthesis for Large-Scale Supermarket Product Recognition

This repository contains the data synthesis pipeline and synthetic product recognition datasets proposed in [1].

Data Synthesis Pipeline:

We provide the Blender 3.1 project files and Python source code of our data synthesis pipeline (pipeline.zip), accompanied by the FastCUT models used for synthetic-to-real domain translation (models.zip). For the synthesis of new shelf images, a product assortment list and product images must be provided in the corresponding directories products/assortment/ and products/img/. The pipeline expects product images to follow the naming convention c.png, with c corresponding to a GTIN or generic class label (e.g., 9120050882171.png). The assortment list, assortment.csv, is expected to use the sample format [c, w, d, h], with c being the class label and w, d, and h being the packaging dimensions of the given product in mm (e.g., [4004218143128, 140, 70, 160]). The assortment list to use and the number of images to generate can be specified in generateImages.py (see comments). The rendering process is initiated by executing load.py either from within Blender or in a command-line terminal as a background process.
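The assortment list format [c, w, d, h] could be read as sketched below. This is not the pipeline's own loader: it assumes plain comma-separated rows without a header line, tolerates optional square brackets, and uses the hypothetical function name `read_assortment`. The dimensions in the second sample row are made up.

```python
import csv
import io


def read_assortment(handle):
    """Yield (class_label, width_mm, depth_mm, height_mm) from an assortment list."""
    for row in csv.reader(handle):
        cells = [cell.strip().strip("[]").strip() for cell in row]
        if len(cells) != 4 or not cells[0]:
            continue  # skip blank or malformed rows
        label, w, d, h = cells
        yield label, float(w), float(d), float(h)


# In-memory stand-in for products/assortment/assortment.csv.
# The first row is the example from the text; the second row's dimensions are invented.
sample_csv = io.StringIO("4004218143128, 140, 70, 160\n9120050882171, 60, 60, 200\n")
assortment = list(read_assortment(sample_csv))

# Product images are expected to be named c.png for each class label c.
image_names = [f"{label}.png" for label, *_ in assortment]
```

Class labels are kept as strings so that leading zeros in a GTIN, if any, are not lost.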

Datasets:

SG3k - Synthetic GroZi-3.2k (SG3k) dataset, consisting of 10,000 synthetic shelf images with 851,801 instances of 3,234 GroZi-3.2k products. Instance-level bounding boxes and generic class labels are provided for all product instances.

SG3kt - Domain-translated version of SG3k, utilizing GroZi-3.2k as the target domain. Instance-level bounding boxes and generic class labels are provided for all product instances.

SGI3k - Synthetic GroZi-3.2k (SGI3k) dataset, consisting of 10,000 synthetic shelf images with 838,696 instances of 1,063 GroZi-3.2k products. Instance-level bounding boxes and generic class labels are provided for all product instances.

SGI3kt - Domain-translated version of SGI3k, utilizing GroZi-3.2k as the target domain. Instance-level bounding boxes and generic class labels are provided for all product instances.

SPS8k - Synthetic Product Shelves 8k (SPS8k) dataset, comprised of 16,224 synthetic shelf images with 1,981,967 instances of 8,112 supermarket products. Instance-level bounding boxes and GTIN class labels are provided for all product instances.

SPS8kt - Domain-translated version of SPS8k, utilizing SKU110k as the target domain. Instance-level bounding boxes and GTIN class labels are provided for all product instances.

Table 1: Dataset characteristics.

Dataset | images | products | instances | labels | translation
SG3k | 10,000 | 3,234 | 851,801 | bounding box & generic class¹ | none
SG3kt | 10,000 | 3,234 | 851,801 | bounding box & generic class¹ | GroZi-3.2k
SGI3k | 10,000 | 1,063 | 838,696 | bounding box & generic class² | none
SGI3kt | 10,000 | 1,063 | 838,696 | bounding box & generic class² | GroZi-3.2k
SPS8k | 16,224 | 8,112 | 1,981,967 | bounding box & GTIN | none
SPS8kt | 16,224 | 8,112 | 1,981,967 | bounding box & GTIN | SKU110k
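The figures in Table 1 allow a quick estimate of how densely populated the synthetic shelves are. The sketch below only restates the table values and derives averages from them; the dictionary layout and key names are illustrative.

```python
# (images, products, instances) as listed in Table 1.
table = {
    "SG3k": (10_000, 3_234, 851_801),
    "SGI3k": (10_000, 1_063, 838_696),
    "SPS8k": (16_224, 8_112, 1_981_967),
}

density = {
    name: {
        "instances_per_image": instances / images,
        "instances_per_product": instances / products,
    }
    for name, (images, products, instances) in table.items()
}

for name, stats in density.items():
    print(name, stats)
```

The translated variants (SG3kt, SGI3kt, SPS8kt) share the counts of their source datasets, so the same averages apply to them.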

Sample Format

A sample consists of an RGB image (i.png) and an accompanying label file (i.txt), which contains the labels for all product instances present in the image. Labels use the YOLO format [c, x, y, w, h].

¹SG3k and SG3kt use generic pseudo-GTIN class labels, created by combining the GroZi-3.2k food product category number i (1-27) with the product image index j (j.jpg), following the convention i0000j (e.g., 13000097).
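The pseudo-GTIN convention i0000j from the footnote above can be written as a one-line helper. The function name `pseudo_gtin` is hypothetical; the range check simply reflects the 1-27 category numbering stated in the text.

```python
def pseudo_gtin(category, image_index):
    """Build an SG3k-style label i0000j from category number i and image index j."""
    if not 1 <= category <= 27:
        raise ValueError("GroZi-3.2k food categories are numbered 1-27")
    return f"{category}0000{image_index}"


# Category 13, image 97.jpg, matching the example given in the footnote.
label = pseudo_gtin(13, 97)
print(label)
```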

²SGI3k and SGI3kt use the generic GroZi-3.2k class labels from https://arxiv.org/abs/2003.06800.

Download and Use

This data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to our paper [1].

[1] Strohmayer, Julian, and Martin Kampel. "Domain-Adaptive Data Synthesis for Large-Scale Supermarket Product Recognition." International Conference on Computer Analysis of Images and Patterns. Cham: Springer Nature Switzerland, 2023.

BibTeX citation:

@inproceedings{strohmayer2023domain,
  title={Domain-Adaptive Data Synthesis for Large-Scale Supermarket Product Recognition},
  author={Strohmayer, Julian and Kampel, Martin},
  booktitle={International Conference on Computer Analysis of Images and Patterns},
  pages={239--250},
  year={2023},
  organization={Springer}
}
Deepfake Synthetic-20K Dataset
ieee-dataport.org
Updated Apr 14, 2024
Cite
Sahil Sharma (2024). Deepfake Synthetic-20K Dataset [Dataset]. https://ieee-dataport.org/documents/deepfake-synthetic-20k-dataset
Dataset updated
Apr 14, 2024
Authors
Sahil Sharma
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
gender
replicAnt - Plum2023 - Detection & Tracking Datasets and Trained Networks
zenodo.org
data.niaid.nih.gov
zip
Updated Apr 21, 2023
Cite
Fabian Plum; René Bulla; Hendrik Beck; Natalie Imirzian; David Labonte (2023). replicAnt - Plum2023 - Detection & Tracking Datasets and Trained Networks [Dataset]. http://doi.org/10.5281/zenodo.7849417
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7849417
Dataset updated
Apr 21, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Fabian Plum; René Bulla; Hendrik Beck; Natalie Imirzian; David Labonte
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains all recorded and hand-annotated data, all synthetically generated data, and representative trained networks used for the detection and tracking experiments in the manuscript "replicAnt - generating annotated images of animals in complex environments using Unreal Engine". Unless stated otherwise, all 3D animal models used in the synthetically generated data have been generated with the open-source photogrammetry platform scAnt (peerj.com/articles/11155/). All synthetic data has been generated with the associated replicAnt project available from https://github.com/evo-biomech/replicAnt.

Abstract:

Deep learning-based computer vision methods are transforming animal behavioural research. Transfer learning has enabled work in non-model species, but still requires hand-annotation of example footage, and is only performant in well-defined conditions. To overcome these limitations, we created replicAnt, a configurable pipeline implemented in Unreal Engine 5 and Python, designed to generate large and variable training datasets on consumer-grade hardware instead. replicAnt places 3D animal models into complex, procedurally generated environments, from which automatically annotated images can be exported. We demonstrate that synthetic data generated with replicAnt can significantly reduce the hand-annotation required to achieve benchmark performance in common applications such as animal detection, tracking, pose-estimation, and semantic segmentation; and that it increases the subject-specificity and domain-invariance of the trained networks, so conferring robustness. In some applications, replicAnt may even remove the need for hand-annotation altogether. It thus represents a significant step towards porting deep learning-based computer vision tools to the field.

Benchmark data

Two video datasets were curated to quantify detection performance; one in laboratory and one in field conditions. The laboratory dataset consists of top-down recordings of foraging trails of Atta vollenweideri (Forel 1893) leaf-cutter ants. The colony was collected in Uruguay in 2014, and housed in a climate chamber at 25°C and 60% humidity. A recording box was built from clear acrylic, and placed between the colony nest and a box external to the climate chamber, which functioned as feeding site. Bramble leaves were placed in the feeding area prior to each recording session, and ants had access to the recording area at will. The recorded area was 104 mm wide and 200 mm long. An OAK-D camera (OpenCV AI Kit: OAK-D, Luxonis Holding Corporation) was positioned centrally 195 mm above the ground. While keeping the camera position constant, lighting, exposure, and background conditions were varied to create recordings with variable appearance: The “base” case is an evenly lit and well exposed scene with scattered leaf fragments on an otherwise plain white backdrop. A “bright” and “dark” case are characterised by systematic over- or underexposure, respectively, which introduces motion blur, colour-clipped appendages, and extensive flickering and compression artefacts. In a separate well exposed recording, the clear acrylic backdrop was substituted with a printout of a highly textured forest ground to create a “noisy” case. Last, we decreased the camera distance to 100 mm at constant focal distance, effectively doubling the magnification, and yielding a “close” case, distinguished by out-of-focus workers. All recordings were captured at 25 frames per second (fps).

The field dataset consists of video recordings of Gnathamitermes sp. desert termites, filmed close to the nest entrance in the desert of Maricopa County, Arizona, using a Nikon D850 and a Nikkor 18-105 mm lens on a tripod at camera distances between 20 cm and 40 cm. All video recordings were well exposed, and captured at 23.976 fps.

Each video was trimmed to the first 1000 frames, and contains between 36 and 103 individuals. In total, 5000 and 1000 frames were hand-annotated for the laboratory and field datasets, respectively: each visible individual was assigned a constant-size bounding box, with a centre coinciding approximately with the geometric centre of the thorax in top-down view. The size of the bounding boxes was chosen such that they were large enough to completely enclose the largest individuals, and was automatically adjusted near the image borders. A custom-written Blender Add-on aided hand-annotation: the Add-on is a semi-automated multi-animal tracker, which leverages Blender’s internal contrast-based motion tracker, but also includes track refinement options and CSV export functionality. Comprehensive documentation of this tool and Jupyter notebooks for track visualisation and benchmarking are provided on the replicAnt and BlenderMotionExport GitHub repositories.
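
The constant-size bounding boxes described above can be illustrated with a minimal sketch. It assumes that "automatically adjusted near the image borders" means clipping the box to the frame; the function name, the box size, and the frame dimensions are hypothetical and are not taken from the replicAnt or BlenderMotionExport code.

```python
def constant_size_box(cx, cy, box_size, img_w, img_h):
    """Return (x_min, y_min, x_max, y_max) for a square box centred on (cx, cy).

    Assumption: near the image borders the box is simply clipped to the frame.
    """
    half = box_size / 2
    x_min = max(0.0, cx - half)
    y_min = max(0.0, cy - half)
    x_max = min(float(img_w), cx + half)
    y_max = min(float(img_h), cy + half)
    return x_min, y_min, x_max, y_max


# Hypothetical frame of 1024 x 1024 px and a 100 px box.
inner = constant_size_box(500, 500, 100, 1024, 1024)
border = constant_size_box(20, 1010, 100, 1024, 1024)
```

A box in the interior keeps its full size, while one centred close to a corner is shortened on the sides that would otherwise leave the frame.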

Synthetic data generation

Two synthetic datasets, each with a population size of 100, were generated from 3D models of Atta vollenweideri leaf-cutter ants. All 3D models were created with the scAnt photogrammetry workflow. A “group” population was based on three distinct 3D models of an ant minor (1.1 mg), a media (9.8 mg), and a major (50.1 mg) (see 10.5281/zenodo.7849059). To approximately simulate the size distribution of A. vollenweideri colonies, these models make up 20%, 60%, and 20% of the simulated population, respectively. A 33% within-class scale variation, with default hue, contrast, and brightness subject material variation, was used. A “single” population was generated using the major model only, with 90% scale variation, but equal material variation settings.
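
As a rough illustration of the “group” population, the sketch below builds 100 individuals from the 20% / 60% / 20% split and draws a per-individual scale factor. Treating "33% scale variation" as a uniform draw within plus or minus 33% of the model size is purely an assumption made for this example; the replicAnt generator may define scale variation differently, and `build_population` is a hypothetical helper.

```python
import random


def build_population(size, shares_percent, scale_variation, seed=0):
    """Return a list of (model_name, scale_factor) tuples.

    shares_percent maps model name to its share of the population in percent.
    Assumption: scale factors are uniform in [1 - v, 1 + v].
    """
    rng = random.Random(seed)
    population = []
    for model, percent in shares_percent.items():
        count = size * percent // 100
        for _ in range(count):
            scale = rng.uniform(1 - scale_variation, 1 + scale_variation)
            population.append((model, scale))
    return population


group = build_population(100, {"minor": 20, "media": 60, "major": 20}, 0.33)
counts = {m: sum(1 for name, _ in group if name == m) for m in ("minor", "media", "major")}
```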

A Gnathamitermes sp. synthetic dataset was generated from two hand-sculpted models: a worker and a soldier, which made up 80% and 20% of the simulated population of 100 individuals, respectively, with default hue, contrast, and brightness subject material variation. Both 3D models were created in Blender v3.1, using reference photographs.

Each of the three synthetic datasets contains 10,000 images, rendered at a resolution of 1024 by 1024 px, using the default generator settings as documented in the Generator_example level file (see documentation on GitHub). To assess how the training dataset size affects performance, we trained networks on 100 (“small”), 1,000 (“medium”), and 10,000 (“large”) subsets of the “group” dataset. Generating 10,000 samples at the specified resolution took approximately 10 hours per dataset on a consumer-grade laptop (6 Core 4 GHz CPU, 16 GB RAM, RTX 2070 Super).
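
The subset sizes and the reported render time can be worked through in a few lines. The sketch assumes the smaller subsets are nested inside the larger ones and are drawn at random; the description does not say how the subsets were chosen, so `nested_subsets` is only one plausible reading.

```python
import random


def nested_subsets(items, sizes, seed=0):
    """Shuffle once, then take prefixes so that each subset contains the smaller ones."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sizes}


image_ids = range(10_000)  # stand-in for the 10,000 rendered "group" images
subsets = nested_subsets(image_ids, [100, 1_000, 10_000])

# Roughly 10 hours for 10,000 images corresponds to 3.6 seconds per image.
seconds_per_image = 10 * 3600 / 10_000
```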

Additionally, five datasets which contain both real and synthetic images were curated. These “mixed” datasets combine image samples from the synthetic “group” dataset with image samples from the real “base” case. The ratio between real and synthetic images across the five datasets ranged from 10/1 to 1/100.
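
A mixed dataset at a given real-to-synthetic ratio could be assembled along the following lines. Only the two end points of the ratio range (10/1 and 1/100) come from the description; the absolute sample counts, the file names, and the `mix` helper are hypothetical.

```python
import random


def mix(real, synthetic, n_real, n_synthetic, seed=0):
    """Draw n_real real and n_synthetic synthetic samples without replacement and shuffle them."""
    rng = random.Random(seed)
    chosen = rng.sample(real, n_real) + rng.sample(synthetic, n_synthetic)
    rng.shuffle(chosen)
    return chosen


real = [f"base_{i:04d}.jpg" for i in range(1_000)]        # hypothetical file names
synthetic = [f"group_{i:05d}.jpg" for i in range(10_000)]  # hypothetical file names

ten_to_one = mix(real, synthetic, n_real=100, n_synthetic=10)          # ratio 10/1
one_to_hundred = mix(real, synthetic, n_real=100, n_synthetic=10_000)  # ratio 1/100
```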

Funding

This study received funding from Imperial College’s President’s PhD Scholarship (to Fabian Plum), and is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 851705, to David Labonte). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
synthetic-multiturn-multimodal
huggingface.co
Updated Jan 28, 2024
Cite
Mesolitica (2024). synthetic-multiturn-multimodal [Dataset]. https://huggingface.co/datasets/mesolitica/synthetic-multiturn-multimodal
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Jan 28, 2024
Dataset authored and provided by
Mesolitica
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
Multiturn Multimodal

We want to generate synthetic data for models that are able to understand the position of, and relationships between, multiple images and multiple audio clips, as in the example below. All notebooks are at https://github.com/mesolitica/malaysian-dataset/tree/master/chatbot/multiturn-multimodal

multi-images

synthetic-multi-images-relationship.jsonl, 100000 rows, 109MB. Images at https://huggingface.co/datasets/mesolitica/translated-LLaVA-Pretrain/tree/main

Example data

{'filename':… See the full description on the dataset page: https://huggingface.co/datasets/mesolitica/synthetic-multiturn-multimodal.
Data from: TrueFace: a Dataset for the Detection of Synthetic Face Images...
data.niaid.nih.gov
zenodo.org
Updated Oct 13, 2022
Cite
Verde, Sebastiano (2022). TrueFace: a Dataset for the Detection of Synthetic Face Images from Social Networks [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7065063
Explore at:
Dataset updated
Oct 13, 2022
Dataset provided by
Verde, Sebastiano
Miorandi, Daniele
Pasquini, Cecilia
Boato, Giulia
Stefani, Antonio Luigi
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
TrueFace is a first dataset of social media processed real and synthetic faces, obtained by the successful StyleGAN generative models, and shared on Facebook, Twitter and Telegram.

Images have historically been a universal and cross-cultural communication medium, capable of reaching people of any social background, status or education. Unsurprisingly though, their social impact has often been exploited for malicious purposes, like spreading misinformation and manipulating public opinion. With today's technologies, the possibility to generate highly realistic fakes is within everyone's reach. A major threat derives in particular from the use of synthetically generated faces, which are able to deceive even the most experienced observer. To counter this fake news phenomenon, researchers have employed artificial intelligence to detect synthetic images by analysing patterns and artifacts introduced by the generative models. However, most online images are subject to repeated sharing operations by social media platforms. These platforms process uploaded images by applying operations (like compression) that progressively degrade those useful forensic traces, compromising the effectiveness of the developed detectors. To solve the synthetic-vs-real problem "in the wild", more realistic image databases, like TrueFace, are needed to train specialised detectors.
2D high-resolution synthetic MR images of Alzheimer's patients and healthy...
data.niaid.nih.gov
Updated Dec 13, 2023
Cite
Diciotti, Stefano (2023). 2D high-resolution synthetic MR images of Alzheimer's patients and healthy subjects using PACGAN [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8276785
Explore at:
Dataset updated
Dec 13, 2023
Dataset provided by
Marzi, Chiara
Citi, Luca
Lai, Matteo
Diciotti, Stefano
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset encompasses a NIfTI file containing a collection of 500 images, each capturing the central axial slice of a synthetic brain MRI.

Accompanying this file is a CSV dataset that serves as a repository for the corresponding labels linked to each image:

Label 0: Healthy Controls (HC)

Label 1: Alzheimer's Disease (AD)
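
The label convention above can be applied when reading the accompanying CSV file. The sketch below uses only the Python standard library and an in-memory stand-in for the file; the column names `image_index` and `label` are assumptions, since the description does not give the CSV layout.

```python
import csv
import io
from collections import Counter

# Label convention stated in the dataset description.
LABEL_NAMES = {0: "HC", 1: "AD"}


def count_labels(csv_file, label_column="label"):
    """Count how many rows belong to each class name."""
    reader = csv.DictReader(csv_file)
    return Counter(LABEL_NAMES[int(row[label_column])] for row in reader)


# Tiny in-memory example; the real file is expected to hold 500 rows.
sample = io.StringIO("image_index,label\n0,0\n1,1\n2,1\n")
counts = count_labels(sample)
```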

Each image within this dataset has been generated by PACGAN (Progressive Auxiliary Classifier Generative Adversarial Network), a framework designed and implemented by the AI for Medicine Research Group at the University of Bologna.

PACGAN is a generative adversarial network trained to generate high-resolution images belonging to different classes. In our work, we trained this framework on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, which contains brain MRI images of AD patients and HC.

The implementation of the training algorithm can be found within our GitHub repository, with Docker containerization.

For further exploration, the pre-trained models are available within the Code Ocean capsule. These models can facilitate the generation of synthetic images for both classes and also aid in classifying new brain MRI images.
Synthetic Fruit Object Detection Dataset
public.roboflow.com
zip
Updated Aug 11, 2021
Cite
Brad Dwyer (2021). Synthetic Fruit Object Detection Dataset [Dataset]. https://public.roboflow.com/object-detection/synthetic-fruit
Explore at:
Available download formats: zip
Dataset updated
Aug 11, 2021
Dataset authored and provided by
Brad Dwyer
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Bounding Boxes of Fruits
Description
About this dataset

This dataset contains 6,000 example images generated with the process described in Roboflow's How to Create a Synthetic Dataset tutorial.

The images are composed of a background (randomly selected from Google's Open Images dataset) and a number of fruits (from Horea94's Fruit Classification Dataset) superimposed on top with a random orientation, scale, and color transformation. All images are 416x550 to simulate a smartphone aspect ratio.
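
The placement step of such a compositing pipeline can be sketched without any imaging library. The example below only samples a scale, an orientation, and a position for one fruit and returns the resulting bounding box; it is not the code from the Roboflow tutorial. The scale range is hypothetical, and 416x550 is read here as width by height.

```python
import math
import random

CANVAS_W, CANVAS_H = 416, 550  # assumed to be width x height


def rotated_bbox_size(w, h, angle_deg):
    """Axis-aligned width and height of a w x h rectangle rotated by angle_deg."""
    a = math.radians(angle_deg)
    return (abs(w * math.cos(a)) + abs(h * math.sin(a)),
            abs(w * math.sin(a)) + abs(h * math.cos(a)))


def random_placement(fruit_w, fruit_h, rng):
    """Sample scale, orientation, and position so that the fruit stays inside the canvas."""
    scale = rng.uniform(0.5, 1.5)  # hypothetical scale range
    angle = rng.uniform(0, 360)
    bw, bh = rotated_bbox_size(fruit_w * scale, fruit_h * scale, angle)
    x = rng.uniform(0, CANVAS_W - bw)
    y = rng.uniform(0, CANVAS_H - bh)
    return {"scale": scale, "angle": angle, "bbox": (x, y, x + bw, y + bh)}


rng = random.Random(0)
placements = [random_placement(100, 100, rng) for _ in range(5)]
```

Because the bounding box follows directly from the sampled transformation, annotations come for free, which is the main appeal of this kind of synthetic data.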

To generate your own images, follow our tutorial or download the code.

Example: https://blog.roboflow.ai/content/images/2020/04/synthetic-fruit-examples.jpg
synthetic-dataset-1m-dalle3-high-quality-captions
huggingface.co
Updated May 3, 2024
Cite
Ben (2024). synthetic-dataset-1m-dalle3-high-quality-captions [Dataset]. https://huggingface.co/datasets/ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
May 3, 2024
Authors
Ben
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for Dalle3 1 Million+ High Quality Captions

Alt name: Human Preference Synthetic Dataset

Example grids for landscapes, cats, creatures, and fantasy are also available.

Description:

This dataset comprises AI-generated images sourced from various websites and individuals, primarily focusing on Dalle 3 content, along with contributions from other AI systems of sufficient quality like Stable Diffusion and Midjourney (MJ v5 and above). As users typically… See the full description on the dataset page: https://huggingface.co/datasets/ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions.
Multimodal ground truth datasets for abdominal medical image registration...
heidata.uni-heidelberg.de
zip
Updated Feb 23, 2023
Cite
Frank Zöllner (2023). Multimodal ground truth datasets for abdominal medical image registration [data] [Dataset]. http://doi.org/10.11588/DATA/ICSFUS
Explore at:
Available download formats: zip (3796777237), zip (27228993659), zip (2968034134)
Unique identifier
https://doi.org/10.11588/DATA/ICSFUS
Dataset updated
Feb 23, 2023
Dataset provided by
heiDATA
Authors
Frank Zöllner
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset funded by
BMBF
Description
Sparsity of annotated data is a major limitation in medical image processing tasks such as registration. Registered multimodal image data are essential for the diagnosis of medical conditions and the success of interventional medical procedures. To overcome the shortage of data, we present a method that allows the generation of annotated multimodal 4D datasets. We use a CycleGAN network architecture to generate multimodal synthetic data from the 4D extended cardiac–torso (XCAT) phantom and real patient data. Organ masks are provided by the XCAT phantom; therefore, the generated dataset can serve as ground truth for image segmentation and registration. Compared to real patient data, the synthetic data showed good agreement regarding the image voxel intensity distribution and the noise characteristics. The generated T1-weighted magnetic resonance imaging, computed tomography (CT), and cone beam CT images are inherently co-registered.
SyntheWorld: A Large-Scale Synthetic Dataset for Land Cover Mapping and...
data.niaid.nih.gov
Updated Sep 20, 2023
Cite
Hongruixuan Chen (2023). SyntheWorld: A Large-Scale Synthetic Dataset for Land Cover Mapping and Building Change Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8349018
Explore at:
Dataset updated
Sep 20, 2023
Dataset provided by
Hongruixuan Chen
Jian Song
Naoto Yokoya
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Paper accepted by WACV 2024


Overview

Synthetic datasets, recognized for their cost effectiveness, play a pivotal role in advancing computer vision tasks and techniques. However, when it comes to remote sensing image processing, the creation of synthetic datasets becomes challenging due to the demand for larger-scale and more diverse 3D models. This complexity is compounded by the difficulties associated with real remote sensing datasets, including limited data acquisition and high annotation costs, which amplify the need for high-quality synthetic alternatives. To address this, we present SyntheWorld, a synthetic dataset unparalleled in quality, diversity, and scale. It includes 40,000 images with submeter-level pixels and fine-grained land cover annotations of eight categories, and it also provides 40,000 bitemporal image pairs with building change annotations for the building change detection task. We conduct experiments on multiple benchmark remote sensing datasets to verify the effectiveness of SyntheWorld and to investigate the conditions under which our synthetic data yield advantages.

Description

This dataset has been designed for land cover mapping and building change detection tasks.

File Structure and Content:

1024.zip:

Contains images of size 1024x1024 with a GSD (Ground Sampling Distance) of 0.6-1m.

images and ss_mask folders: Used for the land cover mapping task.

images folder: Post-event images for building change detection.

small-pre-images: Images with a minor off-nadir angle difference compared to post-event images.

big-pre-images: Images with a large off-nadir angle difference compared to post-event images.

cd_mask: Ground truth for the building change detection task.

512-1.zip, 512-2.zip, 512-3.zip:

Contains images of size 512x512 with a GSD of 0.3-0.6m.

images and ss_mask folders: Used for the land cover mapping task.

images folder: Post-event images for building change detection.

pre-event folder: Images for the pre-event phase.

cd-mask: Ground truth for building change detection.
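
The image sizes and GSD ranges listed above determine the ground footprint of each tile, since the side length on the ground is the number of pixels multiplied by the GSD. The short sketch below works this out; `footprint_range_m` is a hypothetical helper name.

```python
def footprint_range_m(pixels, gsd_min_m, gsd_max_m):
    """Side length on the ground (in metres) covered by an image, for a range of GSD values."""
    return pixels * gsd_min_m, pixels * gsd_max_m


large_tiles = footprint_range_m(1024, 0.6, 1.0)  # images in 1024.zip
small_tiles = footprint_range_m(512, 0.3, 0.6)   # images in 512-1.zip to 512-3.zip
```

Under these figures a 1024x1024 image spans roughly 614 m to 1024 m per side, and a 512x512 image roughly 154 m to 307 m per side.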

Land Cover Mapping Class Grey Map:

class_grey = { "Bareland": 1, "Rangeland": 2, "Developed Space": 3, "Road": 4, "Tree": 5, "Water": 6, "Agriculture land": 7, "Building": 8, }
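
A mapping like this is typically inverted to turn mask values back into class names. The sketch below assumes that the pixels in the ss_mask images store these grey values directly, which the description implies but does not state; the 3x3 mask is a made-up toy example, and `class_histogram` is a hypothetical helper.

```python
from collections import Counter

class_grey = {
    "Bareland": 1, "Rangeland": 2, "Developed Space": 3, "Road": 4,
    "Tree": 5, "Water": 6, "Agriculture land": 7, "Building": 8,
}

# Invert the mapping: grey value -> class name.
grey_to_class = {value: name for name, value in class_grey.items()}


def class_histogram(mask):
    """Count pixels per class name in a 2D mask given as a list of rows."""
    counts = Counter(value for row in mask for value in row)
    return {grey_to_class.get(v, f"unknown({v})"): n for v, n in counts.items()}


toy_mask = [[8, 8, 4],
            [5, 5, 4],
            [6, 1, 1]]
histogram = class_histogram(toy_mask)
```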

Reference

@misc{song2023syntheworld,
  title={SyntheWorld: A Large-Scale Synthetic Dataset for Land Cover Mapping and Building Change Detection},
  author={Jian Song and Hongruixuan Chen and Naoto Yokoya},
  year={2023},
  eprint={2309.01907},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
SUFR ver1.3 2014 synthetic image datasets
academictorrents.com
bittorrent
Updated Oct 7, 2014
Cite
Qianli Liao and Joel Z Leibo (2014). SUFR ver1.3 2014 synthetic image datasets [Dataset]. https://academictorrents.com/details/032b2df1f6f0d75817b0f3af2af9bcdb3a415c37
Explore at:
Available download formats: bittorrent (5042925615)
Dataset updated
Oct 7, 2014
Dataset authored and provided by
Qianli Liao and Joel Z Leibo
License
No license specified, https://academictorrents.com/nolicensespecified
Description
##SUFR_ver1.3

Joel Z. Leibo, Qianli Liao, and Tomaso Poggio

##Contents:

1. SUFR-W
2. SUFR

##Description:

This package contains SUFR-W, a dataset of "in the wild" natural images of faces gathered from the internet. The protocol used to create the dataset is described in Leibo, Liao and Poggio (2014). It also contains the full set of SUFR synthetic datasets, called the "Subtasks of Unconstrained Face Recognition Challenge" in Leibo, Liao and Poggio (2014).

##Details:

##SUFR-W

** SUFR_in_the_wild/SUFR_in_the_wild_info.mat
matlab struct "info" contains two fields:
- id : the ID of the person depicted by each image
- name : the name of the person depicted by each image

** SUFR_in_the_wild/SUFR_in_the_wild_info.txt
Contains the same information as SUFR_in_the_wild_info.mat, but in plain text

** SUFR_in_the_wild/splits_10_folds.mat
i. matlab struct "sufr_train_val_test_nam

