Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset is composed of 121 pairs of correlated images. Each pair contains one image of a copper ore sample acquired through reflected light microscopy (RGB, 24-bit), and the corresponding binary reference image (8-bit), in which the pixels are labeled as belonging to one of two classes: ore (0) or embedding resin (255).
The sample came from a copper ore from Yauri Cusco (Peru) with a complex mineralogy, mainly composed of sulfides, oxides, silicates, and native copper. It was classified by size. The fraction +74-100 μm was cold mounted with epoxy resin and subsequently ground and polished.
Correlative microscopy was employed for image acquisition: 121 fields were imaged on a reflected light microscope with a 20× (NA 0.40) objective lens and on a scanning electron microscope (SEM). The image pairs were then registered, resulting in images of 1017×753 pixels with a resolution of 0.53 µm/pixel. Some images (No. 2, 3, 24, 25, 46, 47, 69, 91, and 113) are slightly smaller because they were cropped during registration to correct co-localization errors of a few pixels. Finally, the SEM images were thresholded to generate the reference images.
Further description of this sample and its imaging procedure can be found in the work by Gomes and Paciornik (2012).
This dataset was created for developing and testing deep learning models on semantic segmentation tasks. The paper of Filippo et al. (2021) presented a variant of the DeepLabv3+ model (Chen et al., 2018) that reached mean values of 90.56% and 92.12% for overall accuracy and F1 score, respectively, for 5 rounds of experiments (training and testing), each with a different, random initialization of network weights.
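A minimal loading sketch (the pair file names below are hypothetical; adapt them to the actual archive layout) that reads one image/reference pair and converts the binary reference to class indices:

import numpy as np
from PIL import Image

# Hypothetical file names; the reference image encodes ore as 0 and resin as 255.
rgb = np.array(Image.open("pair_001_rgb.png"))                 # (H, W, 3), 24-bit RGB
ref = np.array(Image.open("pair_001_ref.png").convert("L"))    # (H, W), 8-bit binary
classes = (ref == 255).astype(np.uint8)                        # 0 = ore, 1 = embedding resin
print(rgb.shape, classes.shape, np.unique(classes))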
For further questions and suggestions, please do not hesitate to contact us.
Contact email: ogomes@gmail.com
If you use this dataset in your own work, please cite this DOI: 10.5281/zenodo.5020566
Please also cite this paper, which provides additional details about the dataset:
Michel Pedro Filippo, Otávio da Fonseca Martins Gomes, Gilson Alexandre Ostwald Pedro da Costa, Guilherme Lucio Abelha Mota. Deep learning semantic segmentation of opaque and non-opaque minerals from epoxy resin in reflected light microscopy images. Minerals Engineering, Volume 170, 2021, 107007, https://doi.org/10.1016/j.mineng.2021.107007.
The visuAAL Skin Segmentation Dataset contains 46,775 high-quality images divided into a training set with 45,623 images and a validation set with 1,152 images. Skin areas have been obtained automatically from the FashionPedia garment dataset. The process used to extract the skin areas is explained in detail in the paper 'From Garment to Skin: The visuAAL Skin Segmentation Dataset'.

If you use the visuAAL Skin Segmentation Dataset, please cite: https://doi.org/10.5281/zenodo.6973396 and https://doi.org/10.1007/978-3-031-13321-3_6

How to use: Download the FashionPedia dataset from https://fashionpedia.github.io/home/Fashionpedia_download.html and download the visuAAL Skin Segmentation Dataset. The dataset consists of two folders, train_masks and val_masks, corresponding to the training and validation sets of the original FashionPedia dataset. After extracting the images from FashionPedia, the original image for each mask in the visuAAL Skin Segmentation Dataset can be found under the same name (file_name in the annotations file). A sample image record in the FashionPedia dataset is: {'id': 12305, 'width': 680, 'height': 1024, 'file_name': '064c8022b32931e787260d81ed5aafe8.jpg', 'license': 4, 'time_captured': 'March-August, 2018', 'original_url': 'https://farm2.staticflickr.com/1936/8607950470_9d9d76ced7_o.jpg', 'isstatic': 1, 'kaggle_id': '064c8022b32931e787260d81ed5aafe8'}

NOTE: Not all images in the FashionPedia dataset have a corresponding skin mask in the visuAAL Skin Segmentation Dataset, as some images contain only garment parts and no people; these images were removed when creating the visuAAL Skin Segmentation Dataset. However, every instance in the visuAAL Skin Segmentation Dataset has a corresponding match in the FashionPedia dataset.
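A minimal sketch of the matching step described above, assuming the FashionPedia annotation JSON has been downloaded locally (the annotation file and image folder names below are assumptions, not part of this dataset):

import json, os

with open("instances_attributes_train2020.json") as f:    # assumed FashionPedia annotation file
    fashionpedia = json.load(f)

for img in fashionpedia["images"]:
    mask_path = os.path.join("train_masks", img["file_name"])
    if os.path.exists(mask_path):                          # not every FashionPedia image has a skin mask
        image_path = os.path.join("fashionpedia_images", img["file_name"])  # assumed image folder
        print(image_path, "->", mask_path)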
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset is composed of 81 pairs of correlated images. Each pair contains one image of an iron ore sample acquired through reflected light microscopy (RGB, 24-bit), and the corresponding binary reference image (8-bit), in which the pixels are labeled as belonging to one of two classes: ore (0) or embedding resin (255).
The sample came from an itabiritic iron ore concentrate from Quadrilátero Ferrífero (Brazil) mainly composed of hematite and quartz, with little magnetite and goethite. It was classified by size and concentrated with a dense liquid. Then, the fraction -149+105 μm with density greater than 3.2 was cold mounted with epoxy resin and subsequently ground and polished.
Correlative microscopy was employed for image acquisition: 81 fields were imaged on a reflected light microscope with a 10× (NA 0.20) objective lens and on a scanning electron microscope (SEM). The image pairs were then registered, resulting in images of 999×756 pixels with a resolution of 1.05 µm/pixel. Finally, the SEM images were thresholded to generate the reference images.
Further description of this sample and its imaging procedure can be found in the work by Gomes and Paciornik (2012).
This dataset was created for developing and testing deep learning models on semantic segmentation tasks. The paper of Filippo et al. (2021) presented a variant of the DeepLabv3+ model that reached mean values of 91.43% and 93.13% for overall accuracy and F1 score, respectively, for 5 rounds of experiments (training and testing), each with a different, random initialization of network weights.
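As an illustration of the reported metrics, a small sketch that computes overall (pixel) accuracy and F1 score for a binary segmentation result; treating resin (255) as the positive class is an assumption, not necessarily the convention used by Filippo et al. (2021):

import numpy as np

def accuracy_and_f1(pred, ref, positive=255):
    p = (pred == positive)
    r = (ref == positive)
    acc = float(np.mean(pred == ref))                      # overall pixel accuracy
    tp = np.sum(p & r)
    fp = np.sum(p & ~r)
    fn = np.sum(~p & r)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    return acc, f1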
For further questions and suggestions, please do not hesitate to contact us.
Contact email: ogomes@gmail.com
If you use this dataset in your own work, please cite this DOI: 10.5281/zenodo.5014700
Please also cite this paper, which provides additional details about the dataset:
Michel Pedro Filippo, Otávio da Fonseca Martins Gomes, Gilson Alexandre Ostwald Pedro da Costa, Guilherme Lucio Abelha Mota. Deep learning semantic segmentation of opaque and non-opaque minerals from epoxy resin in reflected light microscopy images. Minerals Engineering, Volume 170, 2021, 107007, https://doi.org/10.1016/j.mineng.2021.107007.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset contains a collection of annotated ultrasound images of the liver, designed to aid in the development of computer vision models for liver analysis, segmentation, and disease detection. The annotations include outlines of the liver and liver mass regions, as well as classifications into benign, malignant, and normal cases.
Creators: Xu Yiming, Zheng Bowen, Liu Xiaohong, Wu Tao, Ju Jinxiu, Wang Shijie, Lian Yufan, Zhang Hongjun, Liang Tong, Sang Ye, Jiang Rui, Wang Guangyu, Ren Jie, Chen Ting
Published: November 2, 2022 Version: v1 DOI: 10.5281/zenodo.7272660
This dataset provides ultrasound images of the liver with detailed annotations. The annotations highlight the liver itself and any liver mass regions present. The images are categorized into three classes: benign, malignant, and normal.
The dataset is organized into three zip files:
The ultrasound images have been annotated to show the outlines of the liver and of any liver mass regions present.
These annotations make the dataset suitable for tasks such as segmentation of the liver and liver masses, as well as classification of liver conditions.
This dataset can be valuable for a variety of applications, including:
This dataset is subject to copyright. Any use of the data must include appropriate acknowledgement and credit. Please contact the authors of the published data and cite the publication and the provided URL.
Citation:
Xu Yiming, Zheng Bowen, Liu Xiaohong, Wu Tao, Ju Jinxiu, Wang Shijie, Lian Yufan, Zhang Hongjun, Liang Tong, Sang Ye, Jiang Rui, Wang Guangyu, Ren Jie, & Chen Ting. (2022). Annotated Ultrasound Liver images [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7272660
APA Style Citation:
Xu, Y., Zheng, B., Liu, X., Wu, T., Ju, J., Wang, S., Lian, Y., Zhang, H., Liang, T., Sang, Y., Jiang, R., Wang, G., Ren, J., & Chen, T. (2022). Annotated Ultrasound Liver images [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7272660
Creative Commons Attribution 4.0 International
We hope this dataset is helpful for your research and projects!
CODEBRIM: COncrete DEfect BRidge IMage Dataset for multi-target multi-class concrete defect classification in computer vision and machine learning.
Dataset as presented and detailed in our CVPR 2019 publication: http://openaccess.thecvf.com/content_CVPR_2019/html/Mundt_Meta-Learning_Convolutional_Neural_Architectures_for_Multi-Target_Concrete_Defect_Classification_With_CVPR_2019_paper.html or https://arxiv.org/abs/1904.08486. If you make use of the dataset, please cite it as follows:
"Martin Mundt, Sagnik Majumder, Sreenivas Murali, Panagiotis Panetsos, Visvanathan Ramesh. Meta-learning Convolutional Neural Architectures for Multi-target Concrete Defect Classification with the COncrete DEfect BRidge IMage Dataset. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019"
We offer a supplementary GitHub repository with code to reproduce the paper and data loaders: https://github.com/ccc-frankfurt/meta-learning-CODEBRIM
For ease of use, we provide the dataset in several versions.
Files contained:
* CODEBRIM_original_images: contains the original full-resolution images and bounding box annotations
* CODEBRIM_cropped_dataset: contains the extracted crops/patches with corresponding class labels from the bounding boxes
* CODEBRIM_classification_dataset: contains the cropped patches with corresponding class labels split into training, validation and test sets for machine learning
* CODEBRIM_classification_balanced_dataset: similar to "CODEBRIM_classification_dataset", but with training images replicated exactly to balance the dataset, in order to reproduce the results obtained in the paper.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
This data set contains 3820 landmarks that were extracted from 168 HiRISE images. The landmarks were detected in HiRISE browse images. For each landmark, we cropped a square bounding box that includes the full extent of the landmark plus a 30-pixel margin to the left, right, top, and bottom. Each cropped image was then resized to 227x227 pixels.
Contents:
map-proj/: Directory containing individual cropped landmark images
labels-map-proj.txt: Class labels (ids) for each landmark image
landmark_mp.py: Python dictionary that maps class ids to semantic names
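A small loading sketch, assuming labels-map-proj.txt stores one "filename class_id" pair per line (the exact format is not spelled out above, so check the file first):

labels = {}
with open("labels-map-proj.txt") as f:
    for line in f:
        fname, cid = line.split()
        labels[fname] = int(cid)
# The class id -> semantic name mapping can be taken from the dictionary defined in landmark_mp.py.
print(len(labels), "labelled landmark images")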
Attribution:
If you use this data set in your own work, please cite this DOI: 10.5281/zenodo.1048301
Please also cite this paper, which provides additional details about the data set.
Kiri L. Wagstaff, You Lu, Alice Stanboli, Kevin Grimes, Thamme Gowda, and Jordan Padams. "Deep Mars: CNN Classification of Mars Imagery for the PDS Imaging Atlas." Proceedings of the Thirtieth Annual Conference on Innovative Applications of Artificial Intelligence, 2018.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The coin image dataset is a dataset of 60 classes of Roman Republican coins. Each class is represented by three coin images of the reverse side, acquired at the Coin Cabinet of the Museum of Fine Arts in Vienna, Austria.
Technical Details
The image filenames have the following syntax: class[classid]_image[1-3].png. The dataset also contains a CSV file "classes.csv", which maps the class IDs to the reference numbers defined by Crawford's standard reference book [2].
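A short sketch that recovers the class ID from a filename and looks it up in classes.csv; the example filename and the CSV column order are assumptions:

import csv, re

crawford = {}
with open("classes.csv", newline="") as f:
    for row in csv.reader(f):
        crawford[row[0]] = row[1]          # assumed layout: class ID, Crawford reference number

m = re.match(r"class(\d+)_image([1-3])\.png", "class17_image2.png")   # hypothetical filename
class_id, image_no = m.group(1), m.group(2)
print(class_id, image_no, crawford.get(class_id, "unknown"))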
[1] Zambanini S., Kampel M. "Coarse-to-Fine Correspondence Search for Classifying Ancient Coins", 2nd ACCV Workshop on e-Heritage, pp. 25-36, Daejeon, South Korea, November 2012.
[2] Crawford, M.H.: "Roman Republican Coinage", 2 vols., Cambridge University Press, 1974.
Sebastian Zambanini. (2014). Coin Image Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4454549
This is a multimodal dataset used in the paper "On the Role of Images for Analyzing Claims in Social Media", accepted at CLEOPATRA-2021 (2nd International Workshop on Cross-lingual Event-centric Open Analytics), co-located with The Web Conference 2021. The four datasets are curated for two different tasks that broadly come under fake news detection. Originally, the datasets were released as part of challenges or papers for text-based NLP tasks and are extended here with corresponding images.
1. clef_en and clef_ar are English and Arabic Twitter datasets for claim check-worthiness detection released in CLEF CheckThat! 2020 by Barrón-Cedeño et al. [1].
2. lesa is an English Twitter dataset for claim detection released by Gupta et al. [2].
3. mediaeval is an English Twitter dataset for conspiracy detection released in the MediaEval 2020 Workshop by Pogorelov et al. [3].
Dataset details such as the data curation and annotation process can be found in the cited papers. The datasets released here with corresponding images are smaller than the original text-based ones. The data statistics (number of tweets) are as follows:
1. clef_en: 281
2. clef_ar: 2571
3. lesa: 1395
4. mediaeval: 1724
Each folder has two sub-folders and a JSON file data.json that contains the crawled tweets. The two sub-folders are:
1. images: contains the crawled images, each named with the same tweet-id as in data.json.
2. splits: contains the 5-fold splits used for training and evaluation in our paper. Each file in this folder is a CSV with two columns
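A heavily hedged sketch for pairing tweets with their images, assuming data.json is a list of tweet objects with an 'id' field and that images are stored as JPEGs named after the tweet-id (neither is guaranteed by the description above):

import json, os

with open("lesa/data.json") as f:
    tweets = json.load(f)

for tweet in tweets:
    img_path = os.path.join("lesa", "images", f"{tweet['id']}.jpg")
    if os.path.exists(img_path):
        print(tweet["id"], "->", img_path)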
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This database contains 4976 planetary images of boulder fields located on Earth, Mars, and the Moon. The data were collected during the BOULDERING Marie Skłodowska-Curie Global Fellowship between October 2021 and 2024. The data are already split into train, validation, and test sets, but feel free to re-organize the labels at your convenience. For each image, all of the boulder outlines within the image were carefully mapped in QGIS. More information about the labelling procedure can be found in the following manuscript (https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023JE008013).

This dataset differs from the previous dataset included along with the manuscript (https://zenodo.org/records/8171052), as it contains more mapped images, especially of boulder populations around young impact structures on the Moon (cold spots). In addition, the boulder outlines were pre-processed so that they can be ingested directly in YOLOv8. A description of the contents is given in the README.txt file (along with instructions on how to load the custom dataset in Detectron2 and YOLO). Most of the other files are self-explanatory. Please see the previous dataset or the manuscript for more information.

If you want more information about a specific lunar or martian planetary image, the ID of the image is still available in the file name. Use this ID to find more information (e.g., for M121118602_00875_image.png, the ID M121118602 can be used on https://pilot.wr.usgs.gov/). I will also upload the raw data from which this pre-processed dataset was generated (see https://zenodo.org/records/14250970).

Thanks to this database, you can easily train Detectron2 Mask R-CNN or YOLO instance segmentation models to automatically detect boulders.

How to cite: Please refer to the "how to cite" section of the readme file of https://github.com/astroNils/YOLOv8-BeyondEarth.

Structure:
.
└── boulder2024/
    ├── jupyter-notebooks/
    │   └── REGISTERING_BOULDER_DATASET_IN_DETECTRON2.ipynb
    ├── test/
    │   └── images/
    │       ├──
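A minimal training sketch with the ultralytics package, assuming a dataset YAML (image paths plus the boulder class) has been prepared as described in the README.txt; the YAML file name here is hypothetical:

from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")                               # small pretrained segmentation model
model.train(data="boulder2024.yaml", epochs=100, imgsz=640)  # hypothetical dataset YAML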
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This dataset is an update to Supplemental Material Table S1 from the paper: Yunoki T, Echeverria AR, Cholima RB, Miranda Ch. G, Moreno FA (2025). Ichthyofauna (Osteichthyes, Actinopterygii) from tributaries of the Beni and Mamoré rivers in the Llanos de Moxos wetland of the Bolivian Amazon. Check List 21: 318–346.
Specimen images associated with the occurrence records have been deposited in Zenodo across several archived sets. Each image reference in the associatedMedia field includes:
A direct URL linking to the individual image file hosted on Zenodo (e.g., https://zenodo.org/records/.../files/image.JPG)
The DOI of the complete dataset in which the image is archived (e.g., https://doi.org/...)
This format ensures both persistent citation via dataset DOIs and straightforward access to individual images. The current version improves traceability between image files and their corresponding occurrence records.
Image sets are available at the following DOIs:
https://doi.org/10.5281/zenodo.15748915
https://doi.org/10.5281/zenodo.15749855
https://doi.org/10.5281/zenodo.15750205
https://doi.org/10.5281/zenodo.15750430
https://doi.org/10.5281/zenodo.15750726
https://doi.org/10.5281/zenodo.15754796
https://doi.org/10.5281/zenodo.15755252
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
Description
The Klarna Product Page Dataset is a dataset of publicly available pages corresponding to products sold online on various e-commerce websites. The dataset contains offline snapshots of 51,701 product pages collected from 8,175 distinct merchants across 8 different markets (US, GB, SE, NL, FI, NO, DE, AT) between 2018 and 2019. On each page, analysts labelled 5 elements of interest: the price of the product, its image, its name and the add-to-cart and go-to-cart buttons (if found). These labels are present in the HTML code as an attribute called klarna-ai-label taking one of the values: Price, Name, Main picture, Add to cart and Cart.
The snapshots are available in 3 formats: as MHTML files (~24GB), as WebTraversalLibrary (WTL) snapshots (~7.4GB), and as screenshots (~8.9GB). The MHTML format is the least lossy: a browser can render these pages, although any JavaScript on the page is lost. The WTL snapshots are produced by loading the MHTML pages into a Chromium-based browser. To keep the WTL dataset compact, the screenshots of the rendered MHTML are provided separately; here we provide the HTML of the rendered DOM tree and additional page and element metadata with rendering information (bounding boxes of elements, font sizes, etc.). The folder structure of the screenshot dataset is identical to that of the WTL dataset and can be used to complete the WTL snapshots with image information. For convenience, the datasets are provided with a train/test split in which no merchant in the test set is present in the training set.
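A small sketch for pulling the labelled elements out of a rendered snapshot with BeautifulSoup, using the klarna-ai-label attribute described above (the snapshot file name is hypothetical):

from bs4 import BeautifulSoup

with open("snapshot.html", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

for el in soup.find_all(attrs={"klarna-ai-label": True}):
    print(el["klarna-ai-label"], el.name, el.get_text(strip=True)[:60])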
Corresponding Publication
For more information about the contents of the datasets (statistics, etc.), please refer to the TMLR paper cited below.
GitHub Repository
The code needed to re-run the experiments in the publication accompanying the dataset can be accessed here.
Citing
If you found this dataset useful in your research, please cite the paper as follows:
@article{hotti2024the, title={The Klarna Product Page Dataset: Web Element Nomination with Graph Neural Networks and Large Language Models}, author={Alexandra Hotti and Riccardo Sven Risuleo and Stefan Magureanu and Aref Moradi and Jens Lagergren}, journal={Transactions on Machine Learning Research}, issn={2835-8856}, year={2024}, url={https://openreview.net/forum?id=zz6FesdDbB}, note={} }
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This repository contains the code and data underlying the publication "Computational 3D resolution enhancement for optical coherence tomography with a narrowband visible light source" in Biomedical Optics Express 14, 3532-3554 (2023) (doi.org/10.1364/BOE.487345).
The reader is free to use the scripts and data in this repository, as long as the manuscript is correctly cited in their work. For further questions, please contact the corresponding author.
Description of the code and datasets
Table 1 describes all the Matlab and Python scripts in this repository. Table 2 describes the datasets. The input datasets are the phase-corrected datasets, as the raw data is large and phase correction using a coverslip as reference is rather straightforward. Processed datasets are also added to the repository to allow running only a limited number of scripts, or to obtain, for example, the aberration-corrected data without the need to use Python. Note that the simulation input data (input_simulations_pointscatters_SLDshape_98zf_noise75.mat) is generated with random noise, so if it is overwritten the results may vary slightly. Likewise, the aberration correction is done with random apertures, so the processed aberration-corrected data (exp_pointscat_image_MIAA_ISAM_CAO.mat and exp_leaf_image_MIAA_ISAM_CAO.mat) will also change slightly if the aberration correction script is run anew. The current processed datasets were used as the basis for the figures in the publication. For details on the implementation, we refer to the publication.
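For readers working from Python, a quick way to inspect one of the phase-corrected .mat inputs (scipy handles MAT-files up to v7.2; if the files are stored as v7.3 they are HDF5 and need h5py instead, which has not been verified here):

from scipy.io import loadmat

data = loadmat("input_leafdisc_phasecorrected.mat")
for key, value in data.items():
    if not key.startswith("__"):                 # skip the MAT-file header entries
        print(key, getattr(value, "shape", type(value)))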
Table 1: The Matlab and Python scripts with their description
Script name
Description
MIAA_ISAM_processing.m
This script performs the DFT, RFIAA, and MIAA processing of the phase-corrected data that can be loaded from the datasets. Afterwards, it applies ISAM to the DFT and MIAA data and plots the results in a figure (via the scripts plot_figure3, plot_figure5 and plot_simulationdatafigure).
resolution_analysis_figure4.m
This script loads the point-scatterer data (absolute amplitude), locates the point scatterers and fits them to obtain the resolution data. Finally, it plots figure 4 of the publication.
fiaa_oct_c1.m, oct_iaa_c1.m, rec_fiaa_oct_c1.m, rfiaa_oct_c1.m
These four functions are used to apply fast IAA and MIAA. See script MIAA_ISAM_processing.m for their usage.
viridis.m, morgenstemning.m
These scripts define the colormaps for the figures.
plot_figure3.m, plot_figure5.m, plot_simulationdatafigure.m
These scripts are used to plot the figures 3 and 5 and a figure with simulation data. These scripts are executed at the end of script MIAA_ISAM_processing.m.
Python script: computational_adaptive_optics_script.py
Python script that applies computational adaptive optics to obtain the data for figure 6 of the manuscript.
Python script: zernike_functions2.py
Python script that gives the values and Cartesian derivatives of the Zernike polynomials.
figure6_ComputationalAdaptiveOptics.m
Script that loads the CAO data that was saved in Python, analyzes the resolution, and plots figure 6.
Python script: OCTsimulations_3D_script2.py
Python script that simulates OCT data, adds noise, and saves it as a .mat file for use in the Matlab scripts above.
Python script: OCTsimulations2.py
Module that contains a Python class that can be used to simulate 3D OCT datasets based on a Gaussian beam.
Matlab toolbox DIPimage 2.9.zip
DIPimage is used in the scripts. The toolbox can be downloaded online, or this zip can be used.
The datasets in this Zenodo repository
Name
Description
input_leafdisc_phasecorrected.mat
Phase corrected input image of the leaf disc (used in figure 5).
input_TiO2gelatin_004_phasecorrected.mat
Phase corrected input image of the TiO2 in gelatin sample.
input_simulations_pointscatters_SLDshape_98zf_noise75.mat
Input simulation data that, once processed, is used in figure 4.
exp_pointscat_image_DFT.mat
exp_pointscat_image_DFT_ISAM.mat
exp_pointscat_image_RFIAA.mat
exp_pointscat_image_MIAA_ISAM.mat
exp_pointscat_image_MIAA_ISAM_CAO.mat
Processed experimental amplitude data for the TiO2 point scattering sample with respectively DFT, DFT+ISAM, RFIAA, MIAA+ISAM and MIAA+ISAM+CAO. These datasets are used for fitting in figure 4 (except for CAO), and MIAA_ISAM and MIAA_ISAM_CAO are used for figure 6.
simu_pointscat_image_DFT.mat
simu_pointscat_image_RFIAA.mat
simu_pointscat_image_DFT_ISAM.mat
simu_pointscat_image_MIAA_ISAM.mat
Processed amplitude data from the simulation dataset, which is used in the script for figure 4 for the resolution analysis.
exp_leaf_image_MIAA_ISAM.mat
exp_leaf_image_MIAA_ISAM_CAO.mat
Processed amplitude data from the leaf sample, with and without aberration correction which is used to produce figure 6.
exp_leaf_zernike_coefficients_CAO_normal_wmaf.mat
exp_pointscat_zernike_coefficients_CAO_normal_wmaf.mat
Estimated Zernike coefficients and the weighted moving average of them that is used for the computational aberration correction. Some of this data is plotted in Figure 6 of the manuscript.
input_zernike_modes.mat
The reference Zernike modes corresponding to the data that is loaded to give the modes the proper name.
exp_pointscat_MIAA_ISAM_complex.mat
exp_leaf_MIAA_ISAM_complex
Complex MIAA+ISAM processed data that is used as input for the computational aberration correction.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Aerial Water Buoys Dataset: Over the past few years, a plethora of advancements in Unmanned Aerial Vehicle (UAV) technologies have made possible advanced UAV-based search and rescue operations with transformative impact on the outcome of critical life-saving missions. This dataset aims to help with the challenging task of multi-castaway tracking and following using a single UAV. Due to the difficulty and data-protection issues of capturing footage of people in the sea, we captured a dataset of buoys in order to conduct experiments on multi-castaway tracking and following. A paper on the technical details and experiments of multi-castaway tracking and following using this dataset will be published soon.

The dataset consists of top-view images of buoys taken from various altitudes on the coasts of Larnaca and Protaras in Cyprus. Images were captured at different altitudes in order to challenge object detectors to detect smaller objects, since a UAV that needs to track multiple targets must fly at a higher altitude. There is only one annotated class on all images, labeled 'buoy'. Additionally, all annotations were converted into VOC and COCO formats for training in numerous frameworks.

The dataset consists of the following images and detection objects (buoys):
Subset       Images   Buoys
Training     10814    14811
Validation   1350     1865
Testing      1352     1827

It is advised to further enhance the dataset by probabilistically applying random augmentations to each image before adding it to the batch for training. Possible transformations include geometric ones (rotations, translations, horizontal-axis mirroring, cropping, and zooming) as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing); a sketch of such a pipeline is given below.

NOTE: If you use this dataset in your research/publication, please cite us as follows: Antreas Anastasiou, Rafael Makrigiorgis, & Panayiotis Kolios. (2022). Aerial Water Buoys Dataset (1.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7288444
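A sketch of such an augmentation pipeline using the albumentations package with VOC-style boxes; the particular transforms and parameter values are illustrative, not the ones used by the authors:

import albumentations as A

augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.2, rotate_limit=15, p=0.5),
        A.RandomBrightnessContrast(p=0.3),
        A.HueSaturationValue(p=0.3),
        A.Blur(blur_limit=3, p=0.2),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
# augmented = augment(image=image, bboxes=bboxes, labels=labels)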
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Seatizen Atlas image dataset
This repository contains the resources and tools for accessing and utilizing the annotated images within the Seatizen Atlas dataset, as described in the paper Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery.
Download the Dataset
This annotated dataset is part of a bigger dataset composed of labeled and unlabeled images. To access information about the whole dataset, please visit the Zenodo repository and follow the download instructions provided.
Scientific Publication
If you use this dataset in your research, please consider citing the associated paper:
@article{Contini2025, author = {Matteo Contini and Victor Illien and Mohan Julien and Mervyn Ravitchandirane and Victor Russias and Arthur Lazennec and Thomas Chevrier and Cam Ly Rintz and Léanne Carpentier and Pierre Gogendeau and César Leblanc and Serge Bernard and Alexandre Boyer and Justine Talpaert Daudon and Sylvain Poulain and Julien Barde and Alexis Joly and Sylvain Bonhommeau}, doi = {10.1038/s41597-024-04267-z}, issn = {2052-4463}, issue = {1}, journal = {Scientific Data}, pages = {67}, title = {Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery}, volume = {12}, url = {https://doi.org/10.1038/s41597-024-04267-z}, year = {2025},}
For detailed information about the dataset and experimental results, please refer to the previous paper.
Overview
The Seatizen Atlas dataset includes 14,492 multilabel and 1,200 instance segmentation annotated images. These images are useful for training and evaluating AI models for marine biodiversity research. The annotations follow standards from the Global Coral Reef Monitoring Network (GCRMN).
Annotation Details
Annotation Types:
Multilabel Convention: Identifies all observed classes in an image.
Instance Segmentation: Highlights contours of each instance for each class.
List of Classes
Algae
Algal Assemblage
Algae Halimeda
Algae Coralline
Algae Turf
Coral
Acropora Branching
Acropora Digitate
Acropora Submassive
Acropora Tabular
Bleached Coral
Dead Coral
Gorgonian
Living Coral
Non-acropora Millepora
Non-acropora Branching
Non-acropora Encrusting
Non-acropora Foliose
Non-acropora Massive
Non-acropora Coral Free
Non-acropora Submassive
Seagrass
Syringodium Isoetifolium
Thalassodendron Ciliatum
Habitat
Rock
Rubble
Sand
Other Organisms
Thorny Starfish
Sea Anemone
Ascidians
Giant Clam
Fish
Other Starfish
Sea Cucumber
Sea Urchin
Sponges
Turtle
Custom Classes
Blurred
Homo Sapiens
Human Object
Trample
Useless
Waste
These classes reflect the biodiversity and variety of habitats captured in the Seatizen Atlas dataset, providing valuable resources for training AI models in marine biodiversity research.
Usage Notes
The annotated images are available for non-commercial use. Users are requested to cite the related publication in any resulting works. A GitHub repository has been set up to facilitate data reuse and sharing: GitHub Repository.
Code Availability
All related codes for data processing, downloading, and AI model training can be found in the following GitHub repositories:
Plancha Workflow
Zenodo Tools
DinoVdeau Model
Acknowledgements
This dataset and associated research have been supported by several organizations, including the Seychelles Islands Foundation, Réserve Naturelle Marine de la Réunion, and Monaco Explorations, among others.
For any questions or collaboration inquiries, please contact seatizen.ifremer@gmail.com.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
The RT-BENE dataset is licensed under CC BY-NC-SA 4.0. Commercial usage is not permitted. If you use our blink estimation code or dataset, please cite the relevant paper:

@inproceedings{CortaceroICCV2019W, author={Kevin Cortacero and Tobias Fischer and Yiannis Demiris}, booktitle = {Proceedings of the IEEE International Conference on Computer Vision Workshops}, title = {RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments}, year = {2019}, }

More information can be found on the Personal Robotics Lab's website: https://www.imperial.ac.uk/personal-robotics/software/.

Overview: We manually annotated images contained in the "noglasses" part of the RT-GENE dataset with blink annotations. This dataset contains the extracted eye image patches and the associated annotations. In particular, rt_bene_subjects.csv is an overview CSV file with the following columns: id, subject csv file, path to left eye images, path to right eye images, training/validation/discarded category, and fold-id for the 3-fold evaluation. Each individual "blink_labels" CSV file (s000_blink_labels.csv to s016_blink_labels.csv) contains two columns: image file name and label, where 0.0 is the annotation for open eyes, 1.0 for blinks, and 0.5 for annotator disagreement (these images are discarded).

Associated code: Please see the code repository for code to train and evaluate a deep neural network based on the RT-BENE dataset. The code repository also links to pre-trained models and code for real-time inference.
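A small sketch for reading one subject's blink labels with pandas and dropping the annotator-disagreement rows; the column names are assumptions, and the CSVs may or may not ship with a header row:

import pandas as pd

df = pd.read_csv("s000_blink_labels.csv", names=["image", "label"])
df = df[df["label"] != 0.5]                      # discard annotator disagreement
print((df["label"] == 1.0).sum(), "blink frames,", (df["label"] == 0.0).sum(), "open-eye frames")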
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
The Dataset
A collection of images of parking lots for vehicle detection, segmentation, and counting. Each image is manually labeled with pixel-wise masks and bounding boxes localizing vehicle instances. The dataset includes about 250 images depicting several parking areas and covering most of the problematic situations found in a real scenario: seven different cameras capture the images under various weather conditions and viewing angles. Another challenging aspect is the presence of partial occlusion patterns in many scenes, such as obstacles (trees, lampposts, other cars) and shadowed cars. A further peculiarity is that images are taken both during the day and at night, showing utterly different lighting conditions.
We suggest a three-way split (train-validation-test). The train split contains images taken during the daytime while validation and test splits include images gathered at night. In line with these splits we provide some annotation files:
train_coco_annotations.json and val_coco_annotations.json --> JSON files that follow the golden standard MS COCO data format (for more info see https://cocodataset.org/#format-data) for the training and the validation splits, respectively. All the vehicles are labeled with the COCO category 'car'. They are suitable for vehicle detection and instance segmentation.
train_dot_annotations.csv and val_dot_annotations.csv --> CSV files that contain xy coordinates of the centroids of the vehicles for the training and the validation splits, respectively. Dot annotation is commonly used for the visual counting task.
ground_truth_test_counting.csv --> CSV file that contains the number of vehicles present in each image. It is only suitable for testing vehicle counting solutions.
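A minimal sketch for loading the annotation files with the standard library only (no pycocotools needed); the dot-annotation column layout is assumed:

import csv, json
from collections import Counter

with open("train_coco_annotations.json") as f:
    coco = json.load(f)
per_image = Counter(a["image_id"] for a in coco["annotations"])
print(len(coco["images"]), "training images,", sum(per_image.values()), "vehicle instances")

with open("train_dot_annotations.csv", newline="") as f:
    dots = list(csv.reader(f))                    # xy centroids of vehicles, exact columns assumed
print(len(dots), "dot annotations")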
Citing our work
If you found this dataset useful, please cite the following paper
@inproceedings{Ciampi_visapp_2021, doi = {10.5220/0010303401850195}, url = {https://doi.org/10.5220%2F0010303401850195}, year = 2021, publisher = {{SCITEPRESS} - Science and Technology Publications}, author = {Luca Ciampi and Carlos Santiago and Joao Costeira and Claudio Gennaro and Giuseppe Amato}, title = {Domain Adaptation for Traffic Density Estimation}, booktitle = {Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications} }
and this Zenodo Dataset
@dataset{ciampi_ndispark_6560823, author = {Luca Ciampi and Carlos Santiago and Joao Costeira and Claudio Gennaro and Giuseppe Amato}, title = {{Night and Day Instance Segmented Park (NDISPark) Dataset: a Collection of Images taken by Day and by Night for Vehicle Detection, Segmentation and Counting in Parking Areas}}, month = may, year = 2022, publisher = {Zenodo}, version = {1.0.0}, doi = {10.5281/zenodo.6560823}, url = {https://doi.org/10.5281/zenodo.6560823} }
Contact Information
If you would like further information about the dataset or if you experience any issues downloading files, please contact us at luca.ciampi@isti.cnr.it
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The possibility of carrying out a meaningful forensics analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation and even the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we share a new dataset composed of a large number of synthetic and natural printed face images. Such a dataset can be used with several computer vision and machine learning approaches for two tasks: pinpointing the printer source of a document and detecting printed pictures generated by deep fakes. When using the dataset, don't forget to cite our paper: @Article{jimaging7030050, AUTHOR = {Ferreira, Anselmo and Nowroozi, Ehsan and Barni, Mauro}, TITLE = {VIPPrint: Validating Synthetic Image Detection and Source Linking Methods on a Large Scale Dataset of Printed Documents}, JOURNAL = {Journal of Imaging}, VOLUME = {7}, YEAR = {2021}, NUMBER = {3}, ARTICLE-NUMBER = {50}, URL = {https://www.mdpi.com/2313-433X/7/3/50}, ISSN = {2313-433X}, DOI = {10.3390/jimaging7030050} }
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Tone mapping operators (TMO) are functions that map high dynamic range (HDR) images to a standard dynamic range (SDR), while aiming to preserve the perceptual cues of a scene that govern its visual quality. Despite the increasing number of studies on quality assessment of tone mapped images, current subjective quality datasets have relatively small numbers of images and subjective opinions. Moreover, existing challenges in transferring laboratory experiments to crowdsourcing platforms put a barrier for collecting large-scale datasets through crowdsourcing. We address these challenges and propose the RealVision-TMO (RV-TMO), a large-scale tone mapped image quality dataset. RV-TMO contains 250 unique HDR images, their tone mapped versions obtained using four TMOs and pairwise comparison results from seventy unique observers for each pair. This dataset is published as part of the Journal paper titled as " RV-TMO: Large-Scale Dataset for Subjective Quality Assessment of Tone Mapped Images". If you are using this dataset in your work, please cite the paper below: @ARTICLE{9872141, author={Ak, Ali and Goswami, Abhishek and Hauser, Wolf and Le Callet, Patrick and Dufaux, Frederic}, journal={IEEE Transactions on Multimedia}, title={RV-TMO: Large-Scale Dataset for Subjective Quality Assessment of Tone Mapped Images}, year={2022}, pages={1-12}, doi={10.1109/TMM.2022.3203211}}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Authors: Kiri L. Wagstaff, Steven Lu, Gary Doran, Lukas Mandrake Contact: you.lu@jpl.nasa.gov
This data set contains a total of 73,031 landmarks. 10,433 landmarks were detected and extracted from 180 HiRISE browse images, and 62,598 landmarks were augmented from the 10,433 original landmarks. For each original landmark, we cropped a square bounding box that includes the full extent of the landmark plus a 30-pixel margin to the left, right, top, and bottom. Each cropped landmark was resized to 227x227 pixels and then augmented to generate 6 additional landmarks using the following methods:
Contents:
- map-proj-v3/: Directory containing individual cropped landmark images
- labels-map-proj-v3.txt: Class labels (ids) for each landmark image
- landmarks_map-proj-v3_classmap.csv: CSV file that maps class ids to semantic names
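A sketch of the crop-and-resize step described above (a square window covering the landmark plus a 30-pixel margin, resized to 227x227 with Pillow); the browse image name and bounding-box values are hypothetical:

from PIL import Image

def crop_landmark(img, x0, y0, x1, y1, margin=30, size=227):
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half = max(x1 - x0, y1 - y0) / 2 + margin            # square window with margin
    box = (int(cx - half), int(cy - half), int(cx + half), int(cy + half))
    return img.crop(box).resize((size, size))

browse = Image.open("hirise_browse_image.jpg")           # hypothetical file name
patch = crop_landmark(browse, 400, 520, 470, 600)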
Attribution: If you use this data set in your own work, please cite this DOI: 10.5281/zenodo.2538136
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Summary
This dataset contains two hyperspectral and one multispectral anomaly detection images, and their corresponding binary pixel masks. They were initially used for real-time anomaly detection in line-scanning, but they can be used for any anomaly detection task.
They are in .npy file format (TIFF or GeoTIFF variants will be added in the future), with the image datasets stored in (height, width, channels) order. The SNP dataset was collected using sentinelhub, and the Synthetic dataset was collected from AVIRIS. The Python code used to analyse these datasets can be found at: https://github.com/WiseGamgee/HyperAD
How to Get Started
All that is needed to load these datasets is Python (preferably 3.8+) and the NumPy package. Example code for loading the Beach Dataset if you put it in a folder called "data" with the python script is:
import numpy as np
hsi_array = np.load("data/beach_hsi.npy")
n_pixels, n_lines, n_bands = hsi_array.shape
print(f"This dataset has {n_pixels} pixels, {n_lines} lines, and {n_bands} bands.")

mask_array = np.load("data/beach_mask.npy")
m_pixels, m_lines = mask_array.shape
print(f"The corresponding anomaly mask is {m_pixels} pixels by {m_lines} lines.")
Citing the Datasets
If you use any of these datasets, please cite the following paper:
@article{garske2024erx, title={ERX - a Fast Real-Time Anomaly Detection Algorithm for Hyperspectral Line-Scanning}, author={Garske, Samuel and Evans, Bradley and Artlett, Christopher and Wong, KC}, journal={arXiv preprint arXiv:2408.14947}, year={2024},}
If you use the beach dataset please cite the following paper as well (original source):
@article{mao2022openhsi, title={OpenHSI: A complete open-source hyperspectral imaging solution for everyone}, author={Mao, Yiwei and Betters, Christopher H and Evans, Bradley and Artlett, Christopher P and Leon-Saval, Sergio G and Garske, Samuel and Cairns, Iver H and Cocks, Terry and Winter, Robert and Dell, Timothy}, journal={Remote Sensing}, volume={14}, number={9}, pages={2244}, year={2022}, publisher={MDPI} }