Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset is composed of 121 pairs of correlated images. Each pair contains one image of a copper ore sample acquired through reflected light microscopy (RGB, 24-bit), and the corresponding binary reference image (8-bit), in which the pixels are labeled as belonging to one of two classes: ore (0) or embedding resin (255).
The sample came from a copper ore from Yauri Cusco (Peru) with a complex mineralogy, mainly composed of sulfides, oxides, silicates, and native copper. It was classified by size. The fraction +74-100 μm was cold mounted with epoxy resin and subsequently ground and polished.
Correlative microscopy was employed for image acquisition: 121 fields were imaged on a reflected light microscope with a 20× (NA 0.40) objective lens and on a scanning electron microscope (SEM). The image pairs were then registered, resulting in images of 1017×753 pixels with a resolution of 0.53 µm/pixel. Some images (No. 2, 3, 24, 25, 46, 47, 69, 91, and 113) are slightly smaller because they were cropped during registration to correct co-localization errors of a few pixels. Finally, the SEM images were thresholded to generate the reference images.
Further description of this sample and its imaging procedure can be found in the work by Gomes and Paciornik (2012).
This dataset was created for developing and testing deep learning models on semantic segmentation tasks. The paper of Filippo et al. (2021) presented a variant of the DeepLabv3+ model (Chen et al., 2018) that reached mean values of 90.56% and 92.12% for overall accuracy and F1 score, respectively, for 5 rounds of experiments (training and testing), each with a different, random initialization of network weights.
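A minimal loading sketch (the pair file names below are hypothetical; adapt them to the actual archive layout) that reads one image/reference pair and converts the binary reference to class indices:

import numpy as np
from PIL import Image

# Hypothetical file names; the reference image encodes ore as 0 and resin as 255.
rgb = np.array(Image.open("pair_001_rgb.png"))                 # (H, W, 3), 24-bit RGB
ref = np.array(Image.open("pair_001_ref.png").convert("L"))    # (H, W), 8-bit binary
classes = (ref == 255).astype(np.uint8)                        # 0 = ore, 1 = embedding resin
print(rgb.shape, classes.shape, np.unique(classes))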
For further questions and suggestions, please do not hesitate to contact us.
Contact email: ogomes@gmail.com
If you use this dataset in your own work, please cite this DOI: 10.5281/zenodo.5020566
Please also cite this paper, which provides additional details about the dataset:
Michel Pedro Filippo, Otávio da Fonseca Martins Gomes, Gilson Alexandre Ostwald Pedro da Costa, Guilherme Lucio Abelha Mota. Deep learning semantic segmentation of opaque and non-opaque minerals from epoxy resin in reflected light microscopy images. Minerals Engineering, Volume 170, 2021, 107007, https://doi.org/10.1016/j.mineng.2021.107007.
The visuAAL Skin Segmentation Dataset contains 46,775 high-quality images divided into a training set with 45,623 images and a validation set with 1,152 images. Skin areas have been obtained automatically from the FashionPedia garment dataset. The process used to extract the skin areas is explained in detail in the paper 'From Garment to Skin: The visuAAL Skin Segmentation Dataset'.

If you use the visuAAL Skin Segmentation Dataset, please cite: https://doi.org/10.5281/zenodo.6973396 and https://doi.org/10.1007/978-3-031-13321-3_6

How to use: Download the FashionPedia dataset from https://fashionpedia.github.io/home/Fashionpedia_download.html and download the visuAAL Skin Segmentation Dataset. The dataset consists of two folders, train_masks and val_masks, corresponding to the training and validation sets of the original FashionPedia dataset. After extracting the images from FashionPedia, the original image for each mask in the visuAAL Skin Segmentation Dataset can be found under the same name (file_name in the annotations file). A sample image record in the FashionPedia dataset is: {'id': 12305, 'width': 680, 'height': 1024, 'file_name': '064c8022b32931e787260d81ed5aafe8.jpg', 'license': 4, 'time_captured': 'March-August, 2018', 'original_url': 'https://farm2.staticflickr.com/1936/8607950470_9d9d76ced7_o.jpg', 'isstatic': 1, 'kaggle_id': '064c8022b32931e787260d81ed5aafe8'}

NOTE: Not all images in the FashionPedia dataset have a corresponding skin mask in the visuAAL Skin Segmentation Dataset, as some images contain only garment parts and no people; these images were removed when creating the visuAAL Skin Segmentation Dataset. However, every instance in the visuAAL Skin Segmentation Dataset has a corresponding match in the FashionPedia dataset.
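A minimal sketch of the matching step described above, assuming the FashionPedia annotation JSON has been downloaded locally (the annotation file and image folder names below are assumptions, not part of this dataset):

import json, os

with open("instances_attributes_train2020.json") as f:    # assumed FashionPedia annotation file
    fashionpedia = json.load(f)

for img in fashionpedia["images"]:
    mask_path = os.path.join("train_masks", img["file_name"])
    if os.path.exists(mask_path):                          # not every FashionPedia image has a skin mask
        image_path = os.path.join("fashionpedia_images", img["file_name"])  # assumed image folder
        print(image_path, "->", mask_path)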
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset is composed of 81 pairs of correlated images. Each pair contains one image of an iron ore sample acquired through reflected light microscopy (RGB, 24-bit), and the corresponding binary reference image (8-bit), in which the pixels are labeled as belonging to one of two classes: ore (0) or embedding resin (255).
The sample came from an itabiritic iron ore concentrate from Quadrilátero Ferrífero (Brazil) mainly composed of hematite and quartz, with little magnetite and goethite. It was classified by size and concentrated with a dense liquid. Then, the fraction -149+105 μm with density greater than 3.2 was cold mounted with epoxy resin and subsequently ground and polished.
Correlative microscopy was employed for image acquisition: 81 fields were imaged on a reflected light microscope with a 10× (NA 0.20) objective lens and on a scanning electron microscope (SEM). The image pairs were then registered, resulting in images of 999×756 pixels with a resolution of 1.05 µm/pixel. Finally, the SEM images were thresholded to generate the reference images.
Further description of this sample and its imaging procedure can be found in the work by Gomes and Paciornik (2012).
This dataset was created for developing and testing deep learning models on semantic segmentation tasks. The paper of Filippo et al. (2021) presented a variant of the DeepLabv3+ model that reached mean values of 91.43% and 93.13% for overall accuracy and F1 score, respectively, for 5 rounds of experiments (training and testing), each with a different, random initialization of network weights.
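As an illustration of the reported metrics, a small sketch that computes overall (pixel) accuracy and F1 score for a binary segmentation result; treating resin (255) as the positive class is an assumption, not necessarily the convention used by Filippo et al. (2021):

import numpy as np

def accuracy_and_f1(pred, ref, positive=255):
    p = (pred == positive)
    r = (ref == positive)
    acc = float(np.mean(pred == ref))                      # overall pixel accuracy
    tp = np.sum(p & r)
    fp = np.sum(p & ~r)
    fn = np.sum(~p & r)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    return acc, f1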
For further questions and suggestions, please do not hesitate to contact us.
Contact email: ogomes@gmail.com
If you use this dataset in your own work, please cite this DOI: 10.5281/zenodo.5014700
Please also cite this paper, which provides additional details about the dataset:
Michel Pedro Filippo, Otávio da Fonseca Martins Gomes, Gilson Alexandre Ostwald Pedro da Costa, Guilherme Lucio Abelha Mota. Deep learning semantic segmentation of opaque and non-opaque minerals from epoxy resin in reflected light microscopy images. Minerals Engineering, Volume 170, 2021, 107007, https://doi.org/10.1016/j.mineng.2021.107007.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset contains a collection of annotated ultrasound images of the liver, designed to aid in the development of computer vision models for liver analysis, segmentation, and disease detection. The annotations include outlines of the liver and liver mass regions, as well as classifications into benign, malignant, and normal cases.
Creators: Xu Yiming, Zheng Bowen, Liu Xiaohong, Wu Tao, Ju Jinxiu, Wang Shijie, Lian Yufan, Zhang Hongjun, Liang Tong, Sang Ye, Jiang Rui, Wang Guangyu, Ren Jie, Chen Ting
Published: November 2, 2022 Version: v1 DOI: 10.5281/zenodo.7272660
This dataset provides ultrasound images of the liver with detailed annotations. The annotations highlight the liver itself and any liver mass regions present. The images are categorized into three classes: benign, malignant, and normal.
The dataset is organized into three zip files:
The ultrasound images have been annotated to show the outlines of the liver and of any liver mass regions present.
These annotations make the dataset suitable for tasks such as segmentation of the liver and liver masses, as well as classification of liver conditions.
This dataset can be valuable for a variety of applications, including:
This dataset is subject to copyright. Any use of the data must include appropriate acknowledgement and credit. Please contact the authors of the published data and cite the publication and the provided URL.
Citation:
Xu Yiming, Zheng Bowen, Liu Xiaohong, Wu Tao, Ju Jinxiu, Wang Shijie, Lian Yufan, Zhang Hongjun, Liang Tong, Sang Ye, Jiang Rui, Wang Guangyu, Ren Jie, & Chen Ting. (2022). Annotated Ultrasound Liver images [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7272660
APA Style Citation:
Xu, Y., Zheng, B., Liu, X., Wu, T., Ju, J., Wang, S., Lian, Y., Zhang, H., Liang, T., Sang, Y., Jiang, R., Wang, G., Ren, J., & Chen, T. (2022). Annotated Ultrasound Liver images [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7272660
Creative Commons Attribution 4.0 International
We hope this dataset is helpful for your research and projects!
CODEBRIM: COncrete DEfect BRidge IMage Dataset for multi-target multi-class concrete defect classification in computer vision and machine learning.
Dataset as presented and detailed in our CVPR 2019 publication: http://openaccess.thecvf.com/content_CVPR_2019/html/Mundt_Meta-Learning_Convolutional_Neural_Architectures_for_Multi-Target_Concrete_Defect_Classification_With_CVPR_2019_paper.html or https://arxiv.org/abs/1904.08486. If you make use of the dataset, please cite it as follows:
"Martin Mundt, Sagnik Majumder, Sreenivas Murali, Panagiotis Panetsos, Visvanathan Ramesh. Meta-learning Convolutional Neural Architectures for Multi-target Concrete Defect Classification with the COncrete DEfect BRidge IMage Dataset. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019"
We offer a supplementary GitHub repository with code to reproduce the paper and data loaders: https://github.com/ccc-frankfurt/meta-learning-CODEBRIM
For ease of use, we provide the dataset in several versions.
Files contained:
* CODEBRIM_original_images: contains the original full-resolution images and bounding box annotations
* CODEBRIM_cropped_dataset: contains the extracted crops/patches with corresponding class labels from the bounding boxes
* CODEBRIM_classification_dataset: contains the cropped patches with corresponding class labels split into training, validation and test sets for machine learning
* CODEBRIM_classification_balanced_dataset: similar to "CODEBRIM_classification_dataset", but with training images replicated exactly to balance the dataset, in order to reproduce the results obtained in the paper.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
This data set contains 3820 landmarks that were extracted from 168 HiRISE images. The landmarks were detected in HiRISE browse images. For each landmark, we cropped a square bounding box that includes the full extent of the landmark plus a 30-pixel margin to the left, right, top, and bottom. Each cropped image was then resized to 227x227 pixels.
Contents:
map-proj/: Directory containing individual cropped landmark images
labels-map-proj.txt: Class labels (ids) for each landmark image
landmark_mp.py: Python dictionary that maps class ids to semantic names
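A small loading sketch, assuming labels-map-proj.txt stores one "filename class_id" pair per line (the exact format is not spelled out above, so check the file first):

labels = {}
with open("labels-map-proj.txt") as f:
    for line in f:
        fname, cid = line.split()
        labels[fname] = int(cid)
# The class id -> semantic name mapping can be taken from the dictionary defined in landmark_mp.py.
print(len(labels), "labelled landmark images")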
Attribution:
If you use this data set in your own work, please cite this DOI: 10.5281/zenodo.1048301
Please also cite this paper, which provides additional details about the data set.
Kiri L. Wagstaff, You Lu, Alice Stanboli, Kevin Grimes, Thamme Gowda, and Jordan Padams. "Deep Mars: CNN Classification of Mars Imagery for the PDS Imaging Atlas." Proceedings of the Thirtieth Annual Conference on Innovative Applications of Artificial Intelligence, 2018.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The coin image dataset is a dataset of 60 classes of Roman Republican coins. Each class is represented by three coin images of the reverse side, acquired at the Coin Cabinet of the Museum of Fine Arts in Vienna, Austria.
Technical Details
The image filenames have the following syntax: class[classid]_image[1-3].png. The dataset also contains a CSV file "classes.csv", which maps the class IDs to the reference numbers defined by Crawford's standard reference book [2].
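A short sketch that recovers the class ID from a filename and looks it up in classes.csv; the example filename and the CSV column order are assumptions:

import csv, re

crawford = {}
with open("classes.csv", newline="") as f:
    for row in csv.reader(f):
        crawford[row[0]] = row[1]          # assumed layout: class ID, Crawford reference number

m = re.match(r"class(\d+)_image([1-3])\.png", "class17_image2.png")   # hypothetical filename
class_id, image_no = m.group(1), m.group(2)
print(class_id, image_no, crawford.get(class_id, "unknown"))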
[1] Zambanini S., Kampel M. "Coarse-to-Fine Correspondence Search for Classifying Ancient Coins", 2nd ACCV Workshop on e-Heritage, pp. 25-36, Daejeon, South Korea, November 2012.
[2] Crawford, M.H.: "Roman Republican Coinage", 2 vols., Cambridge University Press, 1974.
Sebastian Zambanini. (2014). Coin Image Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4454549
This is a multimodal dataset used in the paper "On the Role of Images for Analyzing Claims in Social Media", accepted at CLEOPATRA-2021 (2nd International Workshop on Cross-lingual Event-centric Open Analytics), co-located with The Web Conference 2021. The four datasets are curated for two different tasks that broadly come under fake news detection. Originally, the datasets were released as part of challenges or papers for text-based NLP tasks and are extended here with corresponding images.
1. clef_en and clef_ar are English and Arabic Twitter datasets for claim check-worthiness detection released in CLEF CheckThat! 2020 by Barrón-Cedeño et al. [1].
2. lesa is an English Twitter dataset for claim detection released by Gupta et al. [2].
3. mediaeval is an English Twitter dataset for conspiracy detection released in the MediaEval 2020 Workshop by Pogorelov et al. [3].
Dataset details such as the data curation and annotation process can be found in the cited papers. The datasets released here with corresponding images are smaller than the original text-based ones. The data statistics (number of tweets) are as follows:
1. clef_en: 281
2. clef_ar: 2571
3. lesa: 1395
4. mediaeval: 1724
Each folder has two sub-folders and a JSON file data.json that contains the crawled tweets. The two sub-folders are:
1. images: contains the crawled images, each named with the same tweet-id as in data.json.
2. splits: contains the 5-fold splits used for training and evaluation in our paper. Each file in this folder is a CSV with two columns
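A heavily hedged sketch for pairing tweets with their images, assuming data.json is a list of tweet objects with an 'id' field and that images are stored as JPEGs named after the tweet-id (neither is guaranteed by the description above):

import json, os

with open("lesa/data.json") as f:
    tweets = json.load(f)

for tweet in tweets:
    img_path = os.path.join("lesa", "images", f"{tweet['id']}.jpg")
    if os.path.exists(img_path):
        print(tweet["id"], "->", img_path)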
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This database contains 4976 planetary images of boulder fields located on Earth, Mars, and the Moon. The data were collected during the BOULDERING Marie Skłodowska-Curie Global Fellowship between October 2021 and 2024. The data are already split into train, validation, and test sets, but feel free to re-organize the labels at your convenience. For each image, all of the boulder outlines within the image were carefully mapped in QGIS. More information about the labelling procedure can be found in the following manuscript (https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023JE008013).

This dataset differs from the previous dataset included along with the manuscript (https://zenodo.org/records/8171052), as it contains more mapped images, especially of boulder populations around young impact structures on the Moon (cold spots). In addition, the boulder outlines were pre-processed so that they can be ingested directly in YOLOv8. A description of the contents is given in the README.txt file (along with instructions on how to load the custom dataset in Detectron2 and YOLO). Most of the other files are self-explanatory. Please see the previous dataset or the manuscript for more information.

If you want more information about a specific lunar or martian planetary image, the ID of the image is still available in the file name. Use this ID to find more information (e.g., for M121118602_00875_image.png, the ID M121118602 can be used on https://pilot.wr.usgs.gov/). I will also upload the raw data from which this pre-processed dataset was generated (see https://zenodo.org/records/14250970).

Thanks to this database, you can easily train Detectron2 Mask R-CNN or YOLO instance segmentation models to automatically detect boulders.

How to cite: Please refer to the "how to cite" section of the readme file of https://github.com/astroNils/YOLOv8-BeyondEarth.

Structure:
.
└── boulder2024/
    ├── jupyter-notebooks/
    │   └── REGISTERING_BOULDER_DATASET_IN_DETECTRON2.ipynb
    ├── test/
    │   └── images/
    │       ├──
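A minimal training sketch with the ultralytics package, assuming a dataset YAML (image paths plus the boulder class) has been prepared as described in the README.txt; the YAML file name here is hypothetical:

from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")                               # small pretrained segmentation model
model.train(data="boulder2024.yaml", epochs=100, imgsz=640)  # hypothetical dataset YAML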
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This dataset is an update to Supplemental Material Table S1 from the paper: Yunoki T, Echeverria AR, Cholima RB, Miranda Ch. G, Moreno FA (2025). Ichthyofauna (Osteichthyes, Actinopterygii) from tributaries of the Beni and Mamoré rivers in the Llanos de Moxos wetland of the Bolivian Amazon. Check List 21: 318–346.
Specimen images associated with the occurrence records have been deposited in Zenodo across several archived sets. Each image reference in the associatedMedia field includes:
A direct URL linking to the individual image file hosted on Zenodo (e.g., https://zenodo.org/records/.../files/image.JPG)
The DOI of the complete dataset in which the image is archived (e.g., https://doi.org/...)
This format ensures both persistent citation via dataset DOIs and straightforward access to individual images. The current version improves traceability between image files and their corresponding occurrence records.
Image sets are available at the following DOIs:
https://doi.org/10.5281/zenodo.15748915
https://doi.org/10.5281/zenodo.15749855
https://doi.org/10.5281/zenodo.15750205
https://doi.org/10.5281/zenodo.15750430
https://doi.org/10.5281/zenodo.15750726
https://doi.org/10.5281/zenodo.15754796
https://doi.org/10.5281/zenodo.15755252
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
Description
The Klarna Product Page Dataset is a dataset of publicly available pages corresponding to products sold online on various e-commerce websites. The dataset contains offline snapshots of 51,701 product pages collected from 8,175 distinct merchants across 8 different markets (US, GB, SE, NL, FI, NO, DE, AT) between 2018 and 2019. On each page, analysts labelled 5 elements of interest: the price of the product, its image, its name and the add-to-cart and go-to-cart buttons (if found). These labels are present in the HTML code as an attribute called klarna-ai-label taking one of the values: Price, Name, Main picture, Add to cart and Cart.
The snapshots are available in 3 formats: as MHTML files (~24GB), as WebTraversalLibrary (WTL) snapshots (~7.4GB), and as screenshots (~8.9GB). The MHTML format is the least lossy: a browser can render these pages, although any JavaScript on the page is lost. The WTL snapshots are produced by loading the MHTML pages into a Chromium-based browser. To keep the WTL dataset compact, the screenshots of the rendered MHTML are provided separately; here we provide the HTML of the rendered DOM tree and additional page and element metadata with rendering information (bounding boxes of elements, font sizes, etc.). The folder structure of the screenshot dataset is identical to that of the WTL dataset and can be used to complete the WTL snapshots with image information. For convenience, the datasets are provided with a train/test split in which no merchant in the test set is present in the training set.
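A small sketch for pulling the labelled elements out of a rendered snapshot with BeautifulSoup, using the klarna-ai-label attribute described above (the snapshot file name is hypothetical):

from bs4 import BeautifulSoup

with open("snapshot.html", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

for el in soup.find_all(attrs={"klarna-ai-label": True}):
    print(el["klarna-ai-label"], el.name, el.get_text(strip=True)[:60])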
Corresponding Publication
For more information about the contents of the datasets (statistics, etc.), please refer to the TMLR paper cited below.
GitHub Repository
The code needed to re-run the experiments in the publication accompanying the dataset can be accessed here.
Citing
If you found this dataset useful in your research, please cite the paper as follows:
@article{hotti2024the, title={The Klarna Product Page Dataset: Web Element Nomination with Graph Neural Networks and Large Language Models}, author={Alexandra Hotti and Riccardo Sven Risuleo and Stefan Magureanu and Aref Moradi and Jens Lagergren}, journal={Transactions on Machine Learning Research}, issn={2835-8856}, year={2024}, url={https://openreview.net/forum?id=zz6FesdDbB}, note={} }
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This repository contains the code and data underlying the publication "Computational 3D resolution enhancement for optical coherence tomography with a narrowband visible light source" in Biomedical Optics Express 14, 3532-3554 (2023) (doi.org/10.1364/BOE.487345).
The reader is free to use the scripts and data in this repository, as long as the manuscript is correctly cited in their work. For further questions, please contact the corresponding author.
Description of the code and datasets
Table 1 describes all the Matlab and Python scripts in this repository. Table 2 describes the datasets. The input datasets are the phase-corrected datasets, as the raw data is large and phase correction using a coverslip as reference is rather straightforward. Processed datasets are also added to the repository to allow running only a limited number of scripts, or to obtain, for example, the aberration-corrected data without the need to use Python. Note that the simulation input data (input_simulations_pointscatters_SLDshape_98zf_noise75.mat) is generated with random noise, so if it is overwritten the results may vary slightly. Likewise, the aberration correction is done with random apertures, so the processed aberration-corrected data (exp_pointscat_image_MIAA_ISAM_CAO.mat and exp_leaf_image_MIAA_ISAM_CAO.mat) will also change slightly if the aberration correction script is run anew. The current processed datasets were used as the basis for the figures in the publication. For details on the implementation, we refer to the publication.
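For readers working from Python, a quick way to inspect one of the phase-corrected .mat inputs (scipy handles MAT-files up to v7.2; if the files are stored as v7.3 they are HDF5 and need h5py instead, which has not been verified here):

from scipy.io import loadmat

data = loadmat("input_leafdisc_phasecorrected.mat")
for key, value in data.items():
    if not key.startswith("__"):                 # skip the MAT-file header entries
        print(key, getattr(value, "shape", type(value)))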
Table 1: The Matlab and Python scripts with their description
Script name
Description
MIAA_ISAM_processing.m
This script performs the DFT, RFIAA, and MIAA processing of the phase-corrected data that can be loaded from the datasets. Afterwards, it applies ISAM to the DFT and MIAA data and plots the results in a figure (via the scripts plot_figure3, plot_figure5 and plot_simulationdatafigure).
resolution_analysis_figure4.m
This script loads the point-scatterer data (absolute amplitude), locates the point scatterers and fits them to obtain the resolution data. Finally, it plots figure 4 of the publication.
fiaa_oct_c1.m, oct_iaa_c1.m, rec_fiaa_oct_c1.m, rfiaa_oct_c1.m
These four functions are used to apply fast IAA and MIAA. See script MIAA_ISAM_processing.m for their usage.
viridis.m, morgenstemning.m
These scripts define the colormaps for the figures.
plot_figure3.m, plot_figure5.m, plot_simulationdatafigure.m
These scripts are used to plot the figures 3 and 5 and a figure with simulation data. These scripts are executed at the end of script MIAA_ISAM_processing.m.
Python script: computational_adaptive_optics_script.py
Python script that applies computational adaptive optics to obtain the data for figure 6 of the manuscript.
Python script: zernike_functions2.py
Python script that gives the values and Cartesian derivatives of the Zernike polynomials.
figure6_ComputationalAdaptiveOptics.m
Script that loads the CAO data that was saved in Python, analyzes the resolution, and plots figure 6.
Python script: OCTsimulations_3D_script2.py
Python script that simulates OCT data, adds noise, and saves it as a .mat file for use in the Matlab scripts above.
Python script: OCTsimulations2.py
Module that contains a Python class that can be used to simulate 3D OCT datasets based on a Gaussian beam.
Matlab toolbox DIPimage 2.9.zip
DIPimage is used in the scripts. The toolbox can be downloaded online, or this zip can be used.
The datasets in this Zenodo repository
Name
Description
input_leafdisc_phasecorrected.mat
Phase corrected input image of the leaf disc (used in figure 5).
input_TiO2gelatin_004_phasecorrected.mat
Phase corrected input image of the TiO2 in gelatin sample.
input_simulations_pointscatters_SLDshape_98zf_noise75.mat
Input simulation data that, once processed, is used in figure 4.
exp_pointscat_image_DFT.mat
exp_pointscat_image_DFT_ISAM.mat
exp_pointscat_image_RFIAA.mat
exp_pointscat_image_MIAA_ISAM.mat
exp_pointscat_image_MIAA_ISAM_CAO.mat
Processed experimental amplitude data for the TiO2 point scattering sample with respectively DFT, DFT+ISAM, RFIAA, MIAA+ISAM and MIAA+ISAM+CAO. These datasets are used for fitting in figure 4 (except for CAO), and MIAA_ISAM and MIAA_ISAM_CAO are used for figure 6.
simu_pointscat_image_DFT.mat
simu_pointscat_image_RFIAA.mat
simu_pointscat_image_DFT_ISAM.mat
simu_pointscat_image_MIAA_ISAM.mat
Processed amplitude data from the simulation dataset, which is used in the script for figure 4 for the resolution analysis.
exp_leaf_image_MIAA_ISAM.mat
exp_leaf_image_MIAA_ISAM_CAO.mat
Processed amplitude data from the leaf sample, with and without aberration correction which is used to produce figure 6.
exp_leaf_zernike_coefficients_CAO_normal_wmaf.mat
exp_pointscat_zernike_coefficients_CAO_normal_wmaf.mat
Estimated Zernike coefficients and the weighted moving average of them that is used for the computational aberration correction. Some of this data is plotted in Figure 6 of the manuscript.
input_zernike_modes.mat
The reference Zernike modes corresponding to the data that is loaded to give the modes the proper name.
exp_pointscat_MIAA_ISAM_complex.mat
exp_leaf_MIAA_ISAM_complex
Complex MIAA+ISAM processed data that is used as input for the computational aberration correction.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Aerial Water Buoys Dataset: Over the past few years, a plethora of advancements in Unmanned Aerial Vehicle (UAV) technologies have made possible advanced UAV-based search and rescue operations with transformative impact on the outcome of critical life-saving missions. This dataset aims to help with the challenging task of multi-castaway tracking and following using a single UAV. Due to the difficulty and data-protection issues of capturing footage of people in the sea, we captured a dataset of buoys in order to conduct experiments on multi-castaway tracking and following. A paper on the technical details and experiments of multi-castaway tracking and following using this dataset will be published soon.

The dataset consists of top-view images of buoys taken from various altitudes on the coasts of Larnaca and Protaras in Cyprus. Images were captured at different altitudes in order to challenge object detectors to detect smaller objects, since a UAV that needs to track multiple targets must fly at a higher altitude. There is only one annotated class on all images, labeled 'buoy'. Additionally, all annotations were converted into VOC and COCO formats for training in numerous frameworks.

The dataset consists of the following images and detection objects (buoys):
Subset       Images   Buoys
Training     10814    14811
Validation   1350     1865
Testing      1352     1827

It is advised to further enhance the dataset by probabilistically applying random augmentations to each image before adding it to the batch for training. Possible transformations include geometric ones (rotations, translations, horizontal-axis mirroring, cropping, and zooming) as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing); a sketch of such a pipeline is given below.

NOTE: If you use this dataset in your research/publication, please cite us as follows: Antreas Anastasiou, Rafael Makrigiorgis, & Panayiotis Kolios. (2022). Aerial Water Buoys Dataset (1.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7288444
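A sketch of such an augmentation pipeline using the albumentations package with VOC-style boxes; the particular transforms and parameter values are illustrative, not the ones used by the authors:

import albumentations as A

augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.2, rotate_limit=15, p=0.5),
        A.RandomBrightnessContrast(p=0.3),
        A.HueSaturationValue(p=0.3),
        A.Blur(blur_limit=3, p=0.2),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
# augmented = augment(image=image, bboxes=bboxes, labels=labels)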
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Seatizen Atlas image dataset
This repository contains the resources and tools for accessing and utilizing the annotated images within the Seatizen Atlas dataset, as described in the paper Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery.
Download the Dataset
This annotated dataset is part of a bigger dataset composed of labeled and unlabeled images. To access information about the whole dataset, please visit the Zenodo repository and follow the download instructions provided.
Scientific Publication
If you use this dataset in your research, please consider citing the associated paper:
@article{Contini2025, author = {Matteo Contini and Victor Illien and Mohan Julien and Mervyn Ravitchandirane and Victor Russias and Arthur Lazennec and Thomas Chevrier and Cam Ly Rintz and Léanne Carpentier and Pierre Gogendeau and César Leblanc and Serge Bernard and Alexandre Boyer and Justine Talpaert Daudon and Sylvain Poulain and Julien Barde and Alexis Joly and Sylvain Bonhommeau}, doi = {10.1038/s41597-024-04267-z}, issn = {2052-4463}, issue = {1}, journal = {Scientific Data}, pages = {67}, title = {Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery}, volume = {12}, url = {https://doi.org/10.1038/s41597-024-04267-z}, year = {2025},}
For detailed information about the dataset and experimental results, please refer to the previous paper.
Overview
The Seatizen Atlas dataset includes 14,492 multilabel and 1,200 instance segmentation annotated images. These images are useful for training and evaluating AI models for marine biodiversity research. The annotations follow standards from the Global Coral Reef Monitoring Network (GCRMN).
Annotation Details
Annotation Types:
Multilabel Convention: Identifies all observed classes in an image.
Instance Segmentation: Highlights contours of each instance for each class.
List of Classes
Algae
Algal Assemblage
Algae Halimeda
Algae Coralline
Algae Turf
Coral
Acropora Branching
Acropora Digitate
Acropora Submassive
Acropora Tabular
Bleached Coral
Dead Coral
Gorgonian
Living Coral
Non-acropora Millepora
Non-acropora Branching
Non-acropora Encrusting
Non-acropora Foliose
Non-acropora Massive
Non-acropora Coral Free
Non-acropora Submassive
Seagrass
Syringodium Isoetifolium
Thalassodendron Ciliatum
Habitat
Rock
Rubble
Sand
Other Organisms
Thorny Starfish
Sea Anemone
Ascidians
Giant Clam
Fish
Other Starfish
Sea Cucumber
Sea Urchin
Sponges
Turtle
Custom Classes
Blurred
Homo Sapiens
Human Object
Trample
Useless
Waste
These classes reflect the biodiversity and variety of habitats captured in the Seatizen Atlas dataset, providing valuable resources for training AI models in marine biodiversity research.
Usage Notes
The annotated images are available for non-commercial use. Users are requested to cite the related publication in any resulting works. A GitHub repository has been set up to facilitate data reuse and sharing: GitHub Repository.
Code Availability
All related codes for data processing, downloading, and AI model training can be found in the following GitHub repositories:
Plancha Workflow
Zenodo Tools
DinoVdeau Model
Acknowledgements
This dataset and associated research have been supported by several organizations, including the Seychelles Islands Foundation, Réserve Naturelle Marine de la Réunion, and Monaco Explorations, among others.
For any questions or collaboration inquiries, please contact seatizen.ifremer@gmail.com.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
The RT-BENE dataset is licensed under CC BY-NC-SA 4.0. Commercial usage is not permitted. If you use our blink estimation code or dataset, please cite the relevant paper:

@inproceedings{CortaceroICCV2019W, author={Kevin Cortacero and Tobias Fischer and Yiannis Demiris}, booktitle = {Proceedings of the IEEE International Conference on Computer Vision Workshops}, title = {RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments}, year = {2019}, }

More information can be found on the Personal Robotics Lab's website: https://www.imperial.ac.uk/personal-robotics/software/.

Overview: We manually annotated images contained in the "noglasses" part of the RT-GENE dataset with blink annotations. This dataset contains the extracted eye image patches and the associated annotations. In particular, rt_bene_subjects.csv is an overview CSV file with the following columns: id, subject csv file, path to left eye images, path to right eye images, training/validation/discarded category, and fold-id for the 3-fold evaluation. Each individual "blink_labels" CSV file (s000_blink_labels.csv to s016_blink_labels.csv) contains two columns: image file name and label, where 0.0 is the annotation for open eyes, 1.0 for blinks, and 0.5 for annotator disagreement (these images are discarded).

Associated code: Please see the code repository for code to train and evaluate a deep neural network based on the RT-BENE dataset. The code repository also links to pre-trained models and code for real-time inference.
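A small sketch for reading one subject's blink labels with pandas and dropping the annotator-disagreement rows; the column names are assumptions, and the CSVs may or may not ship with a header row:

import pandas as pd

df = pd.read_csv("s000_blink_labels.csv", names=["image", "label"])
df = df[df["label"] != 0.5]                      # discard annotator disagreement
print((df["label"] == 1.0).sum(), "blink frames,", (df["label"] == 0.0).sum(), "open-eye frames")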
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
The Dataset
A collection of images of parking lots for vehicle detection, segmentation, and counting. Each image is manually labeled with pixel-wise masks and bounding boxes localizing vehicle instances. The dataset includes about 250 images depicting several parking areas and covering most of the problematic situations found in a real scenario: seven different cameras capture the images under various weather conditions and viewing angles. Another challenging aspect is the presence of partial occlusion patterns in many scenes, such as obstacles (trees, lampposts, other cars) and shadowed cars. A further peculiarity is that images are taken both during the day and at night, showing utterly different lighting conditions.
We suggest a three-way split (train-validation-test). The train split contains images taken during the daytime while validation and test splits include images gathered at night. In line with these splits we provide some annotation files:
train_coco_annotations.json and val_coco_annotations.json --> JSON files that follow the golden standard MS COCO data format (for more info see https://cocodataset.org/#format-data) for the training and the validation splits, respectively. All the vehicles are labeled with the COCO category 'car'. They are suitable for vehicle detection and instance segmentation.
train_dot_annotations.csv and val_dot_annotations.csv --> CSV files that contain xy coordinates of the centroids of the vehicles for the training and the validation splits, respectively. Dot annotation is commonly used for the visual counting task.
ground_truth_test_counting.csv --> CSV file that contains the number of vehicles present in each image. It is only suitable for testing vehicle counting solutions.
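A minimal sketch for loading the annotation files with the standard library only (no pycocotools needed); the dot-annotation column layout is assumed:

import csv, json
from collections import Counter

with open("train_coco_annotations.json") as f:
    coco = json.load(f)
per_image = Counter(a["image_id"] for a in coco["annotations"])
print(len(coco["images"]), "training images,", sum(per_image.values()), "vehicle instances")

with open("train_dot_annotations.csv", newline="") as f:
    dots = list(csv.reader(f))                    # xy centroids of vehicles, exact columns assumed
print(len(dots), "dot annotations")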
Citing our work
If you found this dataset useful, please cite the following paper
@inproceedings{Ciampi_visapp_2021, doi = {10.5220/0010303401850195}, url = {https://doi.org/10.5220%2F0010303401850195}, year = 2021, publisher = {{SCITEPRESS} - Science and Technology Publications}, author = {Luca Ciampi and Carlos Santiago and Joao Costeira and Claudio Gennaro and Giuseppe Amato}, title = {Domain Adaptation for Traffic Density Estimation}, booktitle = {Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications} }
and this Zenodo Dataset
@dataset{ciampi_ndispark_6560823, author = {Luca Ciampi and Carlos Santiago and Joao Costeira and Claudio Gennaro and Giuseppe Amato}, title = {{Night and Day Instance Segmented Park (NDISPark) Dataset: a Collection of Images taken by Day and by Night for Vehicle Detection, Segmentation and Counting in Parking Areas}}, month = may, year = 2022, publisher = {Zenodo}, version = {1.0.0}, doi = {10.5281/zenodo.6560823}, url = {https://doi.org/10.5281/zenodo.6560823} }
Contact Information
If you would like further information about the dataset or if you experience any issues downloading files, please contact us at luca.ciampi@isti.cnr.it
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The possibility of carrying out a meaningful forensics analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation and even the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we share a new dataset composed of a large number of synthetic and natural printed face images. Such a dataset can be used with several computer vision and machine learning approaches for two tasks: pinpointing the printer source of a document and detecting printed pictures generated by deep fakes. When using the dataset, don't forget to cite our paper: @Article{jimaging7030050, AUTHOR = {Ferreira, Anselmo and Nowroozi, Ehsan and Barni, Mauro}, TITLE = {VIPPrint: Validating Synthetic Image Detection and Source Linking Methods on a Large Scale Dataset of Printed Documents}, JOURNAL = {Journal of Imaging}, VOLUME = {7}, YEAR = {2021}, NUMBER = {3}, ARTICLE-NUMBER = {50}, URL = {https://www.mdpi.com/2313-433X/7/3/50}, ISSN = {2313-433X}, DOI = {10.3390/jimaging7030050} }
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Tone mapping operators (TMO) are functions that map high dynamic range (HDR) images to a standard dynamic range (SDR), while aiming to preserve the perceptual cues of a scene that govern its visual quality. Despite the increasing number of studies on quality assessment of tone mapped images, current subjective quality datasets have relatively small numbers of images and subjective opinions. Moreover, existing challenges in transferring laboratory experiments to crowdsourcing platforms put a barrier for collecting large-scale datasets through crowdsourcing. We address these challenges and propose the RealVision-TMO (RV-TMO), a large-scale tone mapped image quality dataset. RV-TMO contains 250 unique HDR images, their tone mapped versions obtained using four TMOs and pairwise comparison results from seventy unique observers for each pair. This dataset is published as part of the Journal paper titled as " RV-TMO: Large-Scale Dataset for Subjective Quality Assessment of Tone Mapped Images". If you are using this dataset in your work, please cite the paper below: @ARTICLE{9872141, author={Ak, Ali and Goswami, Abhishek and Hauser, Wolf and Le Callet, Patrick and Dufaux, Frederic}, journal={IEEE Transactions on Multimedia}, title={RV-TMO: Large-Scale Dataset for Subjective Quality Assessment of Tone Mapped Images}, year={2022}, pages={1-12}, doi={10.1109/TMM.2022.3203211}}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Authors: Kiri L. Wagstaff, Steven Lu, Gary Doran, Lukas Mandrake Contact: you.lu@jpl.nasa.gov
This data set contains a total of 73,031 landmarks. 10,433 landmarks were detected and extracted from 180 HiRISE browse images, and 62,598 landmarks were augmented from the 10,433 original landmarks. For each original landmark, we cropped a square bounding box that includes the full extent of the landmark plus a 30-pixel margin to the left, right, top, and bottom. Each cropped landmark was resized to 227x227 pixels and then augmented to generate 6 additional landmarks using the following methods:
Contents:
- map-proj-v3/: Directory containing individual cropped landmark images
- labels-map-proj-v3.txt: Class labels (ids) for each landmark image
- landmarks_map-proj-v3_classmap.csv: CSV file that maps class ids to semantic names
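A sketch of the crop-and-resize step described above (a square window covering the landmark plus a 30-pixel margin, resized to 227x227 with Pillow); the browse image name and bounding-box values are hypothetical:

from PIL import Image

def crop_landmark(img, x0, y0, x1, y1, margin=30, size=227):
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half = max(x1 - x0, y1 - y0) / 2 + margin            # square window with margin
    box = (int(cx - half), int(cy - half), int(cx + half), int(cy + half))
    return img.crop(box).resize((size, size))

browse = Image.open("hirise_browse_image.jpg")           # hypothetical file name
patch = crop_landmark(browse, 400, 520, 470, 600)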
Attribution: If you use this data set in your own work, please cite this DOI: 10.5281/zenodo.2538136
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Summary
This dataset contains two hyperspectral and one multispectral anomaly detection images, and their corresponding binary pixel masks. They were initially used for real-time anomaly detection in line-scanning, but they can be used for any anomaly detection task.
They are in .npy file format (TIFF or GeoTIFF variants will be added in the future), with the image datasets stored in (height, width, channels) order. The SNP dataset was collected using sentinelhub, and the Synthetic dataset was collected from AVIRIS. The Python code used to analyse these datasets can be found at: https://github.com/WiseGamgee/HyperAD
How to Get Started
All that is needed to load these datasets is Python (preferably 3.8+) and the NumPy package. Example code for loading the Beach Dataset if you put it in a folder called "data" with the python script is:
import numpy as np
hsi_array = np.load("data/beach_hsi.npy")
n_pixels, n_lines, n_bands = hsi_array.shape
print(f"This dataset has {n_pixels} pixels, {n_lines} lines, and {n_bands} bands.")

mask_array = np.load("data/beach_mask.npy")
m_pixels, m_lines = mask_array.shape
print(f"The corresponding anomaly mask is {m_pixels} pixels by {m_lines} lines.")
Citing the Datasets
If you use any of these datasets, please cite the following paper:
@article{garske2024erx, title={ERX - a Fast Real-Time Anomaly Detection Algorithm for Hyperspectral Line-Scanning}, author={Garske, Samuel and Evans, Bradley and Artlett, Christopher and Wong, KC}, journal={arXiv preprint arXiv:2408.14947}, year={2024},}
If you use the beach dataset please cite the following paper as well (original source):
@article{mao2022openhsi, title={OpenHSI: A complete open-source hyperspectral imaging solution for everyone}, author={Mao, Yiwei and Betters, Christopher H and Evans, Bradley and Artlett, Christopher P and Leon-Saval, Sergio G and Garske, Samuel and Cairns, Iver H and Cocks, Terry and Winter, Robert and Dell, Timothy}, journal={Remote Sensing}, volume={14}, number={9}, pages={2244}, year={2022}, publisher={MDPI} }