This is a subset of the Zenodo-ML Dinosaur Dataset [Github] that has been converted to small png files and organized in folders by language, so you can jump right in to using machine learning methods that assume image input.
Included are .tar.gz files, each named after a file extension; when extracted, each produces a folder of the same name.
tree -L 1
.
├── c
├── cc
├── cpp
├── cs
├── css
├── csv
├── cxx
├── data
├── f90
├── go
├── html
├── java
├── js
├── json
├── m
├── map
├── md
├── txt
└── xml
We can peek inside one of the (somewhat smaller) folders of the set to see that the subfolders are zenodo identifiers. A zenodo identifier corresponds to a single Github repository, so the png files are chunks of code of the extension type from that particular repository.
$ tree map -L 1
map
├── 1001104
├── 1001659
├── 1001793
├── 1008839
├── 1009700
├── 1033697
├── 1034342
...
├── 836482
├── 838329
├── 838961
├── 840877
├── 840881
├── 844050
├── 845960
├── 848163
├── 888395
├── 891478
└── 893858
154 directories, 0 files
Within each folder (zenodo id) the files are prefixed by the zenodo id, followed by the index into the original image set array that is provided with the full dinosaur dataset archive.
$ tree m/891531/ -L 1
m/891531/
├── 891531_0.png
├── 891531_10.png
├── 891531_11.png
├── 891531_12.png
├── 891531_13.png
├── 891531_14.png
├── 891531_15.png
├── 891531_16.png
├── 891531_17.png
├── 891531_18.png
├── 891531_19.png
├── 891531_1.png
├── 891531_20.png
├── 891531_21.png
├── 891531_22.png
├── 891531_23.png
├── 891531_24.png
├── 891531_25.png
├── 891531_26.png
├── 891531_27.png
├── 891531_28.png
├── 891531_29.png
├── 891531_2.png
├── 891531_30.png
├── 891531_3.png
├── 891531_4.png
├── 891531_5.png
├── 891531_6.png
├── 891531_7.png
├── 891531_8.png
└── 891531_9.png
0 directories, 31 files
So what's the difference?
The difference is that these files are organized by extension type, and provided as actual png images. The original data is provided as numpy data frames, and is organized by zenodo ID. Both are useful for different things - this particular version is cool because we can actually see what a code image looks like.
How many images total?
We can count the number of total images:
find "." -type f -name *.png | wc -l
3,026,993
The script to create the dataset is provided here. Essentially, we start with the top extensions as identified by this work (excluding actual image files) and then write each 80x80 image to an actual png image, organizing by extension and then zenodo id (as shown above).
I tested a few methods to write the single channel 80x80 data frames as png images, and wound up liking cv2's imwrite function because it would save and then load the exact same content.
import cv2
cv2.imwrite(image_path, image)
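To double-check that round trip yourself, here is a minimal sketch (the temporary path is just an example): write a random single-channel 80x80 array with cv2.imwrite, read it back in grayscale mode, and confirm the arrays are identical.

import numpy as np
import cv2

# create a fake single-channel 80x80 "code image"
image = np.random.randint(0, 256, size=(80, 80), dtype=np.uint8)
cv2.imwrite('/tmp/roundtrip_check.png', image)

# PNG is lossless, so the reloaded grayscale array should match exactly
reloaded = cv2.imread('/tmp/roundtrip_check.png', cv2.IMREAD_GRAYSCALE)
assert np.array_equal(image, reloaded)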
Given the above, it's pretty easy to load an image! Here is an example using imageio, followed by the older scipy approach for reference (scipy.misc.imread is deprecated in newer versions).
image_path = '/tmp/data1/data/csv/1009185/1009185_0.png'
from imageio import imread
image = imread(image_path)
array([[116, 105, 109, ..., 32, 32, 32],
[ 48, 44, 48, ..., 32, 32, 32],
[ 48, 46, 49, ..., 32, 32, 32],
...,
[ 32, 32, 32, ..., 32, 32, 32],
[ 32, 32, 32, ..., 32, 32, 32],
[ 32, 32, 32, ..., 32, 32, 32]], dtype=uint8)
image.shape
(80,80)
# Deprecated
from scipy import misc
misc.imread(image_path)
Image([[116, 105, 109, ..., 32, 32, 32],
[ 48, 44, 48, ..., 32, 32, 32],
[ 48, 46, 49, ..., 32, 32, 32],
...,
[ 32, 32, 32, ..., 32, 32, 32],
[ 32, 32, 32, ..., 32, 32, 32],
[ 32, 32, 32, ..., 32, 32, 32]], dtype=uint8)
Remember that the values in the data are characters that have been converted to ordinal. Can you guess what 32 is?
ord(' ')
32
# And thus if you wanted to convert it back...
chr(32)
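chr(32) gives back the space character. Putting that together, here is a small sketch (the helper function is ours, not part of the dataset tooling) that decodes one of these 80x80 code images back into text:

from imageio import imread

def image_to_text(image_path):
    image = imread(image_path)  # uint8 array of shape (80, 80); each value is ord(character)
    rows = ("".join(chr(v) for v in row) for row in image)
    return "\n".join(row.rstrip() for row in rows)  # drop the trailing space (ord 32) padding

print(image_to_text('/tmp/data1/data/csv/1009185/1009185_0.png'))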
So how t...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Compilation of Python code for data preprocessing and VegeNet building, as well as image datasets (zip files).
Image datasets:
Open Images is a dataset of ~9M images that have been annotated with image-level labels and object bounding boxes.
The training set of V4 contains 14.6M bounding boxes for 600 object classes on 1.74M images, making it the largest existing dataset with object location annotations. The boxes have been largely manually drawn by professional annotators to ensure accuracy and consistency. The images are very diverse and often contain complex scenes with several objects (8.4 per image on average). Moreover, the dataset is annotated with image-level labels spanning thousands of classes.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('open_images_v4', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/open_images_v4-original-2.0.0.png
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Raspberry Turk logo: http://www.raspberryturk.com/assets/img/logo.png
This dataset was created as part of the Raspberry Turk project. The Raspberry Turk is a robot that can play chess—it's entirely open source, based on Raspberry Pi, and inspired by the 18th century chess playing machine, the Mechanical Turk. The dataset was used to train models for the vision portion of the project.
Raw chessboard image: http://www.raspberryturk.com/assets/img/rawcapture.png
In its raw form, the dataset contains 312 480x480 images of chessboards with their associated board FENs. Each chessboard contains 30 empty squares, 8 orange pawns, 2 orange knights, 2 orange bishops, 2 orange rooks, 2 orange queens, 1 orange king, 8 green pawns, 2 green knights, 2 green bishops, 2 green rooks, 2 green queens, and 1 green king arranged in different random positions.
The Raspberry Turk source code includes several scripts for converting this raw data to a more usable form.
To get started download the raw.zip file below and then:
$ git clone git@github.com:joeymeyer/raspberryturk.git
$ cd raspberryturk
$ unzip ~/Downloads/raw.zip -d data
$ conda env create -f data/environment.yml
$ source activate raspberryturk
From this point there are two scripts you will need to run. First, convert the raw data to an interim form (individual 60x60 rgb/grayscale images) using process_raw.py like this:
$ python -m raspberryturk.core.data.process_raw data/raw/ data/interim/
This will split the raw images into individual squares and put them in labeled folders inside the interim folder. The final step is to convert the images into a dataset that can be loaded into a numpy array for training/validation. The create_dataset.py utility accomplishes this. The tool takes a number of parameters that can be used to customize the dataset (e.g., choose the labels, rgb/grayscale, ZCA-whiten the images first, include rotated images, etc.). Below is the documentation for create_dataset.py.
$ python -m raspberryturk.core.data.create_dataset --help
usage: raspberryturk/core/data/create_dataset.py [-h] [-g] [-r] [-s SAMPLE]
[-o] [-t TEST_SIZE] [-e] [-z]
base_path
{empty_or_not,white_or_black,color_piece,color_piece_noempty,piece,piece_noempty}
filename
Utility used to create a dataset from processed images.
positional arguments:
base_path Base path for data processing.
{empty_or_not,white_or_black,color_piece,color_piece_noempty,piece,piece_noempty}
Encoding function to use for piece classification. See
class_encoding.py for possible values.
filename Output filename for dataset. Should be .npz
optional arguments:
-h, --help show this help message and exit
-g, --grayscale Dataset should use grayscale images.
-r, --rotation Dataset should use rotated images.
-s SAMPLE, --sample SAMPLE
Dataset should be created by only a sample of images.
Must be value between 0 and 1.
-o, --one_hot Dataset should use one hot encoding for labels.
-t TEST_SIZE, --test_size TEST_SIZE
Test set partition size. Must be value between 0 and
1.
-e, --equalize_classes
Equalize class distributions.
-z, --zca ZCA whiten dataset.
Example of how it can be used:
$ python -m raspberryturk.core.data.create_dataset data/interim/ promotable_piece data/processed/example_dataset.npz --rotation --grayscale --one_hot --sample=0.3 --zca
Finally, the dataset is created and can be easily loaded into Python either using raspberryturk.core.data.dataset.Dataset or simply np.load.
In [1]: from raspberryturk.core.data.dataset import Dataset
In [2]: d = Dataset.load_file('data/processed/example_dataset.npz')
or
In [1]: import numpy as np
In [2]: with open('data/processed/example_dataset.npz', 'rb') as f:
   ...:     data = np.load(f)
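If you go the plain numpy route, a quick way to see what the archive contains (the exact array names depend on how create_dataset.py was invoked, so treat this as a sketch):

import numpy as np

with np.load('data/processed/example_dataset.npz') as data:
    print(data.files)   # names of the arrays stored in the archive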
Visit the data collection page of the Raspberry Turk website for more details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
* Source
Here's an example of how the data looks (each class takes three rows):
Visualized Fashion MNIST dataset: https://github.com/zalandoresearch/fashion-mnist/raw/master/doc/img/fashion-mnist-sprite.png
The data is provided as a train set (86% of images; 60,000 images) and a test set (14% of images; 10,000 images) only. The train set was further split to provide 80% of its images to the training set and 20% of its images to the validation set.
@online{xiao2017/online,
author = {Han Xiao and Kashif Rasul and Roland Vollgraf},
title = {Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms},
date = {2017-08-28},
year = {2017},
eprintclass = {cs.LG},
eprinttype = {arXiv},
eprint = {cs.LG/1708.07747},
}
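If you want to load it programmatically, Fashion-MNIST is also available through tensorflow_datasets under the name 'fashion_mnist' (a minimal sketch, mirroring the TFDS snippets used for other datasets on this page):

import tensorflow_datasets as tfds

ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
    print(ex)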
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This is the readme for the supplemental data for our ICDAR 2019 paper.
You can read our paper via IEEE here: https://ieeexplore.ieee.org/document/8978202
If you found this dataset useful, please consider citing our paper:
@inproceedings{DBLP:conf/icdar/MorrisTE19,
author = {David Morris and
Peichen Tang and
Ralph Ewerth},
title = {A Neural Approach for Text Extraction from Scholarly Figures},
booktitle = {2019 International Conference on Document Analysis and Recognition,
{ICDAR} 2019, Sydney, Australia, September 20-25, 2019},
pages = {1438--1443},
publisher = {{IEEE}},
year = {2019},
url = {https://doi.org/10.1109/ICDAR.2019.00231},
doi = {10.1109/ICDAR.2019.00231},
timestamp = {Tue, 04 Feb 2020 13:28:39 +0100},
biburl = {https://dblp.org/rec/conf/icdar/MorrisTE19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
This work was financially supported by the German Federal Ministry of Education and Research (BMBF) and European Social Fund (ESF) (InclusiveOCW project, no. 01PE17004).
We used different sources of data for testing, validation, and training. Our testing set was assembled from the work by Böschen et al. that we cited. We excluded the DeGruyter dataset from it and used it as our validation dataset.
These datasets contain a readme with license information. Further information about the associated project can be found in the authors' published work we cited: https://doi.org/10.1007/978-3-319-51811-4_2
The DeGruyter dataset does not include the labeled images due to license restrictions. As of writing, the images can still be downloaded from DeGruyter via the links in the readme. Note that depending on what program you use to strip the images out of the PDF they are provided in, you may have to re-number the images.
We used label_generator's generated dataset, which the author made available on a requester-pays amazon s3 bucket. We also used the Multi-Type Web Images dataset, which is mirrored here.
We have made our code available in code.zip. We will upload code, announce further news, and field questions via the github repo.
Our text detection network is adapted from Argman's EAST implementation. The EAST/checkpoints/ours subdirectory contains the trained weights we used in the paper.
We used a Tesseract script to run text extraction on the detected text rows. It is included in our code archive as text_recognition_multipro.py.
We used a Java tool provided by Falk Böschen, adapted it to our file structure, and included it as evaluator.jar.
Parameter sweeps are automated by param_sweep.rb. This file also shows how to invoke all of these components.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the article, we trained and evaluated models on the Image Privacy Dataset (IPD) and the PrivacyAlert dataset. The datasets are originally provided by other sources and have been re-organised and curated for this work.
Our curation organises the datasets in a common structure. We updated the annotations and labelled the splits of the data in the annotation file. This avoids having separate folders of images for each data split (training, validation, testing) and allows flexible handling of new splits, e.g. those created with a stratified K-Fold cross-validation procedure. As with the original datasets (PicAlert and PrivacyAlert), we provide the links to the images in bash scripts that download them. Another bash script re-organises the images into sub-folders with at most 1,000 images each; a sketch of this step is shown below.
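A minimal Python sketch of that re-organisation step (this is not the authors' bash script; the sub-folder naming and image extensions are assumptions):

import os
import shutil

def batch_images(src_dir, dst_dir, batch_size=1000):
    # move images into numbered sub-folders holding at most `batch_size` files each
    images = sorted(f for f in os.listdir(src_dir) if f.lower().endswith(('.jpg', '.png')))
    for i, name in enumerate(images):
        sub = os.path.join(dst_dir, f"batch_{i // batch_size:03d}")
        os.makedirs(sub, exist_ok=True)
        shutil.move(os.path.join(src_dir, name), os.path.join(sub, name))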
Both datasets refer to images publicly available on Flickr. These images have a large variety of content, including sensitive content, seminude people, vehicle plates, documents, and private events. Images were annotated with a binary label denoting whether the content was deemed public or private. As the images are publicly available, their label is mostly public, so these datasets have a high imbalance towards the public class. Note that IPD combines two other existing datasets, PicAlert and part of VISPR, to increase the number of private images, which is already limited in PicAlert. Further details can be found in our corresponding publication: https://doi.org/10.48550/arXiv.2503.12464
List of datasets and their original source:
Notes:
Some of the models run their pipeline end-to-end with the images as input, whereas other models require different or additional inputs. These inputs include the pre-computed visual entities (scene types and object types) represented in a graph format, e.g. for a Graph Neural Network. Re-using these pre-computed visual entities allows other researchers to build new models based on these features while avoiding re-computing them on their own or for each epoch during the training of a model (faster training).
For each image of each dataset, namely PrivacyAlert, PicAlert, and VISPR, we provide the predicted scene probabilities as a .csv file, the detected objects as a .json file in COCO data format, and the node features (visual entities already organised in graph format with their features) as a .json file. For consistency, all the files are already organised in batches following the structure of the images in the datasets folder. For each dataset, we also provide the pre-computed adjacency matrix for the graph data.
Note: IPD is based on PicAlert and VISPR and therefore IPD refers to the scene probabilities and object detections of the other two datasets. Both PicAlert and VISPR must be downloaded and prepared to use IPD for training and testing.
Further details on downloading and organising data can be found in our GitHub repository: https://github.com/graphnex/privacy-from-visual-entities (see ARTIFACT-EVALUATION.md#pre-computed-visual-entitities-)
If you have any enquiries, questions, or comments, or you would like to file a bug report or a feature request, use the issue tracker of our GitHub repository.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains a collection of images featuring individual Lao characters, specifically designed for image classification tasks. The dataset is organized into folders, where each folder is named directly with the Lao character it represents (e.g., a folder named "ກ", a folder named "ຂ", and so on) and contains 100 images of that character.
The dataset comprises images of 44 distinct Lao characters, including consonants, vowels, and tone marks.
- The dataset is divided into 44 folders.
- Each folder is named with the actual Lao character it contains.
- Each folder contains 100 images of the corresponding Lao character.
- This results in a total of 4400 images in the dataset.
- Training and evaluating image classification models for Lao character recognition.
- Developing Optical Character Recognition (OCR) systems for the Lao language.
- Research in computer vision and pattern recognition for Southeast Asian scripts.
The nature of these images (white characters on a black background) lends itself well to various data augmentation techniques to improve model robustness and performance. Consider applying augmentations such as:
- Geometric Transformations:
- Zoom (in/out)
- Height and width shifts
- Rotation
- Perspective transforms
- Blurring Effects:
- Standard blur
- Motion blur
- Noise Injection:
- Gaussian noise
Applying these augmentations can help create a more diverse training set and potentially lead to better generalization on unseen data.
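As a concrete starting point, here is a hedged sketch of those augmentations using torchvision transforms (the library choice and parameter values are ours, not part of the dataset):

import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1), scale=(0.8, 1.2)),  # rotation, shifts, zoom
    transforms.RandomPerspective(distortion_scale=0.2, p=0.5),                    # perspective transform
    transforms.GaussianBlur(kernel_size=3),                                       # blurring
    transforms.ToTensor(),
    transforms.Lambda(lambda x: (x + 0.05 * torch.randn_like(x)).clamp(0, 1)),    # Gaussian noise
])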
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.
The data is provided as a train set (86% of images; 60,000 images) and a test set (14% of images; 10,000 images) only. The train set was further split to provide 80% of its images to the training set and 20% of its images to the validation set. The class labels 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 were renamed to one, two, three, four, five, six, seven, eight, nine.
@article{lecun2010mnist,
title={MNIST handwritten digit database},
author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
volume={2},
year={2010}
}
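The 80/20 train/validation split described above maps naturally onto TFDS split slicing (a minimal sketch; 'mnist' is the standard tensorflow_datasets identifier):

import tensorflow_datasets as tfds

# 80% of the original train set for training, 20% for validation, plus the test set
ds_train, ds_val, ds_test = tfds.load(
    'mnist', split=['train[:80%]', 'train[80%:]', 'test'])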
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Original paper: Miyawaki Y, Uchida H, Yamashita O, Sato M, Morito Y, Tanabe HC, Sadato N & Kamitani Y (2008) Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders. Neuron 60:915-929.
This is the fMRI data from Miyawaki et al. (2008) "Visual image reconstruction from human brain activity using a combination of multiscale local image decoders". Neuron 60:915-29. In this study, we collected fMRI activity from subjects viewing images, and constructed decoders predicting local image contrast at multiple spatial scales. The combined decoders based on a linear model successfully reconstructed presented stimuli from fMRI activity.
The experiment consisted of human subjects viewing contrast-based images of 12 x 12 flickering patches. There were two types of image viewing tasks: (1) random image viewing and (2) figure image (geometric shape or alphabet letter) viewing. For image presentation, a block design was used with rest periods between the presentation of each image. For random image patch presentation, images were presented for 6 s, followed by 6 s rest. For figure image presentation, images were presented for 12 s, followed by 12 s rest. The data from random image viewing runs were used to train the decoding models, and the trained model were evaluated with the data from figure image viewing runs.
This dataset contains two subjects ('sub-01' and 'sub-02'). The subjects performed two sessions of fMRI experiments ('ses-01' and 'ses-02'). Each session is composed of several EPI runs (TR, 2000 ms; TE, 30 ms; flip angle, 80°; voxel size, 3 × 3 × 3 mm; FOV, 192 × 192 mm; number of slices, 30; slice gap, 0 mm) and inplane T2-weighted imaging (TR, 6000 ms; TE, 57 ms; flip angle, 90°; voxel size, 0.75 × 0.75 × 3.0 mm; FOV, 192 × 192 mm). The EPI images covered the entire occipital lobe. The dataset also includes a T1-weighted anatomical reference image for each subject (TR, 2250 ms; TE, 2.98 ms for sub-01 and 3.06 ms for sub-02; TI, 900 ms; flip angle, 9°; voxel size, 1.0 × 1.0 × 1.0 mm; FOV, 256 × 256 mm). The T1w images were obtained in sessions different from the fMRI experiment sessions and stored in 'ses-anat' directories. The T1w images were defaced by pydeface (https://pypi.python.org/pypi/pydeface). All DICOM files were converted to Nifti-1 files by mri_convert in FreeSurfer. In addition, the dataset contains mask images of manually defined ROIs for each subject in the sourcedata directory (see README in sourcedata for more details).
During fMRI runs, the subject viewed contrast-based images of 12 × 12 flickering image patches. Two types of runs ('viewRandom' and 'viewFigure') were included in the experiment. In 'viewRandom' runs, random images were presented as visual stimuli. Each 'viewRandom' run consisted of 22 stimulus presentation trials and lasted for 298 s (149 volumes). The two subjects performed 20 'viewRandom' runs. In 'viewFigure' runs, either a geometric shape pattern (square, small frame, large frame, plus, X) or an alphabet letter pattern (n, e, u, r, o) was presented in each trial. In addition, data recorded while the subject viewed thin and large alphabet letter patterns (n, e, u, r, o) are included in the dataset (they are not included in the results of the original study). Each 'viewFigure' run consisted of 10 stimulus presentation trials and lasted for 268 s (134 volumes). Subjects 'sub-01' and 'sub-02' performed 12 and 10 'viewFigure' runs, respectively.
To help subjects suppress eye blinks and firmly fixate the eyes, the color of the fixation spot changed from white to red 2 s before each stimulus block started. To ensure alertness, subjects were instructed to detect the color change of the fixation (red to green, 100 ms) that occurred after a random interval of 3–5 s from the beginning of each stimulus block. Subject performance was monitored online during the experiments but was not recorded, so it is omitted from the dataset.
The value of trial_type in the task event files (*_events.tsv) indicates the type of each trial (block) as below.
rest: Rest trial (no visual stimulus).
stimulus_random: Random pattern.
stimulus_shape: Geometric shape pattern (square, small frame, large frame, plus, X).
stimulus_alphabet: Alphabet pattern (n, e, u, r, o).
stimulus_alphabet_thin: Thin alphabet pattern (n, e, u, r, o).
stimulus_alphabet_long: Long alphabet pattern (n, e, u, r, o).
Note that the results from the thin and long alphabet patterns are not included in the original paper, although the data were obtained in the same sessions.
The additional column stimulus_pattern contains the pattern of stimuli (12 × 12) presented in each stimulus trial. It is vectorized in row-major order. Each element in the vector corresponds to a patch (1.15° × 1.15°) in a stimulus pattern. 1 and 0 represent a flickering checkerboard and a gray area, respectively. For example, the stimulus pattern
000000000000000000000000000000000000000111111000000111111000000110011000000110011000000110011000000110011000000000000000000000000000000000000000
represents the following stimulus.
000000000000
000000000000
000000000000
000111111000
000111111000
000110011000
000110011000
000110011000
000110011000
000000000000
000000000000
000000000000
The column holds 'null' for rest trials.
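To recover the 12 × 12 layout from the flattened stimulus_pattern string, a single reshape is enough (a minimal sketch using the example pattern above):

import numpy as np

pattern = ('000000000000'
           '000000000000'
           '000000000000'
           '000111111000'
           '000111111000'
           '000110011000'
           '000110011000'
           '000110011000'
           '000110011000'
           '000000000000'
           '000000000000'
           '000000000000')
grid = np.array(list(pattern), dtype=int).reshape(12, 12)  # 1 = flickering checkerboard patch, 0 = gray
print(grid)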
===========================================
Pydeface was used on all anatomical images to ensure de-identification of subjects. The code can be found at https://github.com/poldracklab/pydeface
MRIQC was run on the dataset. Results are located in derivatives/mriqc. Learn more about it here: https://mriqc.readthedocs.io/en/stable/
1) www.openfmri.org/dataset/ds******/ See the comments section at the bottom of the dataset page.
2) www.neurostars.org Please tag any discussion topics with the tags openfmri and dsXXXXXX.
3) Send an email to submissions@openfmri.org. Please include the accession number in your email.
- Behavioral performance data does not accompany this dataset, as it was not submitted.
Dataset Card for aedupuga/cards-image-dataset
Dataset Description
This dataset consists of images of some of the cards from 2 different card decks, labelled as Face (0) or Value (1).
Curated by: Anuhya Edupuganti
Uses
Direct Use
- Training and evaluating image classification models
- Experimenting with image preprocessing (resizing and augmentation)
Dataset Structure
This dataset contains two splits:
original: 30 samples of cards from… See the full description on the dataset page: https://huggingface.co/datasets/aedupuga/cards-image-dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Project Page
Paper
https://arxiv.org/abs/2210.10732
Overview
OpenEarthMap is a benchmark dataset for global high-resolution land cover mapping. OpenEarthMap consists of 5000 aerial and satellite images with manually annotated 8-class land cover labels and 2.2 million segments at a 0.25-0.5m ground sampling distance, covering 97 regions from 44 countries across 6 continents. OpenEarthMap fosters research including but not limited to semantic segmentation and domain adaptation. Land cover mapping models trained on OpenEarthMap generalize worldwide and can be used as off-the-shelf models in a variety of applications.
Reference
@inproceedings{xia_2023_openearthmap,
  title     = {OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping},
  author    = {Junshi Xia and Naoto Yokoya and Bruno Adriano and Clifford Broni-Bediako},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2023},
  pages     = {6254-6264}
}
License
Label data of OpenEarthMap are provided under the same license as the original RGB images, which varies with each source dataset. For more details, please see the attribution of source data here. Label data for regions where the original RGB images are in the public domain or where the license is not explicitly stated are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Note for xBD data
The RGB images of xBD dataset are not included in the OpenEarthMap dataset. Please download the xBD RGB images from https://xview2.org/dataset and add them to the corresponding folders. The "xbd_files.csv" contains information about how to prepare the xBD RGB images and add them to the corresponding folders.
Code
Sample code to add the xBD RGB images to the distributed OpenEarthMap dataset and to train baseline models is available here.
Leaderboard
Performance on the test set can be evaluated on the Codalab webpage.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Horikawa, T. & Kamitani, Y. (2017) Generic decoding of seen and imagined objects using hierarchical visual features. Nature Communications 8:15037. https://www.nature.com/articles/ncomms15037
In this study, fMRI data was recorded while subjects were viewing object images (image presentation experiment) or were imagining object images (imagery experiment). The image presentation experiment consisted of two distinct types of sessions: training image sessions and test image sessions. In the training image session, a total of 1,200 images from 150 object categories (8 images from each category) were each presented only once (24 runs). In the test image session, a total of 50 images from 50 object categories (1 image from each category) were presented 35 times each (35 runs). All images were taken from ImageNet (http://www.image-net.org/, Fall 2011 release), a large-scale hierarchical image database. During the image presentation experiment, subjects performed a one-back image repetition task (5 trials in each run). In the imagery experiment, subjects were required to visually imagine images from 1 of the 50 categories (20 runs; 25 categories in each run; 10 samples for each category) that were presented in the test image session of the image presentation experiment. fMRI data in the training image sessions were used to train models (decoders) which predict visual features from fMRI patterns, and those in the test image sessions and the imagery experiment were used to evaluate the model performance. Predicted features for the test image sessions and imagery experiment are used to identify seen/imagined object categories from a set of computed features for numerous object images.
Analysis demo code is available at GitHub (KamitaniLab/GenericObjectDecoding).
The present dataset contains fMRI data from five subjects ('sub-01', 'sub-02', 'sub-03', 'sub-04', and 'sub-05'). Each subject data contains three types of MRI data each of which was collected over multiple scanning sessions.
Each scanning session consisted of functional (EPI) and anatomical (inplane T2) data. The functional EPI images covered the entire brain (TR, 3000 ms; TE, 30 ms; flip angle, 80°; voxel size, 3 × 3 × 3 mm; FOV, 192 × 192 mm; number of slices, 50, slice gap, 0 mm) and inplane T2-weighted anatomical images were acquired with the same slices used for the EPI (TR, 7020 ms; TE, 69 ms; flip angle, 160°; voxel size, 0.75 × 0.75 × 3.0 mm; FOV, 192 × 192 mm). The dataset also includes a T1-weighted anatomical reference image for each subject (TR, 2250 ms; TE, 3.06 ms; TI, 900 ms; flip angle, 9°; voxel size, 1.0 × 1.0 × 1.0 mm; FOV, 256 × 256 mm). The T1-weighted images were scanned only once for each subject in a separate scanning session and are stored in 'ses-anatomy' directories. The T1-weighted images were defaced by pydeface (https://pypi.python.org/pypi/pydeface). All DICOM files are converted to Nifti-1 files by mri_convert in FreeSurfer. In addition, the dataset contains mask images of manually defined ROIs for each subject in 'sourcedata' directory (See 'README' in 'sourcedata' for more details).
Preprocessed fMRI data are available in derivatives/preproc-spm. See the original paper (Horikawa & Kamitani, 2017) for the details of preprocessing.
Task event files (‘sub-*_ses-*_task-*_run-*_events.tsv’) contain events (stimulus presentation, subject responses, etc.) recorded during fMRI runs. In the task event files for the perception task (‘ses-perceptionTraining' and 'ses-perceptionTest'), each column represents:
In task event files for imagery task ('ses-imageryTest'), each column represents:
The stimulus images are named like 'n03626115_19498', where 'n03626115' is the ImageNet/WordNet ID for a synset (category) and '19498' is the image ID. The categories are named by their ImageNet/WordNet synset ID (e.g., 'n03626115'). The stimulus and category names are included in the task event files as 'stimulus_name' and 'category_name', respectively. For use in analysis code, the task event files also contain 'stimulus_id' and 'category_id', which are float numbers generated from the stimulus or category names (e.g., 'n03626115_19498' --> 3626115.019498).
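For illustration, a small sketch of that name-to-ID conversion (zero-padding the image ID to six decimal digits is inferred from the example above and should be treated as an assumption):

def stimulus_name_to_id(name):
    # e.g., 'n03626115_19498' -> 3626115.019498
    synset, image = name.lstrip('n').split('_')
    return float(f"{int(synset)}.{int(image):06d}")

print(stimulus_name_to_id('n03626115_19498'))  # 3626115.019498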
The mapping between stimulus/category names and IDs:
Because of licensing issues, we do not include the stimulus images in the dataset. A script downloading the images from ImageNet is available at https://github.com/KamitaniLab/GenericObjectDecoding. Image features (CNN unit responses, HMAX, GIST, and SIFT) used in the original study are available at https://figshare.com/articles/Generic_Object_Decoding/7387130.
Introduction
Vessel segmentation in fundus images is essential in the diagnosis and prognosis of retinal diseases and the identification of image-based biomarkers. However, creating a vessel segmentation map can be a tedious and time-consuming process, requiring careful delineation of the vasculature, which is especially hard for microcapillary plexi in fundus images. Optical coherence tomography angiography (OCT-A) is a relatively novel modality visualizing blood flow and microcapillary plexi not clearly observed in fundus photography. Unfortunately, current commercial OCT-A cameras have various limitations due to their complex optics, which make them more expensive, less portable, and limited to a smaller field of view (FOV) than fundus cameras. Moreover, the vast majority of population health data collection efforts do not include OCT-A data.
We believe that strategies able to map fundus images to en-face OCT-A can create precise vascular vessel segmentation with less effort.
In this dataset, called UTHealth - Fundus and Synthetic OCT-A Dataset (UT-FSOCTA), we include fundus images and en-face OCT-A images for 112 subjects. The two modalities have been manually aligned to allow for training of medical imaging machine learning pipelines. This dataset is accompanied by a manuscript that describes an approach to generate fundus vessel segmentations using OCT-A for training (Coronado et al., 2022). We refer to this approach as "Synthetic OCT-A".
Fundus Imaging
We include 45 degree macula-centered fundus images that cover both the macula and the optic disc. All images were acquired using an OptoVue iVue fundus camera without pupil dilation.
The full images are available at the fov45/fundus directory. In addition, we extracted the FOVs corresponding to the en-face OCT-A images collected in cropped/fundus/disc and cropped/fundus/macula.
Enface OCT-A
We include the en-face OCT-A images of the superficial capillary plexus. All images were acquired using an OptoVue Avanti OCT camera with OCT-A reconstruction software (AngioVue). Low quality images with errors in the retina layer segmentations were not included.
En-face OCT-A images are located in cropped/octa/disc and cropped/octa/macula. In addition, we include a denoised version of these images in which only vessels are included. This was performed automatically using the ROSE algorithm (Ma et al. 2021). These can be found in cropped/GT_OCT_net/noThresh and cropped/GT_OCT_net/Thresh; the former contains the probability maps from the ROSE algorithm, the latter a binary map.
Synthetic OCT-A
We train a custom conditional generative adversarial network (cGAN) to map a fundus image to an en face OCT-A image. Our model consists of a generator synthesizing en face OCT-A images from corresponding areas in fundus photographs and a discriminator judging the resemblance of the synthesized images to the real en face OCT-A samples. This allows us to avoid the use of manual vessel segmentation maps altogether.
The full images are available in the fov45/synthetic_octa directory. Then, we extracted the FOVs corresponding to the en-face OCT-A images, collected in cropped/synthetic_octa/disc and cropped/synthetic_octa/macula. In addition, we applied the same ROSE denoising algorithm (Ma et al. 2021) used for the original en-face OCT-A images; the results are available in cropped/denoised_synthetic_octa/noThresh and cropped/denoised_synthetic_octa/Thresh, where the former contains the probability maps from the ROSE algorithm and the latter a binary map.
Other Fundus Vessel Segmentations Included
In this dataset, we have also included the output of two recent vessel segmentation algorithms trained on external datasets with manual vessel segmentations: SA-UNet (Guo et al., 2021) and IterNet (Li et al., 2020).
SA-Unet. The full images are available at the fov45/SA_Unet directory. Then, we extracted the FOVs corresponding to the en-face OCT-A images collected in cropped/SA_Unet/disc and cropped/SA_Unet/macula.
IterNet. The full images are available at the fov45/Iternet directory. Then, we extracted the FOVs corresponding to the en-face OCT-A images collected in cropped/Iternet/disc and cropped/Iternet/macula.
Train/Validation/Test Replication
In order to replicate or compare your model to the results of our paper, we report below the data split used.
Training subjects IDs: 1 - 25
Validation subjects IDs: 26 - 30
Testing subjects IDs: 31 - 112
Data Acquisition
This dataset was acquired at the Texas Medical Center - Memorial Hermann Hospital in accordance with the guidelines from the Helsinki Declaration and it was approved by the UTHealth IRB with protocol HSC-MS-19-0352.
User Agreement
The UT-FSOCTA dataset is free to use for non-commercial scientific research only. In case of any publication, the following paper needs to be cited:
Coronado I, Pachade S, Trucco E, Abdelkhaleq R, Yan J, Salazar-Marioni S, Jagolino-Cole A, Bahrainian M, Channa R, Sheth SA, Giancardo L. Synthetic OCT-A blood vessel maps using fundus images and generative adversarial networks. Sci Rep 2023;13:15325. https://doi.org/10.1038/s41598-023-42062-9.
Funding
This work is supported by the Translational Research Institute for Space Health through NASA Cooperative Agreement NNX16AO69A.
Research Team and Acknowledgements
Here are the people behind this data acquisition effort:
Ivan Coronado, Samiksha Pachade, Rania Abdelkhaleq, Juntao Yan, Sergio Salazar-Marioni, Amanda Jagolino, Mozhdeh Bahrainian, Roomasa Channa, Sunil Sheth, Luca Giancardo
We would also like to acknowledge, for their support: the Institute for Stroke and Cerebrovascular Diseases at UTHealth, the VAMPIRE team at the University of Dundee, UK, and the Memorial Hermann Hospital System.
References
Coronado I, Pachade S, Trucco E, Abdelkhaleq R, Yan J, Salazar-Marioni S, Jagolino-Cole A, Bahrainian M, Channa R, Sheth SA, Giancardo L. Synthetic OCT-A blood vessel maps using fundus images and generative adversarial networks. Sci Rep 2023;13:15325. https://doi.org/10.1038/s41598-023-42062-9.
C. Guo, M. Szemenyei, Y. Yi, W. Wang, B. Chen, and C. Fan, "SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation," in 2020 25th International Conference on Pattern Recognition (ICPR), Jan. 2021, pp. 1236–1242. doi: 10.1109/ICPR48806.2021.9413346.
L. Li, M. Verma, Y. Nakashima, H. Nagahara, and R. Kawasaki, "IterNet: Retinal Image Segmentation Utilizing Structural Redundancy in Vessel Networks," 2020 IEEE Winter Conf. Appl. Comput. Vis. WACV, 2020, doi: 10.1109/WACV45572.2020.9093621.
Y. Ma et al., "ROSE: A Retinal OCT-Angiography Vessel Segmentation Dataset and New Model," IEEE Trans. Med. Imaging, vol. 40, no. 3, pp. 928–939, Mar. 2021, doi: 10.1109/TMI.2020.3042802.
Color pattern variation provides biological information in fields ranging from disease ecology to speciation dynamics. Comparing color pattern geometries across images requires color segmentation, where pixels in an image are assigned to one of a set of color classes shared by all images. Manual methods for color segmentation are slow and subjective, while automated methods can struggle with high technical variation in aggregate image sets. We present recolorize, an R package toolbox for human-subjective color segmentation with functions for batch-processing low-variation image sets and additional tools for handling images from diverse (high variation) sources. The package also includes export options for a variety of formats and color analysis packages. This paper illustrates recolorize for three example datasets, including high variation, batch processing, and combining with reflectance spectra, and demonstrates the downstream use of methods that rely on this output.
The included dataset is a copy of the code and images found in the recolorize_examples GitHub repository as of January 2024: https://github.com/hiweller/recolorize_examples The code includes all steps necessary to recreate the analyses presented in the paper. Images were sourced as follows:
Figure 1: Images of Chrysochroa beetles by Nathan P. Lord (coauthor).
Figure 2: Pygoplites diacanthus image from John E. Randall/Bishop Museum (http://pbs.bishopmuseum.org/images/JER/detail.asp?size=i&cols=10&ID=1432937402).
Figure 3: Images of Neolamprologus fishes taken by Ad Konings, used with his kind permission.
Figure 4: Images of Polistes fuscatus wasps taken by James Tumulty, a subset of the images used for analysis in Tumulty et al. (2023): https://doi.org/10.1016/j.cub.2023.11.032
Figure 5: Images of Diglossa birds taken by Anna E. Hiller (coauthor).
# Code and example images from 'recolorize: An R package for flexible color segmentation of biological images'
This repository contains example files and code from the methods paper (Weller, Hiller, Lord, and Van Belleghem, 2024) describing how to use the recolorize R package.
All examples were developed and tested using R version 4.3.2 (2023-10-31, "Eye Holes"), and R version >3.5.0 is recommended.
The main directory contains folders for each of the examples demonstrated in the paper (01_beetles, 03_cichlids, 04_wasps, and 05_birds), as well as two R scripts (patPCA_total.R and 00_installation_RUN_ME_FIRST.R) and an R project file (recolorize_examples.Rproj).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Adversarial patches are optimized contiguous pixel blocks in an input image that cause a machine-learning model to misclassify it. However, their optimization is computationally demanding and requires careful hyperparameter tuning. To overcome these issues, we propose ImageNet-Patch, a dataset to benchmark machine-learning models against adversarial patches. It consists of a set of patches optimized to generalize across different models and applied to ImageNet data after preprocessing them with affine transformations. This process enables an approximate yet faster robustness evaluation, leveraging the transferability of adversarial perturbations.
We release our dataset as a set of folders indicating the patch target label (e.g., banana), each containing 1000 subfolders corresponding to the ImageNet output classes.
An example showing how to use the dataset is shown below.
import os.path

import torch.utils.data
from torchvision import datasets, transforms, models


class ImageFolderWithEmptyDirs(datasets.ImageFolder):
    """
    This is required for handling empty folders from the ImageFolder class.
    """

    def find_classes(self, directory):
        classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
        if not classes:
            raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
        class_to_idx = {cls_name: i for i, cls_name in enumerate(classes) if
                        len(os.listdir(os.path.join(directory, cls_name))) > 0}
        return classes, class_to_idx


dataset_folder = 'data/ImageNet-Patch'
available_labels = {
    487: 'cellular telephone',
    513: 'cornet',
    546: 'electric guitar',
    585: 'hair spray',
    804: 'soap dispenser',
    806: 'sock',
    878: 'typewriter keyboard',
    923: 'plate',
    954: 'banana',
    968: 'cup'
}
target_label = 954

dataset_folder = os.path.join(dataset_folder, str(target_label))
normalizer = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                  std=[0.229, 0.224, 0.225])
transforms = transforms.Compose([
    transforms.ToTensor(),
    normalizer
])

dataset = ImageFolderWithEmptyDirs(dataset_folder, transform=transforms)
model = models.resnet50(pretrained=True)
loader = torch.utils.data.DataLoader(dataset, shuffle=True, batch_size=5)
model.eval()

batches = 10
correct, attack_success, total = 0, 0, 0
for batch_idx, (images, labels) in enumerate(loader):
    if batch_idx == batches:
        break
    pred = model(images).argmax(dim=1)
    correct += (pred == labels).sum()
    attack_success += sum(pred == target_label)
    total += pred.shape[0]

accuracy = correct / total
attack_sr = attack_success / total

print("Robust Accuracy: ", accuracy)
print("Attack Success: ", attack_sr)
ILSVRC 2012, commonly known as 'ImageNet', is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, the majority of them nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.
The test split contains 100K images but no labels because no labels have been publicly released. We provide support for the test split from 2012 with the minor patch released on October 10, 2019. In order to manually download this data, a user must perform the following operations:
The resulting tar-ball may then be processed by TFDS.
To assess the accuracy of a model on the ImageNet test split, one must run inference on all images in the split and export those results to a text file that is then uploaded to the ImageNet evaluation server. The maintainers of the ImageNet evaluation server permit a single user to submit up to 2 submissions per week in order to prevent overfitting.
To evaluate the accuracy on the test split, one must first create an account at image-net.org. This account must be approved by the site administrator. After the account is created, one can submit the results to the test server at https://image-net.org/challenges/LSVRC/eval_server.php. The submission consists of several ASCII text files corresponding to multiple tasks. The task of interest is "Classification submission (top-5 cls error)". A sample of an exported text file looks like the following:
771 778 794 387 650
363 691 764 923 427
737 369 430 531 124
755 930 755 59 168
The export format is described in full in "readme.txt" within the 2013 development kit available here: https://image-net.org/data/ILSVRC/2013/ILSVRC2013_devkit.tgz. Please see the section entitled "3.3 CLS-LOC submission format". Briefly, the text file contains 100,000 lines, one for each image in the test split. Each line of integers corresponds to the rank-ordered top-5 predictions for that test image. The integers are 1-indexed, corresponding to the line number in the labels file (see labels.txt).
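As a sketch of that export format (the prediction values below are placeholders): one line per test image, each holding five space-separated, 1-indexed class numbers.

import numpy as np

predictions = np.random.randint(1, 1001, size=(100000, 5))  # placeholder rank-ordered top-5 labels
with open('submission.txt', 'w') as f:
    for row in predictions:
        f.write(' '.join(str(int(v)) for v in row) + '\n')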
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('imagenet2012', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/imagenet2012-5.1.0.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for Car Images Dataset
Dataset Description
This dataset contains a collection of car images designed to support tasks such as image classification, car detection, and autonomous driving research. The images feature cars in various settings, including streets and other environments, captured to provide diverse training data for computer vision models. The dataset aims to enable the development and benchmarking of models that can:
Identify different types… See the full description on the dataset page: https://huggingface.co/datasets/KAI-KratosAI/car-images.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The CIFAR-10 and CIFAR-100 datasets contain labeled subsets of the 80 Million Tiny Images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
* More info on CIFAR-100: https://www.cs.toronto.edu/~kriz/cifar.html
* TensorFlow listing of the dataset: https://www.tensorflow.org/datasets/catalog/cifar100
* GitHub repo for converting CIFAR-100 tarball files to png format: https://github.com/knjcode/cifar2png
The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images [in the original dataset].
This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). However, this project does not contain the superclasses.
* Superclasses version: https://universe.roboflow.com/popular-benchmarks/cifar100-with-superclasses/
More background on the dataset:
CIFAR-100 Dataset Classes and Superclasses: https://i.imgur.com/5w8A0Vm.png
The data is provided as a train set (83.33% of images; 50,000 images) and a test set (16.67% of images; 10,000 images) only. The train set was further split to provide 80% of its images to the training set (approximately 40,000 images) and 20% of its images to the validation set (approximately 10,000 images).
@TECHREPORT{Krizhevsky09learningmultiple,
author = {Alex Krizhevsky},
title = {Learning multiple layers of features from tiny images},
institution = {},
year = {2009}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.
Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.
We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:
Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.
Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
For each image, we provide a pixel-wise instance segmentation for all separable neurons.
Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
The segmentation mask for each neuron is stored in a separate channel.
The order of dimensions is CZYX.
We recommend working in a virtual environment, e.g., by using conda:
conda create -y -n flylight-env -c conda-forge python=3.9
conda activate flylight-env
pip install zarr
import zarr
# replace "path/to/sample.zarr" with the path to a sample file
raw = zarr.open("path/to/sample.zarr", mode='r', path="volumes/raw")
seg = zarr.open("path/to/sample.zarr", mode='r', path="volumes/gt_instances")

# optional:
import numpy as np
raw_np = np.array(raw)
Zarr arrays are read lazily on-demand.
Many functions that expect numpy arrays also work with zarr arrays.
Optionally, the arrays can also explicitly be converted to numpy arrays.
We recommend using napari to view the image data.
pip install "napari[all]"
import zarr, sys, napari
raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

viewer = napari.Viewer(ndisplay=3)
for idx, gt in enumerate(gts):
    viewer.add_labels(gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
napari.run()
python view_data.py <path-to-zarr-file>
For more information on our selected metrics and formal definitions please see our paper.
To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt application-specific color clustering from Duan et al.
For detailed information on the methods and the quantitative results please see our paper.
The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
If you use FISBe in your research, please use the following BibTeX entry:
@misc{mais2024fisbe,
title = {FISBe: A real-world benchmark dataset for instance
segmentation of long-range thin filamentous structures},
author = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya
Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena
Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
year = 2024,
eprint = {2404.00130},
archivePrefix ={arXiv},
primaryClass = {cs.CV}
}
We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable
discussions.
P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
This work was co-funded by Helmholtz Imaging.
There have been no changes to the dataset so far.
All future changes will be listed on the changelog page.
If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.
All contributions are welcome!