U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
This Image Gallery is provided as a complimentary source of high-quality digital photographs available from the Agricultural Research Service information staff. The photos (over 2,000 JPEGs) in the Image Gallery are copyright-free, public-domain images unless otherwise indicated. Resources in this dataset: Resource Title: USDA ARS Image Gallery (Web page). File Name: Web Page, URL: https://www.ars.usda.gov/oc/images/image-gallery/ - over 2,000 copyright-free images from ARS staff.
The LIVE Public-Domain Subjective Image Quality Database is a resource developed by the Laboratory for Image and Video Engineering at the University of Texas at Austin. It contains a set of images and videos whose quality has been ranked by human subjects. This database is used in Quality Assessment (QA) research, which aims to make quality predictions that align with the subjective opinions of human observers.
The database was created through an extensive experiment conducted in collaboration with the Department of Psychology at the University of Texas at Austin. The experiment obtained scores from human subjects for many images distorted with different distortion types. A QA algorithm may be trained on part of this data set and tested on the rest.
The database is available to the research community free of charge. If you use these images in your research, the creators kindly ask that you reference their website and their papers. There are two releases of the database. Release 2 includes more distortion types and more subjects than Release 1. The distortions include JPEG-compressed images, JPEG2000-compressed images, Gaussian blur, and white noise.
The EROS Image Gallery collection is composed of a wide variety of images, ranging from low-altitude aircraft imagery to satellite and NASA imagery; oblique photographs and ground imagery are also included in this primarily USGS collection. These images were used in publications, posters, and special projects. They have been scanned, indexed, and are searchable for no-cost downloads by the science community, educators, and the general public. Included in this gallery are the Earth As Art 1, 2, and 3 collections and Landsat mosaics. Also included are special images of elevation data, natural disasters, images that capture the natural beauty of Earth phenomena, and unique perspectives of rivers, lakes, seas, mountains, icebergs, and national parks around the world. These collections were developed more for their aesthetic beauty than for scientific interpretation. Over time, the EROS Image Gallery will continue to grow as data sets from the "Image of the Week" postings are retired to the gallery. The "Image of the Week" posters are scientific observations that highlight current conditions or changes in water resources and land cover over time.
DOI not yet active - Publication under review. This package contains images from camera traps used to monitor streams for the presence of grey heron (Ardea cinerea). The six streams are located around Lake Lucerne, each with 3 to 4 cameras. Folders indicate the stream and camera (e.g. GBU1, SBU3, etc.). This image dataset can be analyzed with the code pipeline given in Burkard, Y., Francazi, E., Lavender, E. J. N., Brodersen, J., Volpi, M., Baity Jesi, M., & Moor, H. (2024). Data for: Automated single species identification in camera trap images: architecture choice, training strategies, and the interpretation of performance metrics (Version 1.0). Eawag: Swiss Federal Institute of Aquatic Science and Technology. https://doi.org/10.25678/000DHT NOTE: Images containing humans have been removed from this dataset to enable publication. This was done after the analyses presented in Burkard et al. (https://doi.org/10.25678/000DHT), so the numbers of images given in the Appendix may not exactly match the numbers of images contained in this dataset.
GNU GPL 2.0 (http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html)
This dataset consists of various types of cars. The dataset is organized into 2 folders (train, test) and contains subfolders for each car category. There are 4,165 images (JPG) and 7 classes of cars.
Please give credit to this dataset if you download it.
Database of CDC's pictures organized into hierarchical categories of people, places, and science, presented as single images, image sets, and multimedia files. Much of the information critical to the communication of public health messages is pictorial rather than text-based. Created by a Working Group at the Centers for Disease Control and Prevention (CDC), the PHIL offers an organized, universal electronic gateway to CDC's pictures. Public health professionals, the media, laboratory scientists, educators, students, and the worldwide public are welcome to use this material for reference, teaching, presentation, and public health messages.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Fundus photography is a viable option for glaucoma population screening. In order to facilitate the development of computer-aided glaucoma detection systems, we publish this annotation dataset that contains manual annotations of glaucoma features for seven public fundus image data sets. All manual annotations are made by a specialised ophthalmologist. For each of the fundus images in the seven fundus datasets, the upper, bottom, left and right boundary coordinates of the optic disc and the cup are stored in a .mat file with the corresponding fundus image name. The seven public fundus image data sets are: CHASEDB (https://blogs.kingston.ac.uk/retinal/chasedb1/), Diaretdb1_v_1_1 (https://www.it.lut.fi/project/imageret/diaretdb1/), DRISHTI (http://cvit.iiit.ac.in/projects/mip/drishti-gs/mip-dataset2/Home.php), DRIONS-DB (http://www.ia.uned.es/~ejcarmona/DRIONS-DB.html), DRIVE (https://www.isi.uu.nl/Research/Databases/DRIVE/), HRF (https://www5.cs.fau.de/research/data/fundus-images/), and Messidor (http://www.adcis.net/en/Download-Third-Party/Messidor.html). Researchers are encouraged to use this set to train or validate their systems for automatic glaucoma detection. When you use this set, please cite our published paper: J. Guo, G. Azzopardi, C. Shi, N. M. Jansonius and N. Petkov, "Automatic Determination of Vertical Cup-to-Disc Ratio in Retinal Fundus Images for Glaucoma Screening," in IEEE Access, vol. 7, pp. 8527-8541, 2019, doi: 10.1109/ACCESS.2018.2890544.
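As a rough illustration, the sketch below loads one per-image .mat annotation file and derives a vertical cup-to-disc ratio from the stored boundary coordinates. The key names (disc_top, disc_bottom, cup_top, cup_bottom) are hypothetical placeholders, not the dataset's documented field names, so check the actual keys before relying on them.

```python
# Minimal sketch: load one per-image annotation .mat file and compute a
# vertical cup-to-disc ratio (vCDR). The key names below are hypothetical
# placeholders; inspect the real keys before relying on them.
from scipy.io import loadmat

mat = loadmat("image_001.mat")                       # one annotation per fundus image
print([k for k in mat if not k.startswith("__")])    # discover the actual keys

# Assumption: the file stores y-coordinates of the upper/bottom boundaries.
disc_top, disc_bottom = float(mat["disc_top"]), float(mat["disc_bottom"])
cup_top, cup_bottom = float(mat["cup_top"]), float(mat["cup_bottom"])

vcdr = (cup_bottom - cup_top) / (disc_bottom - disc_top)
print(f"vertical cup-to-disc ratio: {vcdr:.3f}")
```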
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the article, we trained and evaluated models on the Image Privacy Dataset (IPD) and the PrivacyAlert dataset. The datasets are originally provided by other sources and have been re-organised and curated for this work.
Our curation organises the datasets in a common structure. We updated the annotations and labelled the splits of the data in the annotation file. This avoids having separate folders of images for each data split (training, validation, testing) and allows flexible handling of new splits, e.g. created with a stratified K-Fold cross-validation procedure. For the original datasets (PicAlert and PrivacyAlert), we provide bash scripts with links to download the images. Another bash script re-organises the images into sub-folders with a maximum of 1000 images per folder.
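For example, new splits could be added to such an annotation file with scikit-learn's stratified K-Fold; this is a minimal sketch, and the column names ("image", "label") are illustrative placeholders rather than the dataset's actual schema.

```python
# Sketch: add stratified K-Fold split labels to an annotation table.
# The column names ("image", "label") are illustrative placeholders, not
# the dataset's actual schema.
import pandas as pd
from sklearn.model_selection import StratifiedKFold

ann = pd.read_csv("annotations.csv")                 # one row per image
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

ann["fold"] = -1
for fold, (_, val_idx) in enumerate(skf.split(ann["image"], ann["label"])):
    ann.loc[ann.index[val_idx], "fold"] = fold       # each image lands in exactly one fold

ann.to_csv("annotations_with_folds.csv", index=False)
```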
Both datasets refer to images publicly available on Flickr. These images have a large variety of content, including sensitive content such as semi-nude people, vehicle plates, documents, and private events. Images were annotated with a binary label denoting whether the content was deemed to be public or private. As the images are publicly available, their label is mostly public, so these datasets have a high imbalance towards the public class. Note that IPD combines two other existing datasets, PicAlert and part of VISPR, to increase the number of private images, which is already limited in PicAlert. Further details are in our corresponding publication: https://doi.org/10.48550/arXiv.2503.12464
List of datasets and their original source:
Notes:
Some of the models run their pipeline end-to-end with the images as input, whereas other models require different or additional inputs. These inputs include the pre-computed visual entities (scene types and object types) represented in a graph format, e.g. for a Graph Neural Network. Re-using these pre-computed visual entities allows other researchers to build new models based on these features while avoiding re-computing them on their own or at each epoch during training (faster training).
For each image of each dataset, namely PrivacyAlert, PicAlert, and VISPR, we provide the predicted scene probabilities as a .csv file, the detected objects as a .json file in COCO data format, and the node features (visual entities already organised in graph format with their features) as a .json file. For consistency, all the files are organised in batches following the structure of the images in the datasets folder. For each dataset, we also provide the pre-computed adjacency matrix for the graph data.
Note: IPD is based on PicAlert and VISPR and therefore IPD refers to the scene probabilities and object detections of the other two datasets. Both PicAlert and VISPR must be downloaded and prepared to use IPD for training and testing.
Further details on downloading and organising data can be found in our GitHub repository: https://github.com/graphnex/privacy-from-visual-entities (see ARTIFACT-EVALUATION.md#pre-computed-visual-entitities-)
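A minimal sketch of how the pre-computed per-image files described above could be read back; the file names here are placeholders, and the exact layout and field names should be taken from the GitHub repository.

```python
# Sketch: read the pre-computed visual entities for a batch of images.
# File names are placeholders; the actual layout and field names are
# documented in the GitHub repository.
import json
import pandas as pd

scene_probs = pd.read_csv("scene_probabilities.csv")   # one row per image

with open("object_detections.json") as f:
    detections = json.load(f)                           # COCO-style detections

# COCO-format files keep annotations keyed by image id.
per_image = {}
for det in detections.get("annotations", []):
    per_image.setdefault(det["image_id"], []).append(det)

print(f"{len(scene_probs)} images with scene probabilities, "
      f"{len(per_image)} images with object detections")
```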
If you have any enquiries, questions, or comments, or you would like to file a bug report or a feature request, please use the issue tracker of our GitHub repository.
The image set is subdivided into nine .zip files according to the first digit of the image id (Ref No in the CSV files or Ref R in the PDF catalogues) and contains in total 25,682 image files with an average size of 203 kB. The distribution of image files is as follows: n°1: 6855, n°2: 6974, n°3: 7984, n°4: 611, n°5: 683, n°6: 586, n°7: 547, n°8: 751, and n°9: 691. The image id number is followed by the name of the artist, so images can also be retrieved by artist name. Given the goal of the Thematic Research Collection, i.e. categorization of the topic of the artwork, neither the size nor the quality of the images was of concern. All sources have been described in the PDF catalogues or in the CSV files. It is believed that most images are in the public domain. The image set relates to the artworks (sculptures, reliefs, paintings, frescoes, drawings, prints and illustrations) compiled in the six Topical Catalogues of the Iconography of Venus from the Middle Ages to Modern Times and to the Artworks of non-European artists. Some images have duplicates in black-and-white or at different sizes.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
The dataset consists of 2101 images containing a total of 15,511 elephants. It is split into training and test subsets, with 1649 images containing 12,455 elephants in the training set and 452 images containing 3056 elephants in the test set. The resolution of the images varies between 2.4 cm/pixel and 13 cm/pixel, but the nominal resolution for each image is specified in the accompanying metadata, so it is a simple matter to resample images to a consistent GSD. Because acquired images often overlap, the same individuals may sometimes be seen in 2 or 3 consecutive images. Care has been taken with the train/test split to ensure that such clusters of related images are not split, thus maintaining independence of the training and test sets.
These images were acquired over the course of 8 separate campaigns in different environments.
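For instance, resampling to a common GSD only requires scaling each image by the ratio of its nominal resolution to the target resolution; in the sketch below, the file path and the metadata field carrying the nominal resolution are placeholders.

```python
# Sketch: resample an aerial image to a target ground sampling distance
# (GSD). "nominal_gsd_cm" stands for the per-image resolution given in the
# accompanying metadata (between 2.4 and 13 cm/pixel); the path is a placeholder.
from PIL import Image

def resample_to_gsd(path, nominal_gsd_cm, target_gsd_cm=10.0):
    img = Image.open(path)
    scale = nominal_gsd_cm / target_gsd_cm            # <1 shrinks, >1 enlarges
    new_size = (round(img.width * scale), round(img.height * scale))
    return img.resize(new_size, Image.BILINEAR)

resampled = resample_to_gsd("train/img_0001.jpg", nominal_gsd_cm=2.4)
```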
https://www.nlm.nih.gov/databases/download/terms_and_conditions.html
This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: NLM-Visible-Human-Project. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.
The NLM Visible Human Project [2] has created publicly-available complete, anatomically detailed, three-dimensional representations of a human male body and a human female body. Specifically, the VHP provides a public-domain library of cross-sectional cryosection, CT, and MRI images obtained from one male cadaver and one female cadaver. The Visible Man data set was publicly released in 1994 and the Visible Woman in 1995.
The data sets were designed to serve as (1) a reference for the study of human anatomy, (2) public-domain data for testing medical imaging algorithms, and (3) a test bed and model for the construction of network-accessible image libraries. The VHP data sets have been applied to a wide range of educational, diagnostic, treatment planning, virtual reality, artistic, mathematical, and industrial uses. About 4,000 licensees from 66 countries were authorized to access the datasets. As of 2019, a license is no longer required to access the VHP datasets.
Courtesy of the U.S. National Library of Medicine. Release of this collection by IDC does not indicate or imply that NLM has endorsed its products/services/applications. Please see the Visible Human Project information page to learn more about the images and to obtain any supporting metadata for this collection. Note that this collection may not reflect the most current/accurate data available from NLM.
Citation guidelines can be found on the National Library of Medicine Terms and Conditions information page.
A manifest file's name indicates the IDC data release in which a version of the collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

nlm_visible_human_project-idc_v15-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
nlm_visible_human_project-idc_v15-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
nlm_visible_human_project-idc_v15-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while those that end in -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.

Each of the manifests includes instructions in the header on how to download the included files.

To download the files using the .s5cmd manifests, install the idc-index package (pip install --upgrade idc-index) and then pass the manifest to the downloader: idc download manifest.s5cmd. To download the files using the .dcf manifest, see the manifest header.
The Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.
[1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180
[2] Spitzer, V., Ackerman, M. J., Scherzinger, A. L. & Whitlock, D. The visible human male: a technical report. J. Am. Med. Inform. Assoc. 3, 118–130 (1996). https://doi.org/10.1136/jamia.1996.96236280
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Our dataset consists of the images associated with textual questions. One entry (instance) in our dataset is a question-image pair labeled with the ground truth coordinates of a bounding box containing the visual answer to the given question. The images were obtained from a CC BY-licensed subset of the Microsoft Common Objects in Context dataset, MS COCO. All data labeling was performed on the Toloka crowdsourcing platform, https://toloka.ai/.
Our dataset has 45,199 instances split among three subsets: train (38,990 instances), public test (1,705 instances), and private test (4,504 instances). The entire train set was available to everyone from the start of the challenge. The public test set became available during the evaluation phase of the competition, but without ground truth labels. After the end of the competition, the public and private test sets were released.
The datasets are provided as files in comma-separated values (CSV) format containing the following columns.
Column | Type | Description
image | string | URL of an image on a public content delivery network
width | integer | image width
height | integer | image height
left | integer | bounding box coordinate: left
top | integer | bounding box coordinate: top
right | integer | bounding box coordinate: right
bottom | integer | bounding box coordinate: bottom
question | string | question in English
This upload also contains a ZIP file with the images from MS COCO.
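Given the columns above, one row can be turned into a cropped visual answer roughly as in the sketch below; the CSV file name is a placeholder for whichever split is being read.

```python
# Sketch: fetch one image from its URL and crop the ground-truth bounding
# box, using the columns listed above. The CSV file name is a placeholder
# for whichever split is being read.
import io
import pandas as pd
import requests
from PIL import Image

rows = pd.read_csv("train.csv")
row = rows.iloc[0]

img = Image.open(io.BytesIO(requests.get(row["image"], timeout=30).content))
answer_region = img.crop((row["left"], row["top"], row["right"], row["bottom"]))
print(row["question"], answer_region.size)
```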
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
The Pornography dataset contains 18,000 images. For the pornographic class, we browsed websites that only host that kind of material (solving, in a way, the matter of purpose) and some social media platforms, and extracted some images from movies. The database consists of several genres of pornography and depicts actors of many ethnicities, including multi-ethnic ones. For the non-pornographic class, we browsed general-purpose image platforms. In the figure below, we illustrate the diversity of the pornographic images and the challenges of the non-pornographic ones. The huge diversity of cases in both pornographic and non-pornographic images makes this task very challenging.
Disclaimer
THIS DATABASE IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The videos, segments, and images provided were produced by third parties, who may have retained copyrights. They are provided strictly for non-profit research purposes, with limited, controlled distribution, intended to fall under the fair-use limitation. We take no guarantees or responsibilities, whatsoever, arising out of any copyright issue. Use at your own risk.
No license specified (https://academictorrents.com/nolicensespecified)
We introduce a comprehensive dataset of hand images collected from various public image dataset sources, as listed in Table 1. A total of 13,050 hand instances are annotated. Hand instances larger than a fixed bounding-box area (1500 sq. pixels) are considered big enough for detection and are used for evaluation. This gives around 4,170 high-quality hand instances. While collecting the data, no restriction was imposed on the pose or visibility of people, nor was any constraint imposed on the environment. In each image, all the hands that can be perceived clearly by humans are annotated. The annotations consist of a bounding rectangle, oriented with respect to the wrist, which does not have to be axis aligned.
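Since the annotated rectangles need not be axis aligned, the 1500 sq. pixel size filter can be applied with a polygon-area (shoelace) computation over the four corner points; the corner ordering assumed below is an illustration, not the documented annotation format.

```python
# Sketch: keep only hand instances whose (possibly rotated) bounding
# rectangle covers at least 1500 square pixels. Assumes each annotation is
# four (x, y) corner points given in order around the rectangle.
def polygon_area(points):
    # Shoelace formula for a simple polygon with ordered vertices.
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def big_enough(corners, min_area=1500.0):
    return polygon_area(corners) >= min_area

print(big_enough([(10, 10), (70, 10), (70, 40), (10, 40)]))  # 60 x 30 = 1800 px -> True
```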
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This CBIS-DDSM (Curated Breast Imaging Subset of DDSM) is an updated and standardized version of the Digital Database for Screening Mammography (DDSM). The DDSM is a database of 2,620 scanned film mammography studies. It contains normal, benign, and malignant cases with verified pathology information. The scale of the database along with ground truth validation makes the DDSM a useful tool in the development and testing of decision support systems. The CBIS-DDSM collection includes a subset of the DDSM data selected and curated by a trained mammographer. The images have been decompressed and converted to DICOM format. Updated ROI segmentation and bounding boxes, and pathologic diagnosis for training data are also included. A manuscript describing how to use this dataset in detail is available at https://www.nature.com/articles/sdata2017177.
Published research results from work in developing decision support systems in mammography are difficult to replicate due to the lack of a standard evaluation data set; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. Few well-curated public datasets have been provided for the mammography community. These include the DDSM, the Mammographic Imaging Analysis Society (MIAS) database, and the Image Retrieval in Medical Applications (IRMA) project. Although these public data sets are useful, they are limited in terms of data set size and accessibility.
For example, most researchers using the DDSM do not leverage all its images for a variety of historical reasons. When the database was released in 1997, computational resources to process hundreds or thousands of images were not widely available. Additionally, the DDSM images are saved in non-standard compression files that require the use of decompression code that has not been updated or maintained for modern computers. Finally, the ROI annotations for the abnormalities in the DDSM were provided to indicate a general position of lesions, but not a precise segmentation for them. Therefore, many researchers must implement segmentation algorithms for accurate feature extraction. This makes it difficult to directly compare the performance of methods or to replicate prior results. The CBIS-DDSM collection addresses that challenge by publicly releasing a curated and standardized version of the DDSM for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography.
Please note that the image data for this collection is structured such that each participant has multiple patient IDs. For example, participant 00038 has 10 separate patient IDs which provide information about the scans within the IDs (e.g. Calc-Test_P_00038_LEFT_CC, Calc-Test_P_00038_RIGHT_CC_1). This makes it appear as though there are 6,671 patients according to the DICOM metadata, but there are only 1,566 actual participants in the cohort.
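A small sketch of how the per-scan patient IDs could be collapsed back to actual participants; the regular expression assumes IDs follow the pattern shown above (e.g. Calc-Test_P_00038_LEFT_CC), and the example list is illustrative.

```python
# Sketch: collapse CBIS-DDSM style patient IDs (e.g. "Calc-Test_P_00038_LEFT_CC")
# back to actual participants. Assumes the participant number always follows "_P_".
import re

patient_ids = [
    "Calc-Test_P_00038_LEFT_CC",
    "Calc-Test_P_00038_RIGHT_CC_1",
    "Mass-Training_P_01234_LEFT_MLO",   # illustrative ID, not from the collection
]

participants = {re.search(r"_P_(\d+)", pid).group(1) for pid in patient_ids}
print(f"{len(patient_ids)} patient IDs, {len(participants)} participants")
```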
For scientific and other inquiries about this dataset, please contact TCIA's Helpdesk.
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RESEARCH APPROACH
The research approach adopted for the study consists of seven phases, as shown in Figure 1:
The different phases in the study are discussed in the sections below.
PRE-ACQUISITION
The volunteers are given a brief orientation on how their data will be managed and used for research purposes only. After a volunteer agrees, a consent form is given to be read and signed. The sample consent form filled in by the volunteers is shown in Figure 1.
The capturing of images started with the setup of the imaging device. The camera is set up on a tripod stand in a stationary position at a height of 90 from the floor and a distance of 20 cm from the subject.
EAR IMAGE ACQUISITION
Image acquisition is the action of retrieving an image from an external source for further processing. It is a purely hardware-dependent process of capturing unprocessed images of the volunteers using a professional camera, with the subject posing in front of the camera. It is also a process through which a digital representation of a scene can be obtained. This representation is known as an image and its elements are called pixels (picture elements). The imaging sensor/camera used in this study is a Canon EOS 60D professional camera, placed at a distance of 3 feet from the subject and 20 m from the ground.
This is the first step towards the project's aim of developing an occlusion and pose sensitive image dataset for black ear recognition (the OPIB ear dataset). To achieve the objectives of this study, a set of black ear images was collected, mostly from undergraduate students at a public university in Nigeria.
The image dataset required is captured in two scenarios:
1. uncontrolled environment with a surveillance camera
2. controlled environment with professional cameras
The image dataset captured is purely black ear with partial occlusion, in constrained and unconstrained environments.
The ear images captured were from black subjects in a controlled environment. To make the OPIB dataset pose invariant, the volunteers stand on marked positions on the floor indicating the angles at which the imaging sensor captured the volunteers' ears. The capturing of the images in this category requires that the subject stand and rotate at angles of 60°, 30° and 0°, first towards their right side to capture the left ear and then towards the left to capture the right ear (Fernando et al., 2017), as shown in Figure 4. Six (6) images were captured per subject at angles 60°, 30° and 0° for the left and right ears of 152 volunteers, making a total of 907 images (five volunteers had 5 images instead of 6, hence folders 34, 22, 51, 99 and 102 contain 5 images).
To make the OPIB dataset occlusion and pose sensitive, partial occlusion of the subjects' ears was simulated using rings, hearing aids, scarves, earphones/ear pods, etc. before the images were captured.
CONSENT FORM This form was designed to obtain participants' consent for the project titled: An Occlusion and Pose Sensitive Image Dataset for Black Ear Recognition (OPIB). The information is needed purely for academic research purposes; the ear images collected will be curated anonymously and the identity of the volunteers will not be shared with anyone. The images will be uploaded to an online repository to aid research in ear biometrics. Participation is voluntary, and the participant can withdraw from the project any time before the final dataset is curated and warehoused. Kindly sign the form to signify your consent. I consent to my image being recorded in the form of still images or video surveillance as part of the OPIB ear images project. Tick as appropriate: GENDER Male Female AGE (18-25) (26-35) (36-50)
……………………………….. SIGNED |
Figure 1: Sample of Subject’s Consent Form for the OPIB ear dataset
RAW IMAGE COLLECTION
The ear images were captured using a digital camera set to JPEG. If the camera format is set to raw, no processing is applied, so the stored file contains more tonal and colour data; when set to JPEG, the image data is processed, compressed and stored in the appropriate folders.
IMAGE PRE-PROCESSING
The aim of pre-processing is to improve the quality of the images with regard to contrast, brightness and other metrics. It also includes operations such as cropping, resizing and rescaling, which are important aspects of image analysis aimed at dimensionality reduction. The images are downloaded to a laptop for processing using MATLAB.
Image Cropping
The first step in image pre-processing is image cropping. Irrelevant parts of the image are removed so that the image Region of Interest (ROI) is the focus. The cropping tool provides the user with the size information of the cropped image. The MATLAB function for image cropping performs this operation interactively, waiting for the user to specify the crop rectangle with the mouse on the current axes. The output images of the cropping process are of the same class as the input image.
Naming of OPIB Ear Images
The OPIB ear images were labelled based on the naming convention formulated in this study, as shown in Figure 5. The images are given unique names that specify the subject, the side of the ear (left or right) and the angle of capture. The first and second letters (SU) in the image names are block letters simply representing the subject, for subjects 1 to n in the dataset, while the left and right ears are distinguished using L1, L2, L3 and R1, R2, R3 for angles 60°, 30° and 0°, respectively, as shown in Table 1.
Table 1: Naming Convention for OPIB ear images
Degrees: 60° | 30° | 0°
No. of the degree: 1 | 2 | 3
Subject 1 (first subject in dataset): SU1
Subject n (last subject in dataset): SUn
Left ear images: L1 ... Ln; Right ear images: R1 ... Rn
Example names for subject 1: SU1L1, SU1R1, SU1L2, SU1R2, SU1L3, SU1R3
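Based on this convention, file names for one subject can be generated as in the sketch below (a reading of Table 1, not code from the original study).

```python
# Sketch: generate OPIB image names for one subject following Table 1.
# Angles 60, 30 and 0 degrees map to degree numbers 1, 2 and 3.
ANGLE_TO_INDEX = {60: 1, 30: 2, 0: 3}

def opib_names(subject_number):
    names = []
    for side in ("L", "R"):                      # left ear, then right ear
        for angle, idx in ANGLE_TO_INDEX.items():
            names.append(f"SU{subject_number}{side}{idx}")
    return names

print(opib_names(1))  # ['SU1L1', 'SU1L2', 'SU1L3', 'SU1R1', 'SU1R2', 'SU1R3']
```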
OPIB EAR DATASET EVALUATION
The prominent challenges with current evaluation practices in the field of ear biometrics are the use of different databases, different evaluation metrics, different classifiers that mask the feature extraction performance, and the time spent developing evaluation frameworks (Abaza et al., 2013; Emeršič et al., 2017).
The toolbox provides an environment in which the evaluation of methods for person recognition based on ear biometric data is simplified. It executes all the dataset reads and classification based on ear descriptors.
DESCRIPTION OF OPIB EAR DATASET
The OPIB ear dataset is organised into a structure with each folder containing 6 images of the same person. The images were captured for both the left and right ears at angles of 0°, 30° and 60°. The images were occluded with earrings, scarves, headphones, etc. The dataset was collected both indoors and outdoors, and was gathered from students at a public university in Nigeria; 40.35% of the volunteers were female and 59.65% were male. The ear images were captured with a professional camera (Nikon D350) set up on a camera stand, with each individual captured in an ordered process. A total of 907 images was gathered.
Challenges were encountered in gathering students for capture, processing the images, and annotation. The volunteers were given a brief orientation on what their ear images would be used for before they were captured and processed. Arranging the ear images into folders and naming them accordingly was a substantial task.
Table 2: Overview of the OPIB Ear Dataset
Location | Both indoor and outdoor
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Lung Image Database Consortium image collection (LIDC-IDRI) consists of diagnostic and lung cancer screening thoracic computed tomography (CT) scans with marked-up annotated lesions. It is a web-accessible international resource for development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process.
Seven academic centers and eight medical imaging companies collaborated to create this data set which contains 1018 cases. Each subject includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories ("nodule > or =3 mm," "nodule <3 mm," and "non-nodule > or =3 mm"). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus.
Note: The TCIA team strongly encourages users to review pylidc and the standardized DICOM representation of the TCIA LIDC-IDRI annotations (DICOM-LIDC-IDRI-Nodules) before developing custom tools to analyze the XML version of the annotations/segmentations included in this dataset.
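A minimal pylidc sketch, assuming the package is installed and its configuration file points at the locally downloaded LIDC-IDRI DICOM data; consult the pylidc documentation for the authoritative API.

```python
# Sketch: query one LIDC-IDRI scan with pylidc and group the radiologists'
# annotations into physical nodules. Assumes pylidc is installed and its
# configuration file points at the locally downloaded DICOM data.
import pylidc as pl

scan = pl.query(pl.Scan).filter(pl.Scan.patient_id == "LIDC-IDRI-0001").first()
nodules = scan.cluster_annotations()     # one list of annotations per nodule

for i, anns in enumerate(nodules):
    print(f"nodule {i}: marked by {len(anns)} radiologist(s)")
```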
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This brain tumor dataset contains 3064 T1-weighted contrast-enhanced images with three kinds of brain tumor. Detailed information about the dataset can be found in the README file. The README file has been updated to add the image acquisition protocol and MATLAB code to convert the .mat files to JPG images.
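If the bundled MATLAB converter is not convenient, a Python equivalent might look like the sketch below; the HDF5 key "cjdata/image" is an assumption about the file layout, so verify it against the README before relying on it.

```python
# Sketch: convert one v7.3 (HDF5-based) .mat file to a JPG image in Python.
# The dataset key "cjdata/image" is an assumption about the file layout;
# verify it against the README (or list f.keys()) before relying on it.
import h5py
import numpy as np
from PIL import Image

with h5py.File("1.mat", "r") as f:
    img = np.array(f["cjdata/image"], dtype=np.float64)

lo, hi = img.min(), img.max()
img8 = np.zeros_like(img, dtype=np.uint8) if hi == lo else \
    ((img - lo) / (hi - lo) * 255).astype(np.uint8)
Image.fromarray(img8).save("1.jpg")
```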
Attribution 4.0 (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. This test cannot be successfully completed by current computer systems; only humans can complete it. It is applied in several contexts for machine and human identification. The most common kind found on websites is the text-based CAPTCHA. A CAPTCHA is made up of a series of letters or numbers linked together in a certain order. The image is distorted with random lines, blocks, grids, rotations, and other sorts of noise.

It is difficult for rural residents who only speak their local tongues to pass the test, because the majority of the letters in such protected CAPTCHA scripts are in English. Machine identification of Devanagari characters is significantly more challenging due to their higher character complexity compared to normal English characters and numeral-based CAPTCHAs. The vast majority of official Indian websites exclusively provide content in Devanagari. Regretfully, websites do not employ CAPTCHAs in Devanagari. Because of this, we have developed a brand-new text-based CAPTCHA using Devanagari writing.

A canvas application was created using Python. This canvas code was distributed to more than one hundred (100+) Devanagari native speakers of all ages, including both left- and right-handed computer users. Each user writes 440 characters (44 characters multiplied by 10) on the canvas and saves them on their computer. All user data is then gathered and compiled. The characters on the canvas are black on a white background; having no noise in the image is a benefit of using the canvas. The final data set contains a total of 44,000 digitized images: 10,000 numerals, 4,000 vowels, and 30,000 consonants. This dataset was published for research scholars for recognition and other applications on Mendeley (Mendeley Data, DOI: 10.17632/yb9rmfjzc2.1, dated October 5, 2022) and the IEEE DataPort (DOI: 10.21227/9zpv-3194, dated October 6, 2022).

We designed our own algorithm to generate the handwritten Devanagari CAPTCHA, using the handwritten character set created above. General CAPTCHA generation principles are used to add noise to the image using digital image processing techniques. The size of each CAPTCHA image is 250 x 90 pixels. Three (3) types of character sets are used: handwritten alphabets, handwritten digits, and handwritten alphabets and digits combined. For 9 classes x 10,000 images, a Devanagari CAPTCHA data set of 90,000 images was created using Python. All images are stored in CSV format for easy use by researchers. The aim is to make the CAPTCHA image harder to recognize or break automatically, since passing a test that requires identifying Devanagari alphabets is difficult for machines. This dataset is helpful to researchers investigating CAPTCHA recognition in this area and designing OCR to recognize Devanagari CAPTCHAs and break them. If you are able to successfully bypass the CAPTCHA, please acknowledge us by sending an email to sanjayepate@gmail.com.
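A simplified sketch of the general idea described above (compose handwritten character images onto a 250 x 90 canvas and overlay noise lines); this illustrates generic CAPTCHA composition only, not the authors' algorithm, and the character-image paths are placeholders.

```python
# Sketch of a generic text-CAPTCHA composition step: paste handwritten
# character images onto a 250 x 90 canvas, then overlay random noise lines.
# This illustrates the general principle only, not the authors' algorithm;
# the character image paths are placeholders.
import random
from PIL import Image, ImageDraw

def compose_captcha(char_paths, size=(250, 90), n_noise_lines=6):
    canvas = Image.new("RGB", size, "white")
    x = 10
    for path in char_paths:
        glyph = Image.open(path).convert("RGB").resize((40, 60))
        glyph = glyph.rotate(random.uniform(-15, 15), expand=True, fillcolor="white")
        canvas.paste(glyph, (x, random.randint(5, 20)))
        x += 45
    draw = ImageDraw.Draw(canvas)
    for _ in range(n_noise_lines):                   # distortion: random lines
        draw.line(
            [(random.randint(0, size[0]), random.randint(0, size[1])),
             (random.randint(0, size[0]), random.randint(0, size[1]))],
            fill="black", width=1)
    return canvas

compose_captcha(["ka.png", "kha.png", "ga.png", "gha.png", "5.png"]).save("captcha.png")
```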