100+ datasets found

Scientific Image Classification Dataset
kaggle.com
Updated Apr 13, 2024
Cite
Rushil Prajapati (2024). Scientific Image Classification Dataset [Dataset]. https://www.kaggle.com/datasets/rushilprajapati/data-final
Dataset updated
Apr 13, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Rushil Prajapati
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Overview

This dataset is a collection of scientific images curated to advance image classification algorithms in the scientific domain. It comprises a diverse set of images across six distinct classes. The base data is derived from the BioFors dataset, with additional images incorporated to increase variety and complexity. All images are in either .JPG or .PNG format.

Dataset Description

The dataset is organized into six primary classes, each representing a different aspect of scientific imaging:

Blot-Gel: Images of various blotting techniques and gel electrophoresis results used in molecular biology.

FACS (Fluorescence-Activated Cell Sorting): Flow cytometry images showcasing cell populations based on fluorescent labeling.

Histopathology: High-resolution images of tissue sections stained to reveal cellular structures and patterns indicative of pathological states.

Macroscopy: Images captured without magnification, highlighting the gross features and details of biological specimens.

Microscopy: A collection of microscopic images that reveal the intricate details of cells and microorganisms.

Non-scientific: A control group of images unrelated to scientific inquiry, included to test the robustness of classification models. It consists mainly of images from the ImageNet dataset.
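A minimal sketch of how the six classes above might be mapped to integer labels, and how files could be filtered to the two stated image formats. The folder-per-class layout, the folder names, and the helper name `index_images` are assumptions for illustration; the listing does not describe how the archive is organized.

```python
from pathlib import Path

# Class names as listed above; the on-disk folder names are an assumption.
CLASSES = [
    "Blot-Gel",
    "FACS",
    "Histopathology",
    "Macroscopy",
    "Microscopy",
    "Non-scientific",
]
CLASS_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}
ALLOWED_SUFFIXES = {".jpg", ".png"}  # formats stated in the overview


def index_images(root):
    """Return (path, label) pairs for images found under root/<class name>/."""
    samples = []
    for name, label in CLASS_TO_INDEX.items():
        class_dir = Path(root) / name
        if not class_dir.is_dir():
            continue  # tolerate class folders that are missing
        for path in sorted(class_dir.iterdir()):
            # Compare suffixes case-insensitively so .JPG and .jpg both match.
            if path.is_file() and path.suffix.lower() in ALLOWED_SUFFIXES:
                samples.append((path, label))
    return samples
```

The resulting list of pairs can be handed to whatever training framework is in use; nothing here depends on a specific library.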

Use Cases

This dataset is ideal for developing and benchmarking image classification models that can be applied to:

Image Falsification and Fabrication Detection: This dataset serves as a foundation for developing forensic tools to combat image falsification and fabrication in scientific publications. With the BioFors dataset as a base, users can build models that detect unethical manipulations, thereby safeguarding the credibility of scientific research. The challenge lies in identifying subtle alterations that may indicate misconduct, such as duplicated, spliced, or artificially enhanced images. Success in this area could help prevent the spread of misinformation and preserve the integrity of scientific literature.

Automated Analysis of Scientific Experiments: The dataset facilitates the development of models for automated analysis in scientific experiments, which can significantly accelerate the pace of discovery. Automated research workflows, integrating computation, laboratory automation, and AI tools, are transforming how experiments are designed, conducted, and analyzed.

Diagnostic Tools in Medicine: In the medical field, diagnostic tools are essential for achieving diagnostic excellence, which involves making correct and timely diagnoses while maximizing patient experience and managing uncertainty. AI in healthcare is revolutionizing diagnostics, from analyzing medical images to identifying disease patterns and predicting patient outcomes.

References

[1] https://ieeexplore.ieee.org/document/9710731

[2] https://github.com/vimal-isi-edu/BioFors

[3] https://link.springer.com/chapter/10.1007/978-3-031-53085-2_26

[4] https://www.nationalacademies.org/news/2022/05/automated-research-workflows-are-speeding-pace-of-scientific-discovery-new-report-offers-recommendations-to-advance-their-development

[5] https://warwick.ac.uk/fac/cross_fac/tia/data/pannuke (Histopathology images)

[6] https://www.kaggle.com/datasets/chopinforest/esophageal-endoscopy-images (Macroscopy)

[7] https://www.kaggle.com/datasets/safurahajiheidari/kidney-stone-images (Macroscopy)

[8] https://www.kaggle.com/datasets/alifrahman/covid19-chest-xray-image-dataset (Macroscopy)

[9] https://www.kaggle.com/datasets/vitaliykinakh/stable-imagenet1k (Non-scientific images)

[10] https://www.kaggle.com/datasets/nodoubttome/skin-cancer9-classesisic (Macroscopy)

[11] https://www.kaggle.com/datasets/sunedition/graphs-dataset (Non-scientific images)
Intel-Image-Classification
huggingface.co
Updated Jun 1, 2025
+ more versions
Cite
Christian Angelo Panique (2025). Intel-Image-Classification [Dataset]. https://huggingface.co/datasets/resolverkatla/Intel-Image-Classification
Dataset updated
Jun 1, 2025
Authors
Christian Angelo Panique
Description
resolverkatla/Intel-Image-Classification dataset hosted on Hugging Face and contributed by the HF Datasets community
Vehicle Image Classification
kaggle.com
zip
Updated Aug 9, 2024
Cite
Mohamed Maher (2024). Vehicle Image Classification [Dataset]. https://www.kaggle.com/datasets/mohamedmaher5/vehicle-classification
Available download formats: zip (866783573 bytes)
Dataset updated
Aug 9, 2024
Authors
Mohamed Maher
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Overview: This dataset is designed for vehicle classification tasks and contains a total of 5,600 images distributed across seven categories. Each category represents a different type of vehicle.

Structure:

Main Folder: Vehicles

Subfolders:

Auto Rickshaws (800 images)

Bikes (800 images)

Cars (800 images)

Motorcycles (800 images)

Planes (800 images)

Ships (800 images)

Trains (800 images)

Image Format: All images are in JPEG format with the .jpg extension.

Size: 5,600 images in total.

Usage: Ideal for building and testing image classification models to distinguish between different types of vehicles.
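The folder structure above lends itself to a quick integrity check after download. The following is a sketch only: it assumes the archive extracts to a root folder with one subfolder per category, and the helper names `count_images` and `matches_listing` are hypothetical.

```python
from pathlib import Path

EXPECTED_PER_CLASS = 800  # per-category count given in the listing
EXPECTED_CLASSES = 7      # number of categories given in the listing


def count_images(root, suffix=".jpg"):
    """Map each subfolder name under root to its number of files with the given suffix."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() == suffix
            )
    return counts


def matches_listing(counts):
    """True if the counts agree with the 7 x 800 = 5,600 figures in the description."""
    return len(counts) == EXPECTED_CLASSES and all(
        n == EXPECTED_PER_CLASS for n in counts.values()
    )
```

Calling `matches_listing(count_images("Vehicles"))` on the extracted folder would flag a partial download or a renamed subfolder before any training run starts.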
Intel Image Classification Dataset
cubig.ai
zip
Updated May 28, 2025
Cite
CUBIG (2025). Intel Image Classification Dataset [Dataset]. https://cubig.ai/store/products/301/intel-image-classification-dataset
Available download formats: zip
Dataset updated
May 28, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction • The Intel Image Classification Dataset contains natural scene images from various locations around the world and is labeled across six distinct categories.

2) Data Utilization (1) Characteristics of the Intel Image Classification Dataset: • The dataset features a diverse range of scenes, including buildings, forests, glaciers, mountains, seas, and streets, allowing for testing model generalization across multiple real-world environments. • The data is organized into separate sets for training, testing, and prediction, making it straightforward to use for supervised learning tasks.

(2) Applications of the Intel Image Classification Dataset: • Development of scene classification models: This dataset is suitable for training and evaluating deep learning models that can automatically classify different types of natural scenes, supporting applications in automated photo organization, environmental monitoring, and geolocation tasks.
Street Image Classification Dataset
images.cv
zip
Updated Nov 27, 2025
Cite
(2025). Street Image Classification Dataset [Dataset]. https://images.cv/dataset/street-image-classification-dataset
Available download formats: zip
Dataset updated
Nov 27, 2025
License
https://images.cv/license
Description
Labeled Street images suitable for training and evaluating computer vision and deep learning models.
Shells or Pebbles: An Image Classification Dataset
cubig.ai
zip
Updated May 28, 2025
Cite
CUBIG (2025). Shells or Pebbles: An Image Classification Dataset [Dataset]. https://cubig.ai/store/products/299/shells-or-pebbles-an-image-classification-dataset
Available download formats: zip
Dataset updated
May 28, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction • The Shells or Pebbles: An Image Classification Dataset is a computer vision dataset designed for a binary classification task that distinguishes between shells and pebbles. The dataset consists of two classes (Shells and Pebbles), and each image is used to determine whether the object is a shell or a pebble.

2) Data Utilization (1) Characteristics of the Shells or Pebbles: An Image Classification Dataset: • The dataset is designed to help models learn and distinguish subtle visual differences between shells and pebbles, which often share similar shapes and textures. • It contains images captured under varied backgrounds and conditions, making it suitable for training models with strong generalization capabilities.

(2) Applications of the Shells or Pebbles: An Image Classification Dataset: • Development of binary classification models (Shell vs. Pebble): The dataset can be used to train deep learning models that classify images as either shell or pebble. • Educational use for visual recognition tasks: This dataset is also suitable for training in shape-, texture-, and edge-based feature extraction and pattern recognition, making it a valuable resource for teaching and experimentation in computer vision.
Cats and Dogs image classification
kaggle.com
zip
Updated Dec 20, 2022
+ more versions
Cite
Samuel Cortinhas (2022). Cats and Dogs image classification [Dataset]. https://www.kaggle.com/datasets/samuelcortinhas/cats-and-dogs-image-classification
Available download formats: zip (67566406 bytes)
Dataset updated
Dec 20, 2022
Authors
Samuel Cortinhas
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Over 1,000 images of cats and dogs scraped from Google Images. The task is to build a model that classifies whether an image shows a cat or a dog as accurately as possible.

Image sizes range from roughly 100x100 pixels to 2000x1000 pixels.

Image format is jpeg.

Duplicates have been removed.
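The listing does not say how duplicates were removed. One common way to check such a claim is to hash each file's bytes and group identical digests; the sketch below does that with the standard library. The helper name `find_exact_duplicates` is hypothetical, and this only catches byte-identical copies, not resized or re-encoded versions of the same picture.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group files under root by the SHA-256 of their bytes.

    Returns a list of groups, each holding two or more paths with identical content.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

An empty result is consistent with the statement above for exact copies; near-duplicates would need a perceptual comparison, which is outside this sketch.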
Data from: Satellite Image Classification Dataset
cubig.ai
zip
Updated May 28, 2025
Cite
CUBIG (2025). Satellite Image Classification Dataset [Dataset]. https://cubig.ai/store/products/290/satellite-image-classification-dataset
Available download formats: zip
Dataset updated
May 28, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction • The Satellite Image Classification Dataset is a benchmark image classification dataset constructed using satellite remote sensing imagery. It includes a total of four land surface classes—cloudy, desert, green_area, and water—collected from various sensor-based images and Google Maps snapshots. The dataset is designed for training and evaluating image-based scene recognition models.

2) Data Utilization (1) Characteristics of the Satellite Image Classification Dataset: • The dataset was collected with the aim of automatic interpretation of satellite imagery and consists of a combination of sensor-based images and map snapshots, offering a realistic representation of real-world conditions. • All images are of fixed resolution and include diverse landform features, making the dataset suitable for classification experiments across different environments and for evaluating model generalization performance.

(2) Applications of the Satellite Image Classification Dataset: • Land surface classification model training: Can be used in experiments to classify various types of terrain such as buildings, farmland, and roads. • Research and application in geospatial information analysis: Useful for developing models that support spatial decision-making through tasks such as land use monitoring, urban structure analysis, and land surface inference.
TIME – Image Dataset – Classification
gts.ai
json
Updated Apr 21, 2024
Cite
GTS (2024). TIME – Image Dataset – Classification [Dataset]. https://gts.ai/dataset-download/time-image-dataset-classification/
Available download formats: json
Dataset updated
Apr 21, 2024
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
A diverse image dataset containing clock faces with varying styles, angles, and hand positions, split into training, testing, and validation subsets for accurate time recognition and image classification tasks.
Cats And Dogs Image Classification Dataset
universe.roboflow.com
zip
Updated Mar 1, 2023
+ more versions
Cite
Workspace1 (2023). Cats And Dogs Image Classification Dataset [Dataset]. https://universe.roboflow.com/workspace1-aalti/cats-and-dogs-image-classification
Available download formats: zip
Dataset updated
Mar 1, 2023
Dataset authored and provided by
Workspace1
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Cats And Dogs
Description
Cats And Dogs Image Classification

## Overview

Cats And Dogs Image Classification is a dataset for classification tasks. It contains Cats And Dogs annotations for 2,000 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
nsfw-image-classification
huggingface.co
Updated Feb 4, 2024
Cite
Man (2024). nsfw-image-classification [Dataset]. https://huggingface.co/datasets/DarkyMan/nsfw-image-classification
Dataset updated
Feb 4, 2024
Authors
Man
License
https://choosealicense.com/licenses/cc/
Description
THIS DATASET IS PROVIDED TO ANYONE WHO WISHES TO USE IT. THERE ARE NO RESTRICTIONS ON ITS USE.
Sports balls multiclass image classification Dataset
cubig.ai
zip
Updated Jun 12, 2025
Cite
CUBIG (2025). Sports balls multiclass image classification Dataset [Dataset]. https://cubig.ai/store/products/437/sports-balls-multiclass-image-classification-dataset
Available download formats: zip
Dataset updated
Jun 12, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction • The Sports balls - multiclass image classification Dataset is a computer vision dataset for multi-class image classification, designed to classify images of balls used in various sports. The dataset consists of 15 categories, including basketballs, footballs (soccer), rugby balls, table tennis balls, and more.

2) Data Utilization (1) Characteristics of the Sports balls - multiclass image classification Dataset: • Some balls in the dataset feature intentional visual alterations (e.g., balls painted to resemble other types), enabling a precise evaluation of a model’s generalization and discrimination capabilities.

(2) Applications of the Sports balls - multiclass image classification Dataset: • Sports Ball Classification Model Development: This dataset can be used to train deep learning-based image classification models that automatically recognize and categorize various types of sports equipment. • Development of Sports-related Applications: The dataset is suitable for building sports equipment recognition systems, AR-based educational tools, and video-based sports analysis systems.
Tree Nuts Image Classification Dataset
gts.ai
json
Updated Apr 21, 2024
Cite
GTS (2024). Tree Nuts Image Classification Dataset [Dataset]. https://gts.ai/dataset-download/tree-nuts-image-classification/
Available download formats: json
Dataset updated
Apr 21, 2024
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
The Tree Nuts Image Classification Dataset includes over 1,300 high-quality RGB images of 10 different tree nut types, including almonds, walnuts, pecans, and hazelnuts. Designed for computer vision and machine learning tasks, it helps improve classification accuracy and quality control in the nut industry.
Full Image Classification Dataset
universe.roboflow.com
zip
Updated Mar 26, 2025
Cite
SP T8T9 Outdoor Farm (2025). Full Image Classification Dataset [Dataset]. https://universe.roboflow.com/sp-t8t9-outdoor-farm/full-image-classification
Available download formats: zip
Dataset updated
Mar 26, 2025
Dataset authored and provided by
SP T8T9 Outdoor Farm
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Plants GfZG
Description
Full Image Classification

## Overview

Full Image Classification is a dataset for classification tasks. It contains Plants GfZG annotations for 382 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Big Cats Image Classification Dataset 🦁
kaggle.com
zip
Updated Mar 29, 2023
Cite
iulia (2023). Big Cats Image Classification Dataset 🦁 [Dataset]. https://www.kaggle.com/datasets/patriciabrezeanu/big-cats-image-classification-dataset
Available download formats: zip (532304917 bytes)
Dataset updated
Mar 29, 2023
Authors
iulia
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains a curated collection of images featuring four distinct big cat species: lions, tigers, leopards, and cheetahs. The images were sourced using the DuckDuckGo search engine and are organized into separate directories for each animal. The dataset is suited to machine learning and computer vision projects focused on image classification and species recognition, and can be used to train and validate models that differentiate between these four species.
Harvester Image Classification Dataset
images.cv
zip
Updated Nov 28, 2025
Cite
(2025). Harvester Image Classification Dataset [Dataset]. https://images.cv/dataset/harvester-image-classification-dataset
Available download formats: zip
Dataset updated
Nov 28, 2025
License
https://images.cv/license
Description
Labeled Harvester images suitable for training and evaluating computer vision and deep learning models.
Text, Tabular and Image Classification - Dataset - LDM
service.tib.eu
resodate.org
Updated Dec 16, 2024
Cite
(2024). Text, Tabular and Image Classification - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/text--tabular-and-image-classification
Dataset updated
Dec 16, 2024
Description
Text, tabular and image classification datasets
News Image Classification Dataset
universe.roboflow.com
zip
Updated Oct 12, 2025
Cite
cnsh (2025). News Image Classification Dataset [Dataset]. https://universe.roboflow.com/cnsh/news-image-classification-qyopy
Available download formats: zip
Dataset updated
Oct 12, 2025
Dataset authored and provided by
cnsh
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
News
Description
News Image Classification

## Overview

News Image Classification is a dataset for classification tasks. It contains News annotations for 1,992 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Classroom Image Classification Dataset
images.cv
zip
Updated Dec 13, 2021
Cite
(2021). Classroom Image Classification Dataset [Dataset]. https://images.cv/dataset/classroom-image-classification-dataset
Available download formats: zip
Dataset updated
Dec 13, 2021
License
https://images.cv/license
Description
Labeled Classroom images suitable for training and evaluating computer vision and deep learning models.
Amazon Image Classification Dataset
universe.roboflow.com
zip
Updated Apr 3, 2025
+ more versions
Cite
object detection (2025). Amazon Image Classification Dataset [Dataset]. https://universe.roboflow.com/object-detection-kv6yb/amazon-image-classification
Available download formats: zip
Dataset updated
Apr 3, 2025
Dataset provided by
Object detection
Authors
object detection
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Objects Bounding Boxes
Description
Amazon Image Classification

## Overview

Amazon Image Classification is a dataset for object detection tasks. It contains Objects annotations for 3,619 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).


Scientific Image Classification Dataset

A Comprehensive Repository of Scientific Imagery


18 scholarly articles cite this dataset (View in Google Scholar)
