License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset is a collection of scientific images curated to advance image classification algorithms in the scientific domain. It comprises a diverse set of images across six distinct classes, providing a challenge for machine learning enthusiasts and researchers. The base data comes from the BioFors dataset, with additional images incorporated to increase variety and complexity. All images are in either .JPG or .PNG format.
The dataset is organized into six primary classes, each representing a different aspect of scientific imaging:
Blot-Gel: Images of various blotting techniques and gel electrophoresis results used in molecular biology.
FACS (Fluorescence-Activated Cell Sorting): Flow cytometry images showcasing cell populations based on fluorescent labeling.
Histopathology: High-resolution images of tissue sections stained to reveal cellular structures and patterns indicative of pathological states.
Macroscopy: Images captured without magnification, highlighting the gross features and details of biological specimens.
Microscopy: A collection of microscopic images that reveal the intricate details of cells and microorganisms.
Non-scientific: A control group of images unrelated to scientific inquiry, included to test the robustness of classification models. It consists mainly of images from the ImageNet dataset.
This dataset is ideal for developing and benchmarking image classification models that can be applied to:
Image Falsification and Fabrication Detection: This dataset serves as a foundation for developing forensic tools to combat image falsification and fabrication in scientific publications. With the BioFors dataset as a base, participants can create models that detect unethical manipulations, thereby safeguarding the credibility of scientific research. The challenge lies in identifying subtle alterations that may indicate misconduct, such as duplicated, spliced, or artificially enhanced images. Success in this area could help prevent the spread of misinformation and preserve the integrity of scientific literature.
Automated Analysis of Scientific Experiments: The dataset facilitates the development of models for automated analysis in scientific experiments, which can significantly accelerate the pace of discovery. Automated research workflows, integrating computation, laboratory automation, and AI tools, are transforming how experiments are designed, conducted, and analyzed.
Diagnostic Tools in Medicine: In the medical field, diagnostic tools are essential for achieving diagnostic excellence, which involves making correct and timely diagnoses while maximizing patient experience and managing uncertainty. AI in healthcare is revolutionizing diagnostics, from analyzing medical images to identifying disease patterns and predicting patient outcomes.
[1] https://ieeexplore.ieee.org/document/9710731
[2] https://github.com/vimal-isi-edu/BioFors
[3] https://link.springer.com/chapter/10.1007/978-3-031-53085-2_26
[5] https://warwick.ac.uk/fac/cross_fac/tia/data/pannuke (Histopathology images)
[6] https://www.kaggle.com/datasets/chopinforest/esophageal-endoscopy-images (Macroscopy)
[7] https://www.kaggle.com/datasets/safurahajiheidari/kidney-stone-images (Macroscopy)
[8] https://www.kaggle.com/datasets/alifrahman/covid19-chest-xray-image-dataset (Macroscopy)
[9] https://www.kaggle.com/datasets/vitaliykinakh/stable-imagenet1k (Non-scientific images)
[10] https://www.kaggle.com/datasets/nodoubttome/skin-cancer9-classesisic (Macroscopy)
[11] https://www.kaggle.com/datasets/sunedition/graphs-dataset (Non-scientific images)
resolverkatla/Intel-Image-Classification dataset hosted on Hugging Face and contributed by the HF Datasets community
License: https://creativecommons.org/publicdomain/zero/1.0/
Overview: This dataset is designed for vehicle classification tasks and contains a total of 5,600 images distributed across seven categories. Each category represents a different type of vehicle.
Structure:
Image Format: All images are in JPEG format with the .jpg extension.
Size: 5,600 images in total.
Usage: Ideal for building and testing image classification models to distinguish between different types of vehicles.
License: https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Intel Image Classification Dataset contains natural scene images from various locations around the world and is labeled across six distinct categories.
2) Data Utilization
(1) Characteristics of the Intel Image Classification Dataset:
• The dataset features a diverse range of scenes, including buildings, forests, glaciers, mountains, seas, and streets, allowing for testing model generalization across multiple real-world environments.
• The data is organized into separate sets for training, testing, and prediction, making it straightforward to use for supervised learning tasks.
(2) Applications of the Intel Image Classification Dataset:
• Development of scene classification models: This dataset is suitable for training and evaluating deep learning models that can automatically classify different types of natural scenes, supporting applications in automated photo organization, environmental monitoring, and geolocation tasks.
License: https://images.cv/license
Labeled Street images suitable for training and evaluating computer vision and deep learning models.
License: https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Shells or Pebbles: An Image Classification Dataset is a computer vision dataset designed for a binary classification task that distinguishes between shells and pebbles. The dataset consists of two classes (Shells and Pebbles), and each image is used to determine whether the object is a shell or a pebble.
2) Data Utilization
(1) Characteristics of the Shells or Pebbles: An Image Classification Dataset:
• The dataset is designed to help models learn and distinguish subtle visual differences between shells and pebbles, which often share similar shapes and textures.
• It contains images captured under varied backgrounds and conditions, making it suitable for training models with strong generalization capabilities.
(2) Applications of the Shells or Pebbles: An Image Classification Dataset:
• Development of binary classification models (Shell vs. Pebble): The dataset can be used to train deep learning models that classify images as either shell or pebble.
• Educational use for visual recognition tasks: This dataset is also suitable for training in shape-, texture-, and edge-based feature extraction and pattern recognition, making it a valuable resource for teaching and experimentation in computer vision.
License: https://creativecommons.org/publicdomain/zero/1.0/
Over 1,000 images of cats and dogs scraped from Google Images. The task is to build a model that classifies an image as containing a cat or a dog as accurately as possible.
Image sizes range from roughly 100x100 pixels to 2000x1000 pixels.
Image format is JPEG.
Duplicates have been removed.
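The description does not say how duplicates were removed. As a minimal sketch of one common approach, the function below groups files whose bytes are identical by hashing their contents. The function name `find_exact_duplicates` and the flat-or-nested folder layout are assumptions for illustration, not part of the dataset.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group JPEG files under `root` that have byte-identical content."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        # Only JPEG files are considered, matching the stated image format.
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only hashes shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

This only catches exact copies. Resized or re-encoded versions of the same picture produce different hashes and would need a perceptual comparison instead.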
License: https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Satellite Image Classification Dataset is a benchmark image classification dataset constructed using satellite remote sensing imagery. It includes a total of four land surface classes (cloudy, desert, green_area, and water) collected from various sensor-based images and Google Maps snapshots. The dataset is designed for training and evaluating image-based scene recognition models.
2) Data Utilization
(1) Characteristics of the Satellite Image Classification Dataset:
• The dataset was collected with the aim of automatic interpretation of satellite imagery and consists of a combination of sensor-based images and map snapshots, offering a realistic representation of real-world conditions.
• All images are of fixed resolution and include diverse landform features, making the dataset suitable for classification experiments across different environments and for evaluating model generalization performance.
(2) Applications of the Satellite Image Classification Dataset:
• Land surface classification model training: Can be used in experiments to classify various types of terrain such as buildings, farmland, and roads.
• Research and application in geospatial information analysis: Useful for developing models that support spatial decision-making through tasks such as land use monitoring, urban structure analysis, and land surface inference.
License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
A diverse image dataset containing clock faces with varying styles, angles, and hand positions, split into training, testing, and validation subsets for accurate time recognition and image classification tasks.
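The dataset ships already divided into subsets, and the description gives no split ratios. For readers who want to produce a comparable train/validation/test division on their own image list, the following is a minimal sketch. The function name `three_way_split`, the 10% fractions, and the seed are assumptions chosen for illustration.

```python
import random


def three_way_split(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle `items` reproducibly and split them into train/val/test lists."""
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test
```

Fixing the seed means the same input list always yields the same three subsets, which keeps evaluation numbers comparable across runs.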
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
## Overview
Cats And Dogs Image Classification is a dataset for classification tasks - it contains Cats And Dogs annotations for 2,000 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
License: https://choosealicense.com/licenses/cc/
THIS DATASET IS PROVIDED TO ANYONE WHO WISHES TO USE IT. THERE ARE NO RESTRICTIONS ON ITS USE.
License: https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Sports balls - multiclass image classification Dataset is a computer vision dataset for multi-class image classification, designed to classify images of balls used in various sports. The dataset consists of 15 categories, including basketballs, footballs (soccer), rugby balls, table tennis balls, and more.
2) Data Utilization
(1) Characteristics of the Sports balls - multiclass image classification Dataset:
• Some balls in the dataset feature intentional visual alterations (e.g., balls painted to resemble other types), enabling a precise evaluation of a model’s generalization and discrimination capabilities.
(2) Applications of the Sports balls - multiclass image classification Dataset:
• Sports Ball Classification Model Development: This dataset can be used to train deep learning-based image classification models that automatically recognize and categorize various types of sports equipment.
• Development of Sports-related Applications: The dataset is suitable for building sports equipment recognition systems, AR-based educational tools, and video-based sports analysis systems.
License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
The Tree Nuts Image Classification Dataset includes over 1,300 high-quality RGB images of 10 different tree nut types, including almonds, walnuts, pecans, and hazelnuts. Designed for computer vision and machine learning tasks, it helps improve classification accuracy and quality control in the nut industry.
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
## Overview
Full Image Classification is a dataset for classification tasks - it contains Plants GfZG annotations for 382 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a curated collection of images featuring four big cat species: lions, tigers, leopards, and cheetahs. The images were sourced using the DuckDuckGo search engine and are organized into a separate directory for each animal. The dataset suits machine learning and computer vision projects focused on image classification and species recognition, and can be used to train and validate models that differentiate between these big cats.
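A one-directory-per-class layout like the one described can be turned into labeled samples by reading the folder names. The sketch below is one way to do that with the standard library only. The function name `index_by_directory` and the accepted file suffixes are assumptions, since the description does not list folder names or image formats.

```python
from pathlib import Path

# Assumed set of image suffixes; the source does not state the formats used.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_by_directory(root):
    """Return (samples, class_names) for a one-folder-per-class layout."""
    root = Path(root)
    # Sorting the folder names gives a stable name-to-label mapping.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for label, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples, class_names
```

Each entry in `samples` pairs an image path with an integer label that indexes into `class_names`, which is the form most training loops expect.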
License: https://images.cv/license
Labeled Harvester images suitable for training and evaluating computer vision and deep learning models.
Text, tabular and image classification datasets
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
## Overview
News Image Classification is a dataset for classification tasks - it contains News annotations for 1,992 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
License: https://images.cv/license
Labeled Classroom images suitable for training and evaluating computer vision and deep learning models.
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
## Overview
Amazon Image Classification is a dataset for object detection tasks - it contains Objects annotations for 3,619 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).