100+ datasets found
  1. Scientific Image Classification Dataset

    • kaggle.com
    Updated Apr 13, 2024
    Cite
    Rushil Prajapati (2024). Scientific Image Classification Dataset [Dataset]. https://www.kaggle.com/datasets/rushilprajapati/data-final
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Apr 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rushil Prajapati
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Overview

    This dataset is a comprehensive collection of scientific images curated for the advancement of image classification algorithms in the scientific domain. It comprises a diverse set of images across six distinct classes, providing a unique challenge for machine learning enthusiasts and researchers. The base data is derived from the BioFors dataset, with additional images incorporated to enhance variety and complexity. All images are in either .JPG or .PNG format.

    Dataset Description

    The dataset is organized into six primary classes, each representing a different aspect of scientific imaging:

    Blot-Gel: Images of various blotting techniques and gel electrophoresis results used in molecular biology.

    FACS (Fluorescence-Activated Cell Sorting): Flow cytometry images showcasing cell populations based on fluorescent labeling.

    Histopathology: High-resolution images of tissue sections stained to reveal cellular structures and patterns indicative of pathological states.

    Macroscopy: Images captured without magnification, highlighting the gross features and details of biological specimens.

    Microscopy: A collection of microscopic images that reveal the intricate details of cells and microorganisms.

    Non-scientific: A control group of images unrelated to scientific inquiry, included to test the robustness of classification models. It mainly consists of images from the ImageNet dataset.
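
As a rough illustration of how the six classes above could be turned into training labels, here is a minimal Python sketch. It assumes one subfolder per class named exactly as in the description; the actual folder layout of the download is not stated here, so treat `CLASSES` and `index_images` as hypothetical names to adapt.

```python
from pathlib import Path

# Class names come from the description above; using them as folder names
# is an assumption about the on-disk layout, not something the listing states.
CLASSES = [
    "Blot-Gel",
    "FACS",
    "Histopathology",
    "Macroscopy",
    "Microscopy",
    "Non-scientific",
]

# The description says all images are .JPG or .PNG; compare case-insensitively.
IMAGE_EXTENSIONS = {".jpg", ".png"}


def index_images(root):
    """Return (path, label_index) pairs for supported images under root/<class>/."""
    root = Path(root)
    samples = []
    for label, name in enumerate(CLASSES):
        class_dir = root / name
        if not class_dir.is_dir():
            continue  # tolerate missing classes instead of failing
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, label))
    return samples
```

The returned list can be passed to whatever data loader a project already uses; the integer label is simply the position of the class in `CLASSES`.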

    Use Cases

    This dataset is ideal for developing and benchmarking image classification models that can be applied to:

    Image Falsification and Fabrication Detection: This dataset serves as a foundation for developing forensic tools to combat image falsification and fabrication in scientific publications. With the BioFors dataset as a base, participants have the opportunity to create models that can detect unethical manipulations, thereby safeguarding the credibility of scientific research. The challenge lies in identifying subtle alterations that may indicate misconduct, such as duplicated, spliced, or artificially enhanced images. Success in this area has far-reaching implications, potentially preventing the spread of misinformation and preserving the integrity of scientific literature.

    Automated Analysis of Scientific Experiments: The dataset facilitates the development of models for automated analysis in scientific experiments, which can significantly accelerate the pace of discovery. Automated research workflows, integrating computation, laboratory automation, and AI tools, are transforming how experiments are designed, conducted, and analyzed.

    Diagnostic Tools in Medicine: In the medical field, diagnostic tools are essential for achieving diagnostic excellence, which involves making correct and timely diagnoses while maximizing patient experience and managing uncertainty. AI in healthcare is revolutionizing diagnostics, from analyzing medical images to identifying disease patterns and predicting patient outcomes.

    References

    [1] https://ieeexplore.ieee.org/document/9710731

    [2] https://github.com/vimal-isi-edu/BioFors

    [3] https://link.springer.com/chapter/10.1007/978-3-031-53085-2_26

    [4] https://www.nationalacademies.org/news/2022/05/automated-research-workflows-are-speeding-pace-of-scientific-discovery-new-report-offers-recommendations-to-advance-their-development

    [5] https://warwick.ac.uk/fac/cross_fac/tia/data/pannuke (Histopathology images)

    [6] https://www.kaggle.com/datasets/chopinforest/esophageal-endoscopy-images (Macroscopy)

    [7] https://www.kaggle.com/datasets/safurahajiheidari/kidney-stone-images (Macroscopy)

    [8] https://www.kaggle.com/datasets/alifrahman/covid19-chest-xray-image-dataset (Macroscopy)

    [9] https://www.kaggle.com/datasets/vitaliykinakh/stable-imagenet1k (Non-scientific images)

    [10] https://www.kaggle.com/datasets/nodoubttome/skin-cancer9-classesisic (Macroscopy)

    [11] https://www.kaggle.com/datasets/sunedition/graphs-dataset (Non-scientific images)

  2. Intel-Image-Classification

    • huggingface.co
    Updated Jun 1, 2025
    + more versions
    Cite
    Christian Angelo Panique (2025). Intel-Image-Classification [Dataset]. https://huggingface.co/datasets/resolverkatla/Intel-Image-Classification
    Explore at:
    Dataset updated
    Jun 1, 2025
    Authors
    Christian Angelo Panique
    Description

    resolverkatla/Intel-Image-Classification dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Vehicle Image Classification

    • kaggle.com
    zip
    Updated Aug 9, 2024
    Cite
    Mohamed Maher (2024). Vehicle Image Classification [Dataset]. https://www.kaggle.com/datasets/mohamedmaher5/vehicle-classification
    Explore at:
    Available download formats: zip (866783573 bytes)
    Dataset updated
    Aug 9, 2024
    Authors
    Mohamed Maher
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview: This dataset is designed for vehicle classification tasks and contains a total of 5,600 images distributed across seven categories. Each category represents a different type of vehicle.

    Structure:

    • Main Folder: Vehicles
    • Subfolders:
      • Auto Rickshaws (800 images)
      • Bikes (800 images)
      • Cars (800 images)
      • Motorcycles (800 images)
      • Planes (800 images)
      • Ships (800 images)
      • Trains (800 images)

    Image Format: All images are in JPEG format with the .jpg extension.

    Size: 5,600 images in total.

    Usage: Ideal for building and testing image classification models to distinguish between different types of vehicles.
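
The folder structure described above lends itself to a quick sanity check after download. The following Python sketch counts `.jpg` files per subfolder and reports any class that does not have the stated 800 images. The subfolder names are taken from the listing; the helper names `count_images` and `find_mismatches` are hypothetical, and the check assumes you pass the path of the extracted `Vehicles` folder.

```python
from pathlib import Path

# Subfolder names and the per-class count are taken from the description above.
EXPECTED_CLASSES = [
    "Auto Rickshaws",
    "Bikes",
    "Cars",
    "Motorcycles",
    "Planes",
    "Ships",
    "Trains",
]
EXPECTED_PER_CLASS = 800


def count_images(root):
    """Count .jpg files in each expected class subfolder of root."""
    root = Path(root)
    counts = {}
    for name in EXPECTED_CLASSES:
        class_dir = root / name
        if class_dir.is_dir():
            counts[name] = sum(
                1
                for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() == ".jpg"
            )
        else:
            counts[name] = 0  # a missing folder is reported as zero images
    return counts


def find_mismatches(counts, expected=EXPECTED_PER_CLASS):
    """Return only the classes whose image count differs from the expected one."""
    return {name: n for name, n in counts.items() if n != expected}
```

An empty result from `find_mismatches(count_images("Vehicles"))` would be consistent with the stated total of 7 × 800 = 5,600 images.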

  4. Intel Image Classification Dataset

    • cubig.ai
    zip
    Updated May 28, 2025
    Cite
    CUBIG (2025). Intel Image Classification Dataset [Dataset]. https://cubig.ai/store/products/301/intel-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    May 28, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Intel Image Classification Dataset contains natural scene images from various locations around the world and is labeled across six distinct categories.

    2) Data Utilization
    (1) Characteristics of the Intel Image Classification Dataset:
    • The dataset features a diverse range of scenes, including buildings, forests, glaciers, mountains, seas, and streets, allowing for testing model generalization across multiple real-world environments.
    • The data is organized into separate sets for training, testing, and prediction, making it straightforward to use for supervised learning tasks.

    (2) Applications of the Intel Image Classification Dataset:
    • Development of scene classification models: This dataset is suitable for training and evaluating deep learning models that can automatically classify different types of natural scenes, supporting applications in automated photo organization, environmental monitoring, and geolocation tasks.

  5. Street Image Classification Dataset

    • images.cv
    zip
    Updated Nov 27, 2025
    Cite
    (2025). Street Image Classification Dataset [Dataset]. https://images.cv/dataset/street-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 27, 2025
    License

    https://images.cv/license

    Description

    Labeled Street images suitable for training and evaluating computer vision and deep learning models.

  6. Shells or Pebbles: An Image Classification Dataset

    • cubig.ai
    zip
    Updated May 28, 2025
    Cite
    CUBIG (2025). Shells or Pebbles: An Image Classification Dataset [Dataset]. https://cubig.ai/store/products/299/shells-or-pebbles-an-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    May 28, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction
    • The Shells or Pebbles: An Image Classification Dataset is a computer vision dataset designed for a binary classification task that distinguishes between shells and pebbles. The dataset consists of two classes (Shells and Pebbles), and each image is used to determine whether the object is a shell or a pebble.

    2) Data Utilization
    (1) Characteristics of the Shells or Pebbles: An Image Classification Dataset:
    • The dataset is designed to help models learn and distinguish subtle visual differences between shells and pebbles, which often share similar shapes and textures.
    • It contains images captured under varied backgrounds and conditions, making it suitable for training models with strong generalization capabilities.

    (2) Applications of the Shells or Pebbles: An Image Classification Dataset:
    • Development of binary classification models (Shell vs. Pebble): The dataset can be used to train deep learning models that classify images as either shell or pebble.
    • Educational use for visual recognition tasks: This dataset is also suitable for training in shape-, texture-, and edge-based feature extraction and pattern recognition, making it a valuable resource for teaching and experimentation in computer vision.

  7. Cats and Dogs image classification

    • kaggle.com
    zip
    Updated Dec 20, 2022
    + more versions
    Cite
    Samuel Cortinhas (2022). Cats and Dogs image classification [Dataset]. https://www.kaggle.com/datasets/samuelcortinhas/cats-and-dogs-image-classification
    Explore at:
    Available download formats: zip (67566406 bytes)
    Dataset updated
    Dec 20, 2022
    Authors
    Samuel Cortinhas
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Over 1,000 images of cats and dogs scraped from Google Images. The task is to build a model that distinguishes cats from dogs in an image as accurately as possible.

    Image sizes range from roughly 100x100 pixels to 2000x1000 pixels.

    Image format is jpeg.

    Duplicates have been removed.
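
The description does not say how duplicates were removed. One simple way to double-check for byte-identical files after download is to group images by a content hash, as in the Python sketch below. This is only an assumed approach, not the author's method: it catches exact copies but not resized or re-encoded near-duplicates, and `find_exact_duplicates` is a hypothetical helper name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group JPEG files under root by content hash; return groups with more than one file."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

If the statement above holds for exact copies, calling `find_exact_duplicates` on the extracted dataset folder should return an empty list.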

  8. Data from: Satellite Image Classification Dataset

    • cubig.ai
    zip
    Updated May 28, 2025
    Cite
    CUBIG (2025). Satellite Image Classification Dataset [Dataset]. https://cubig.ai/store/products/290/satellite-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    May 28, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Satellite Image Classification Dataset is a benchmark image classification dataset constructed using satellite remote sensing imagery. It includes a total of four land surface classes (cloudy, desert, green_area, and water) collected from various sensor-based images and Google Maps snapshots. The dataset is designed for training and evaluating image-based scene recognition models.

    2) Data Utilization
    (1) Characteristics of the Satellite Image Classification Dataset:
    • The dataset was collected with the aim of automatic interpretation of satellite imagery and consists of a combination of sensor-based images and map snapshots, offering a realistic representation of real-world conditions.
    • All images are of fixed resolution and include diverse landform features, making the dataset suitable for classification experiments across different environments and for evaluating model generalization performance.

    (2) Applications of the Satellite Image Classification Dataset:
    • Land surface classification model training: Can be used in experiments to classify various types of terrain such as buildings, farmland, and roads.
    • Research and application in geospatial information analysis: Useful for developing models that support spatial decision-making through tasks such as land use monitoring, urban structure analysis, and land surface inference.

  9. TIME – Image Dataset – Classification

    • gts.ai
    json
    Updated Apr 21, 2024
    Cite
    GTS (2024). TIME – Image Dataset – Classification [Dataset]. https://gts.ai/dataset-download/time-image-dataset-classification/
    Explore at:
    Available download formats: json
    Dataset updated
    Apr 21, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    A diverse image dataset containing clock faces with varying styles, angles, and hand positions, split into training, testing, and validation subsets for accurate time recognition and image classification tasks.

  10. Cats And Dogs Image Classification Dataset

    • universe.roboflow.com
    zip
    Updated Mar 1, 2023
    + more versions
    Cite
    Workspace1 (2023). Cats And Dogs Image Classification Dataset [Dataset]. https://universe.roboflow.com/workspace1-aalti/cats-and-dogs-image-classification
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 1, 2023
    Dataset authored and provided by
    Workspace1
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Cats And Dogs
    Description

    Cats And Dogs Image Classification

    ## Overview
    
    Cats And Dogs Image Classification is a dataset for classification tasks - it contains Cats And Dogs annotations for 2,000 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  11. nsfw-image-classification

    • huggingface.co
    Updated Feb 4, 2024
    Cite
    Man (2024). nsfw-image-classification [Dataset]. https://huggingface.co/datasets/DarkyMan/nsfw-image-classification
    Explore at:
    Dataset updated
    Feb 4, 2024
    Authors
    Man
    License

    https://choosealicense.com/licenses/cc/

    Description

    THIS DATASET IS PROVIDED TO ANYONE WHO WISHES TO USE IT. THERE ARE NO RESTRICTIONS ON ITS USE.

  12. Sports balls multiclass image classification Dataset

    • cubig.ai
    zip
    Updated Jun 12, 2025
    Cite
    CUBIG (2025). Sports balls multiclass image classification Dataset [Dataset]. https://cubig.ai/store/products/437/sports-balls-multiclass-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 12, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Sports balls - multiclass image classification Dataset is a computer vision dataset for multi-class image classification, designed to classify images of balls used in various sports. The dataset consists of 15 categories, including basketballs, footballs (soccer), rugby balls, table tennis balls, and more.

    2) Data Utilization
    (1) Characteristics of the Sports balls - multiclass image classification Dataset:
    • Some balls in the dataset feature intentional visual alterations (e.g., balls painted to resemble other types), enabling a precise evaluation of a model’s generalization and discrimination capabilities.

    (2) Applications of the Sports balls - multiclass image classification Dataset:
    • Sports Ball Classification Model Development: This dataset can be used to train deep learning-based image classification models that automatically recognize and categorize various types of sports equipment.
    • Development of Sports-related Applications: The dataset is suitable for building sports equipment recognition systems, AR-based educational tools, and video-based sports analysis systems.

  13. Tree Nuts Image Classification Dataset

    • gts.ai
    json
    Updated Apr 21, 2024
    Cite
    GTS (2024). Tree Nuts Image Classification Dataset [Dataset]. https://gts.ai/dataset-download/tree-nuts-image-classification/
    Explore at:
    Available download formats: json
    Dataset updated
    Apr 21, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Tree Nuts Image Classification Dataset includes over 1,300 high-quality RGB images of 10 different tree nut types, including almonds, walnuts, pecans, and hazelnuts. Designed for computer vision and machine learning tasks, it helps improve classification accuracy and quality control in the nut industry.

  14. Full Image Classification Dataset

    • universe.roboflow.com
    zip
    Updated Mar 26, 2025
    Cite
    SP T8T9 Outdoor Farm (2025). Full Image Classification Dataset [Dataset]. https://universe.roboflow.com/sp-t8t9-outdoor-farm/full-image-classification
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 26, 2025
    Dataset authored and provided by
    SP T8T9 Outdoor Farm
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Plants GfZG
    Description

    Full Image Classification

    ## Overview
    
    Full Image Classification is a dataset for classification tasks - it contains Plants GfZG annotations for 382 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  15. Big Cats Image Classification Dataset 🦁

    • kaggle.com
    zip
    Updated Mar 29, 2023
    Cite
    iulia (2023). Big Cats Image Classification Dataset 🦁 [Dataset]. https://www.kaggle.com/datasets/patriciabrezeanu/big-cats-image-classification-dataset
    Explore at:
    Available download formats: zip (532304917 bytes)
    Dataset updated
    Mar 29, 2023
    Authors
    iulia
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains a curated collection of images featuring four distinct big cat species: lions, tigers, leopards, and cheetahs. The images were sourced using the DuckDuckGo search engine and are organized into separate directories for each animal. This dataset is ideal for machine learning and computer vision projects focused on image classification and species recognition. With this dataset, you can train and validate your models to accurately differentiate between these majestic big cats.

  16. Harvester Image Classification Dataset

    • images.cv
    zip
    Updated Nov 28, 2025
    Cite
    (2025). Harvester Image Classification Dataset [Dataset]. https://images.cv/dataset/harvester-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 28, 2025
    License

    https://images.cv/license

    Description

    Labeled Harvester images suitable for training and evaluating computer vision and deep learning models.

  17. Text, Tabular and Image Classification - Dataset - LDM

    • service.tib.eu
    • resodate.org
    Updated Dec 16, 2024
    Cite
    (2024). Text, Tabular and Image Classification - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/text--tabular-and-image-classification
    Explore at:
    Dataset updated
    Dec 16, 2024
    Description

    Text, tabular and image classification datasets

  18. News Image Classification Dataset

    • universe.roboflow.com
    zip
    Updated Oct 12, 2025
    Cite
    cnsh (2025). News Image Classification Dataset [Dataset]. https://universe.roboflow.com/cnsh/news-image-classification-qyopy
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 12, 2025
    Dataset authored and provided by
    cnsh
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    News
    Description

    News Image Classification

    ## Overview
    
    News Image Classification is a dataset for classification tasks - it contains News annotations for 1,992 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  19. Classroom Image Classification Dataset

    • images.cv
    zip
    Updated Dec 13, 2021
    Cite
    (2021). Classroom Image Classification Dataset [Dataset]. https://images.cv/dataset/classroom-image-classification-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 13, 2021
    License

    https://images.cv/license

    Description

    Labeled Classroom images suitable for training and evaluating computer vision and deep learning models.

  20. Amazon Image Classification Dataset

    • universe.roboflow.com
    zip
    Updated Apr 3, 2025
    + more versions
    Cite
    object detection (2025). Amazon Image Classification Dataset [Dataset]. https://universe.roboflow.com/object-detection-kv6yb/amazon-image-classification
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 3, 2025
    Dataset provided by
    Object detection
    Authors
    object detection
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    Amazon Image Classification

    ## Overview
    
    Amazon Image Classification is a dataset for object detection tasks - it contains Objects annotations for 3,619 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
Scientific Image Classification Dataset

A Comprehensive Repository of Scientific Imagery

Explore at:
18 scholarly articles cite this dataset (View in Google Scholar)