License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
The original dataset is from https://www.kaggle.com/datasets/andyczhao/covidx-cxr2
The data is separated based on the .txt file (see link) into positive and negative.
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array

# input_dir holds the original negative images; output_dir receives the augmented copies.
datagen = ImageDataGenerator(
    rescale=1./255,               # normalize pixel values
    rotation_range=20,            # rotation (per reference)
    zoom_range=0.2,               # zoom (per reference)
    width_shift_range=0.2,        # horizontal shift
    height_shift_range=0.2,       # vertical shift
    shear_range=0.2,              # add shear transformation
    brightness_range=(0.7, 1.3),  # wider brightness adjustment (reference used 0.3)
    horizontal_flip=True,
    fill_mode='nearest'
)

# Counts
current_count = len(os.listdir(input_dir))
target_count = 57199
required_augmented_count = target_count - current_count
print(f"Original negatives: {current_count}")
print(f"Required augmented images: {required_augmented_count}")

# Augment until the required number of images has been generated
augmented_count = 0
max_augmentations_per_image = 10  # tried 5 and 10; this dataset was generated with 10

for img_file in os.listdir(input_dir):
    img_path = os.path.join(input_dir, img_file)
    img = load_img(img_path, target_size=(480, 480))  # 480 x 480, following the reference
    img_array = img_to_array(img)
    img_array = img_array.reshape((1,) + img_array.shape)

    # Generate multiple augmentations per image
    i = 0
    for batch in datagen.flow(
        img_array,
        batch_size=1,
        save_to_dir=output_dir,
        save_prefix='aug',
        save_format='jpeg'
    ):
        i += 1
        augmented_count += 1
        if i >= max_augmentations_per_image:
            break
        if augmented_count >= required_augmented_count:
            break
    if augmented_count >= required_augmented_count:
        break
I tried different values of max_augmentations_per_image, and also leaving the parameter unset; both approaches generated augmented data (around 9,000 images) ...
positive_balanced:
```python
import os
import random

random.seed(42)
target_count = 20579
all_positive_images = os.listdir(positive_dir)
selected_positive_images = random.sample(all_positive_images, target_count)
```
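To materialize the balanced positive set, the selected files can then be copied into a separate folder. A minimal sketch; the positive_balanced_dir path is a hypothetical name, not part of the original description:

```python
import os
import shutil

positive_balanced_dir = "positive_balanced"  # hypothetical output folder
os.makedirs(positive_balanced_dir, exist_ok=True)

# Copy each sampled positive image into the balanced folder.
for fname in selected_positive_images:
    shutil.copy(os.path.join(positive_dir, fname),
                os.path.join(positive_balanced_dir, fname))
```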
This data comes entirely from the TensorFlow - Help Protect the Great Barrier Reef competition and should not be used outside of the competition! I do not own these images and to the extent possible want to ensure this complies with the terms of the competition - I believe it does. All users/viewers of this dataset should adhere to the terms & conditions of the competition.
I wanted an easily accessible repository of the cots images and not cots images to help with data augmentation and possibly improving the models in other ways. In the spirit of the competition I thought it made the most sense to make this available to the other competitors.
This notebook was used to pre-process / create this dataset: Cropped Crown of Thorns Dataset Builder. It walks through the steps in a readable way.
About the dataset:
* This dataset contains an equal number (11,898) of COTS and not-COTS .jpg images.
* These images come from cropping out the bounding-box regions from each video frame in the competition.
* Use this for data augmentation.
* Alternatively, if you're just getting started, try building binary classifiers for COTS vs. not-COTS if you want to build up the skill to create more complicated object detection models.
This comes directly from the [TensorFlow - Help Protect the Great Barrier Reef](https://www.kaggle.com/c/tensorflow-great-barrier-reef) competition. Alternative citations include:
Liu, J., Kusy, B., Marchant, R., Do, B., Merz, T., Crosswell, J., ... & Malpani, M. (2021). The CSIRO Crown-of-Thorn Starfish Detection Dataset. arXiv preprint arXiv:2111.14311.
See Notebook used to build this dataset here: Cropped Crown of Thorns Dataset Builder
This dataset contains images of five rice varieties: Arborio, Basmati, Ipsala, Jasmine, and Karacadag. The images are organized into separate folders for each class, making it suitable for supervised image classification tasks.
Number of classes: 5
Class names: Arborio, Basmati, Ipsala, Jasmine, Karacadag
Image size: 128x128 (resized for modeling)
Total images per class: 15,000
Dataset split:
Training: 70%
Validation: 15%
Test: 15%
The dataset can be used to train convolutional neural networks (CNNs) for rice variety classification. It supports applications in agriculture, food quality control, and AI-powered crop monitoring. Data augmentation techniques have been applied during model training to improve robustness.
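As one possible starting point, a minimal Keras sketch for a 5-class classifier on the 128x128 images; the directory path, augmentation choices, and layer sizes are assumptions, not part of the dataset description:

```python
import tensorflow as tf

# Load the class-per-folder training images (path assumed).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "rice_dataset/train", image_size=(128, 128), batch_size=32)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./255, input_shape=(128, 128, 3)),
    # Simple augmentation layers, active only during training.
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),  # Arborio, Basmati, Ipsala, Jasmine, Karacadag
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=10)
```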
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
There is no story behind this dataset; I just felt that I should also have a dataset.
The dataset contains top view of dice digits which can be used as an alternative to the MNIST dataset for digit recognition, a benchmark dataset for classification.
There are currently only 120 original images; attempts to augment the data have already been made through the TensorFlow data augmentation pipeline, which increased the dataset to about 1,600 images (with random rotations, crops, and other operations).
For the small dataset that we have here, the images were made from just two dice. The images of the dice are resized to be similar to that of the MNIST dataset for testing results on the already present models.
The images currently in the dataset are named as follows: {number}_{color of the dice**}_{transform angle}_{transformation direction*}
My aim is that the dataset should be big enough so as to not cause overfitting. The dataset should also be diverse enough so that the model for which it is used is accurate.
Although augmentation is a way to increase the dataset size, original images are preferred for their variability across the many variables that I might have neglected in my analysis.
*if the direction is necessary, it is mentioned
** Although the images are converted to grayscale, the color of the dice might be a feature that is required for some other analysis.
No one in particular comes to mind for acknowledgements, because each and every picture in this small dataset was manually edited by me, although I would like to help
The question that I have is whether this dataset can be used for image classification? My take on this problem: GitHub Implementation
License: unknown, https://choosealicense.com/licenses/unknown/
ImageNet-Sketch data set consists of 50000 images, 50 images for each of the 1000 ImageNet classes. We construct the data set with Google Image queries "sketch of _", where _ is the standard class name. We only search within the "black and white" color scheme. We initially query 100 images for every class, and then manually clean the pulled images by deleting the irrelevant images and images that are for similar but different classes. For some classes, there are less than 50 images after manually cleaning, and then we augment the data set by flipping and rotating the images.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset originates from a research project investigating microgesture-based text editing in virtual reality (VR). The dataset was collected as part of an evaluation of the MicroGEXT system, which enables precise and efficient text editing using small, subtle hand movements. The research aims to explore lightweight, ergonomic alternatives to traditional mid-air gesture interactions.
Data Collection Methods
• Hardware: The dataset was collected using the Meta Quest Pro VR headset, utilizing its XR Hand Tracking package to capture hand skeleton data at 72 Hz.
• Participants: 10 participants were recruited for gesture elicitation and evaluation.
• Procedure:
Technical & Non-Technical Information for Reusability
• The dataset is suitable for:
  • Gesture recognition research (static/dynamic gestures, sub-state segmentation).
  • Human-computer interaction (HCI) studies focusing on XR input methods.
  • Machine learning applications, including deep learning-based gesture classification.
• Reuse Considerations:
  • Compatible with Unity's XR Hand Tracking package and Python-based deep learning frameworks (e.g., PyTorch, TensorFlow).
  • Includes data augmentation scripts for expanding training datasets.
  • The Null class helps mitigate false activations in real-time applications.
This dataset was created to support machine learning research in clothing classification, particularly for smart wardrobe and laundry applications. Inspired by the digital wardrobe concept popularized in media such as Clueless (1995), the dataset contains three primary categories of clothing items:
- Tops: t-shirts, button-up shirts, sweaters, hoodies, and other upper garments.
- Bottoms: jeans, shorts, formal pants, long trousers, and other lower garments.
- Socks: long socks and short socks photographed in pairs and individually.
All images were self-collected using an iPhone camera in HEIC format and later converted to JPG/PNG. Backgrounds were removed manually using Canva and programmatically using Rembg with the U²-Net model. Augmentation techniques (rotation, flipping, cropping, brightness and contrast adjustments) were applied to increase dataset diversity.
- Raw images: 521 (200 tops, 200 bottoms, 121 socks)
- Final images after augmentation: ~1,900 (balanced across all classes)
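The programmatic background-removal step can be reproduced with the rembg package. A minimal sketch; the input and output paths are placeholders, not files from this dataset:

```python
from PIL import Image
from rembg import remove  # uses a U²-Net model under the hood

# Load a clothing photo, strip the background, and save the result with transparency.
input_image = Image.open("shirt_001.jpg")   # placeholder path
output_image = remove(input_image)
output_image.save("shirt_001_nobg.png")     # PNG keeps the alpha channel
```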
This dataset can be used for experiments in:
- Image classification
- Data augmentation pipelines
- Transfer learning (e.g., Teachable Machine, TensorFlow, PyTorch)
- Applied computer vision in smart wardrobe and smart home systems
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project focuses on developing an intelligent system capable of detecting and classifying diseases in plant leaves using image processing and deep learning techniques. Leveraging Convolutional Neural Networks (CNNs) and transfer learning, the system analyzes leaf images to identify signs of infection with high accuracy. It supports smart agriculture by enabling early disease detection, reducing crop loss, and providing actionable insights to farmers. The project uses datasets such as PlantVillage and integrates frameworks like TensorFlow, Keras, and PyTorch. The model can be deployed as a web or mobile application, offering a real-time solution for plant health monitoring in agricultural environments.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The WastePro dataset is a comprehensive, custom-curated collection designed for broad-spectrum waste classification using deep learning. It contains high-quality images of solid waste items spanning a wide range of categories, such as organic, plastic, metal, glass, e-waste, paper, cardboard, textiles, rubber, and more. Each image is labeled according to its waste type, enabling robust supervised learning for multi-class classification tasks.
Key Features:
- Diverse Categories: WastePro covers 9 distinct waste classes, ensuring representation of both common (organic, plastic, metal, glass) and less common (e-waste, textiles, rubber) waste types. This diversity supports the development of models capable of real-world, context-rich waste recognition.
- Image Quality & Structure: Images are RGB and standardized in size (commonly 224x224 pixels) to facilitate compatibility with modern convolutional neural networks. The dataset is organized in a directory structure suitable for direct loading with TensorFlow and Keras utilities.
- Data Augmentation Ready: The dataset supports augmentation techniques such as flipping, rotation, zoom, and contrast adjustments, which are essential for increasing model robustness and generalization to unseen waste images.
- Real-World Context: Images are collected from multiple sources and environments, including municipal solid waste streams, recycling centers, and public datasets. This ensures that models trained on WastePro are applicable to practical waste management scenarios.

Applications: WastePro is ideal for training and benchmarking deep learning models for automated waste sorting, recycling facility automation, smart bins, and environmental monitoring. Its comprehensive coverage and high-quality labeling make it a strong foundation for advancing research and deployment in intelligent waste management systems.
WastePro sets a new standard for waste classification datasets by combining curated intelligence, broad category coverage, and deployment-ready design.
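A minimal loading-and-augmentation sketch along the lines described above; the folder name, split fraction, and augmentation strengths are assumptions rather than part of the dataset:

```python
import tensorflow as tf

# Load the class-per-folder directory structure directly with Keras utilities.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "WastePro", image_size=(224, 224), batch_size=32,
    validation_split=0.2, subset="training", seed=42)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "WastePro", image_size=(224, 224), batch_size=32,
    validation_split=0.2, subset="validation", seed=42)

# Augmentations named in the card: flipping, rotation, zoom, contrast.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.2),
    tf.keras.layers.RandomContrast(0.2),
])
train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
```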
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
In this data-set, 39 different classes of plant leaf and background images are available. The data-set contains 61,486 images. We used six different augmentation techniques for increasing the data-set size. The techniques are image flipping, gamma correction, noise injection, PCA color augmentation, rotation, and scaling.
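Two of the techniques named above (gamma correction and noise injection) are easy to sketch in plain NumPy; the parameter values are illustrative, not those used to build the dataset:

```python
import numpy as np

def gamma_correction(img, gamma=1.5):
    """img: float array scaled to [0, 1]; gamma > 1 darkens, gamma < 1 brightens."""
    return np.clip(img ** gamma, 0.0, 1.0)

def noise_injection(img, sigma=0.02):
    """Add zero-mean Gaussian noise to a float image in [0, 1]."""
    noisy = img + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)
```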
A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio and multichannel audio. Can be integrated in training pipelines in e.g. Tensorflow/Keras or Pytorch. Has helped people get world-class results in Kaggle competitions. Is used by companies making next-generation audio products.
Need a Pytorch-specific alternative with GPU support? Check out torch-audiomentations!
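A minimal usage sketch following the library's documented quick-start; the waveform below is a random placeholder rather than real audio:

```python
import numpy as np
from audiomentations import Compose, AddGaussianNoise, TimeStretch, PitchShift, Shift

# Chain a few common waveform augmentations, each applied with 50% probability.
augment = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
    TimeStretch(min_rate=0.8, max_rate=1.25, p=0.5),
    PitchShift(min_semitones=-4, max_semitones=4, p=0.5),
    Shift(p=0.5),
])

samples = np.random.uniform(low=-0.5, high=0.5, size=32000).astype(np.float32)  # placeholder mono audio
augmented = augment(samples=samples, sample_rate=16000)
```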
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
The vision behind creating this dataset is to have a data set for classifying animal species. A lot of animal species can be included in this data set, which is why it gets revised regularly. This will help to create a machine-learning model that can accurately classify animal species.
This is an Animal Classification data set made for the multi-class image recognition task. The dataset contains 15 classes; these classes are:
The data is split into 6 directories:
Interesting Data: As the name suggests, this folder contains 5 interesting images per class. These are called interesting images because it will be fascinating to see which class the model assigns to these shots. Based on the model's prediction, we can gauge the model's understanding of that class.
Testing Data: This folder is filled with a random number of images per class. As the name indicates, this folder is purposely made to hold testing images, that is, images on which the model will be tested after training.
TFRecords Data: This folder contains the data in TensorFlow records format. All the images in the TFRecords files have already been resized to 256 x 256 pixels and normalized (see the loading sketch after this list).
Train Augmented: This time, an additional train-augmented directory is added to the data set. As per the name, this directory contains augmented images per class: 5 augmented images per original image, for a total of 10,000 augmented images per class. This is done to increase the data set size because, as the total number of classes grows, the model complexity increases, and thus we require more data to train the model. The best way to get more data is data augmentation. It is highly recommended to shuffle the data before/after loading it.
Training Images: Each class is filled with 2,000 images for training purposes. This is the data used for training the model. All the images are resized to 256 by 256 pixels and normalized to have an input pixel range of 0 to 1.
Validation Images: This folder contains 100 or 200 images per class, intentionally created for validation purposes. Images from this directory will be used during training to validate the model's performance.
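A hedged sketch of reading that TFRecords split with tf.data; the file pattern and feature keys ("image", "label") are assumptions made for illustration, so inspect the actual files for the real schema before relying on them:

```python
import tensorflow as tf

# Assumed schema: a JPEG-encoded image and an integer class label per example.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example["image"], channels=3)  # already 256 x 256 per the card
    image = tf.cast(image, tf.float32) / 255.0               # skip if the files store normalized pixels
    return image, example["label"]

files = tf.io.gfile.glob("TFRecords Data/train*.tfrec")  # assumed file pattern
ds = (tf.data.TFRecordDataset(files)
      .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
      .shuffle(2048)
      .batch(32)
      .prefetch(tf.data.AUTOTUNE))
```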
DeepNets
License: Attribution-ShareAlike 3.0 (CC BY-SA 3.0), https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Instead of having individual patient folders, 2 folders, namely 0 (non-IDC) and 1 (IDC), have been created to contain the images for easy loading into memory or into TensorFlow dataset implementations.
Obtained from https://www.kaggle.com/datasets/paultimothymooney/breast-histopathology-images
Just a piece of advice: use data augmentation or a similar technique, since the data is imbalanced.
Context: Invasive Ductal Carcinoma (IDC) is the most common subtype of all breast cancers. To assign an aggressiveness grade to a whole mount sample, pathologists typically focus on the regions which contain the IDC. As a result, one of the common pre-processing steps for automatic aggressiveness grading is to delineate the exact regions of IDC inside of a whole mount slide.
Content The original dataset consisted of 162 whole mount slide images of Breast Cancer (BCa) specimens scanned at 40x. From that, 277,524 patches of size 50 x 50 were extracted (198,738 IDC negative and 78,786 IDC positive). Each patch's file name is of the format u_xX_yY_classC.png, for example 10253_idx5_x1351_y1101_class0.png, where u is the patient ID (10253_idx5), X is the x-coordinate of where this patch was cropped from, Y is the y-coordinate of where this patch was cropped from, and C indicates the class, where 0 is non-IDC and 1 is IDC.
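A small sketch for pulling the patient ID, coordinates, and class out of those file names; the helper name is ours, not part of the dataset:

```python
import re

PATCH_RE = re.compile(r"^(?P<patient>.+)_x(?P<x>\d+)_y(?P<y>\d+)_class(?P<cls>[01])\.png$")

def parse_patch_name(fname):
    """Return (patient_id, x, y, label) parsed from a patch file name."""
    m = PATCH_RE.match(fname)
    if m is None:
        raise ValueError(f"Unexpected file name: {fname}")
    return m["patient"], int(m["x"]), int(m["y"]), int(m["cls"])

print(parse_patch_name("10253_idx5_x1351_y1101_class0.png"))
# ('10253_idx5', 1351, 1101, 0)
```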
Acknowledgements The original files are located here: http://gleason.case.edu/webdata/jpi-dl-tutorial/IDC_regular_ps50_idx5.zip Citation: https://www.ncbi.nlm.nih.gov/pubmed/27563488 and http://spie.org/Publications/Proceedings/Paper/10.1117/12.2043872
Inspiration Breast cancer is the most common form of cancer in women, and invasive ductal carcinoma (IDC) is the most common form of breast cancer. Accurately identifying and categorizing breast cancer subtypes is an important clinical task, and automated methods can be used to save time and reduce error.
A big thank you to my GitHub Sponsors for their support!
In addition to the sponsors at the link above, I've received hardware and/or cloud resources from * Nvidia (https://www.nvidia.com/en-us/) * TFRC (https://www.tensorflow.org/tfrc)
I'm fortunate to be able to dedicate significant time and money of my own supporting this and other open source projects. However, as the projects increase in scope, outside support is needed to continue with the current trajectory of hardware, infrastructure, and electricity costs.
Recent updates:
* timm bits branch: .data updates, a bit more consistency, unit tests for all!
* efficientnetv2_rw_t weights, a custom 'tiny' 13.6M param variant that is a bit better than (non NoisyStudent) B3 models. Both faster and better accuracy (at same or lower res).
* ViT SAM B/16 (vit_base_patch16_sam_224) and B/32 (vit_base_patch32_sam_224) models.
* jx_nest_base - 83.534, jx_nest_small - 83.120, jx_nest_tiny - 81.426.
* gmlp_s16_224 trained to 79.6 top-1, matching paper. Hparams for this and other recent MLP training here.
* vit_large_patch16_384 (87.1 top-1), vit_large_r50_s32_384 (86.2 top-1), vit_base_patch16_384 (86.0 top-1).
* vit_deit_* renamed to just deit_*.
* gmixer_24_224 MLP w/ GLU, 78.1 top-1 w/ 25M params.
* eca_nfnet_l2 weights from my 'lightweight' series. 84.7 top-1 at 384x384.
License: MIT, https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains hand gesture images for sign language recognition, focusing on 5 commonly used phrases. The images are preprocessed, cropped, and ready for training deep learning models for real-time sign language detection applications.
| Class ID | Meaning | Description |
|---|---|---|
| 0 | Yes | Affirmative gesture |
| 1 | No | Negative gesture |
| 2 | I Love You | Expression of affection |
| 3 | Hello | Greeting gesture |
| 4 | Thank You | Gratitude expression |
data_final/
├── train/
│   ├── 0/   # Yes (~150 images)
│   ├── 1/   # No (~150 images)
│   ├── 2/   # I Love You (~150 images)
│   ├── 3/   # Hello (~150 images)
│   └── 4/   # Thank You (~150 images)
├── val/
│   ├── 0/
│   ├── 1/
│   ├── 2/
│   ├── 3/
│   └── 4/
└── test/
    ├── 0/
    ├── 1/
    ├── 2/
    ├── 3/
    └── 4/
This dataset is suitable for:
Sign Language Recognition Models
Computer Vision Research
Educational Projects
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)

train_gen = datagen.flow_from_directory(
    'data_final/train',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

val_gen = datagen.flow_from_directory(
    'data_final/val',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

train_dataset = datasets.ImageFolder('data_final/train', transform=transform)
val_dataset = datasets.ImageFolder('data_final/val', transform=transform)
Using transfer learning with MobileNetV2/EfficientNetB0: - Expected Accuracy: 90-97% - Training Time: 20-40 minutes (GPU) - Model Size: ~15 MB
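A minimal transfer-learning sketch along those lines, reusing the Keras generators above; the head layers, optimizer, and epoch count are illustrative assumptions:

```python
import tensorflow as tf

# Frozen MobileNetV2 backbone with a small classification head for the 5 phrases.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # matches class_mode='categorical' above
              metrics=["accuracy"])
# model.fit(train_gen, validation_data=val_gen, epochs=20)
```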
For better generalization, use these augmentation techniques:
```python
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=25,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    zoom_range=0.2,
    horizontal_flip=True,
    brightness_range=[0.7, 1.3]
)
```
If you use this dataset in your research or project, please cite:
@dataset{sign_language_5phrases_2025,
  title={Sign Language Recognition Dataset - 5 Essential Phrases},
  author={[Your Name]},
  year={2025},
  publisher={Kaggle},
  url={[Dataset URL]}
}
This dataset is released under [Choose one]:
- CC BY 4.0 (Attribution) - Recommended
- CC BY-SA 4.0 (Attribution-ShareAlike)
- CC0 1.0 (Public Domain)
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
Version 2 extends version 1 of the fast food classification data set and introduces some new classes with new images. These new classes are:
* Baked Potato
* Crispy Chicken
* Fries
* Taco
* Taquito
The data set is divided into 4 parts: the TensorFlow Records, Training Data, Validation Data, and Testing Data. The TensorFlow Records directory is further divided into 3 parts: Train, Valid, and Test. These images are resized to 256 by 256 pixels. No other augmentation is applied. While loading the TFRecord files, you can apply any augmentation you want.
Train : Contains 15,000 training images, with each class having 1,500 images.
Valid : Contains 3,500 validation images, with each class having 400 images.
Test : Contains 1,500 test images, with each class having 100 or 200 images.
Unlike the TensorFlow records data, the Training Data, Validation Data and Testing Data directories contain the images directly. These are raw images, so any kind of augmentation, and especially resizing, can be applied to them.
Training Data : This directory contains 10 subdirectories, each representing a class. Each class has 1,500 training images.
Validation Data : This directory also contains 10 subdirectories, each representing a class. Each class has 400 images for monitoring the model's performance.
Testing Data : This directory also contains 10 subdirectories, each representing a class. Each class has 100 or 200 images for evaluating the model's performance.
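For the raw-image splits, a minimal loading sketch with on-the-fly resizing and light augmentation; the directory path and augmentation settings are assumptions:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    zoom_range=0.2,
    horizontal_flip=True,
)
train_gen = train_datagen.flow_from_directory(
    "Fast Food Classification V2/Train",  # assumed path; adjust to the actual folder layout
    target_size=(256, 256),               # raw images are resized on the fly
    batch_size=32,
    class_mode="categorical",
)
```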
This is a Fast Food Classification data set containing images of 5 different types of fast food. Each directory represents a class, and each class represents a food type. The classes are:
* Burger
* Donut
* Hot Dog
* Pizza
* Sandwich
The data set is divided into 3 parts: the TensorFlow records, the Training data set and the Validation data set.
* The TensorFlow records directory is further divided into 2 parts, the training images and the validation images. These images are resized to 256 by 256 pixels. No other augmentation is applied. While loading the TFRecord files, you can apply any augmentation you want.
* Training Images : Contains 7,500 training images, with each class having 1,500 images.
* Validation Images : Contains 2,500 validation images, with each class having 500 images.
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
FER2013 (Facial Expression Recognition 2013) dataset is a widely used dataset for training and evaluating facial expression recognition models. Here are key details about the FER2013 dataset:
Overview:
FER2013 is a dataset designed for facial expression recognition tasks, particularly the classification of facial expressions into seven different emotion categories. The dataset was introduced for the Emotion Recognition in the Wild (EmotiW) Challenge in 2013.
Emotion Categories:
The dataset consists of images labeled with seven emotion categories: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral.
Image Size:
Each image in the FER2013 dataset is grayscale and has a resolution of 48x48 pixels.
Number of Images:
The dataset contains a total of 35,887 labeled images, with approximately 5,000 images per emotion category.
Partitioning:
FER2013 is often divided into training, validation, and test sets. The original split has 28,709 images for training, 3,589 images for validation, and 3,589 images for testing.
Usage in Research:
FER2013 has been widely used in research for benchmarking and training facial expression recognition models, particularly deep learning models. It provides a standard dataset for evaluating the performance of models on real-world facial expressions.
Challenges:
The FER2013 dataset is known for its relatively simple and posed facial expressions. In real-world scenarios, facial expressions can be more complex and spontaneous, and there are datasets addressing these challenges.
Challenges and Criticisms:
Some criticisms of the dataset include its relatively small size, limited diversity in facial expressions, and the fact that some expressions (e.g., "Disgust") are challenging to recognize accurately.
This pre-trained model implements a Convolutional Neural Network (CNN) for emotion detection using the TensorFlow and Keras frameworks. The model architecture includes convolutional layers, batch normalization, and dropout for effective feature extraction and classification. The training process utilizes an ImageDataGenerator for data augmentation, enhancing the model's ability to generalize to various facial expressions.
Key Steps:
Model Training: The CNN model is trained on an emotion dataset using an ImageDataGenerator for dynamic data augmentation. Training is performed over a specified number of epochs with a reduced batch size for efficient learning.
Model Checkpoint: ModelCheckpoint is employed to save the best-performing model during training, ensuring that the most accurate model is retained.
Save Model and Memory Cleanup: The trained model is saved in both HDF5 and JSON formats. Memory is efficiently managed by deallocating resources, clearing the Keras session, and performing garbage collection.
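A hedged sketch of the checkpointing, saving, and cleanup steps described above; the tiny model, monitored metric, and file names are illustrative assumptions:

```python
import gc
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import ModelCheckpoint

# Stand-in model for the 48x48 grayscale FER2013 inputs and 7 emotion classes.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(48, 48, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(7, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Keep only the best-performing weights seen during training.
checkpoint = ModelCheckpoint("best_emotion_model.h5", monitor="val_accuracy", save_best_only=True)
# model.fit(train_generator, validation_data=val_generator, epochs=50, callbacks=[checkpoint])

# Save the architecture as JSON and the full model as HDF5.
with open("emotion_model.json", "w") as f:
    f.write(model.to_json())
model.save("emotion_model.h5")

# Free memory: clear the Keras session and run garbage collection.
K.clear_session()
gc.collect()
```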
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Welcome to the Vehicle Detection Image Dataset! This dataset is meticulously curated for object detection and tracking tasks, with a specific focus on vehicle detection. It serves as a valuable resource for researchers, developers, and enthusiasts seeking to advance the capabilities of computer vision systems.
The primary aim of this dataset is to facilitate precise object detection tasks, particularly in identifying and tracking vehicles within images. Whether you are engaged in academic research, developing commercial applications, or exploring the frontiers of computer vision, this dataset provides a solid foundation for your projects.
Both versions of the dataset undergo essential preprocessing steps, including resizing and orientation adjustments. Additionally, the Apply_Grayscale version undergoes augmentation to introduce grayscale variations, thereby enriching the dataset and improving model robustness.
To ensure compatibility with a wide range of object detection frameworks and tools, each version of the dataset is available in multiple formats:
These formats facilitate seamless integration into various machine learning frameworks and libraries, empowering users to leverage their preferred development environments.
In addition to image datasets, we also provide a video for real-time object detection evaluation. This video allows users to test the performance of their models in real-world scenarios, providing invaluable insights into the effectiveness of their detection algorithms.
To begin exploring the Vehicle Detection Image Dataset, simply download the version and format that best suits your project requirements. Whether you are an experienced practitioner or just embarking on your journey in computer vision, this dataset offers a valuable resource for advancing your understanding and capabilities in object detection and tracking tasks.
If you utilize this dataset in your work, we kindly request that you cite the following:
Parisa Karimi Darabi. (2024). Vehicle Detection Image Dataset: Suitable for Object Detection and tracking Tasks. Retrieved from https://www.kaggle.com/datasets/pkdarabi/vehicle-detection-image-dataset/
I welcome feedback and contributions from the Kaggle community to continually enhance the quality and usability of this dataset. Please feel free to reach out if you have suggestions, questions, or additional data and annotations to contribute. Together, we can drive innovation and progress in computer vision.
License: Database Contents License (DbCL) v1.0, http://opendatacommons.org/licenses/dbcl/1.0/
The dataset consists of images of famous (or not-so-famous) landmarks. The collection is organized into a two-level hierarchy: the first level is the category of the landmark, and the second level is the individual landmark. There are 6 categories:
1. Gothic
2. Modern
3. Mughal
4. Neoclassical
5. Pagodas
6. Pyramids
For each category, there are 5 landmarks, for a total of 30 landmarks. Each landmark has 14 images.
The landmarks dataset is too small to train convolutional neural networks (CNNs) from scratch. The resulting network will overfit the data. Instead, use transfer learning by reusing part of a pre-trained CNN. In transfer learning, instead of training the neural network starting from random weights, the weights for the lower parts of the network are taken from a pre-trained network. Only the higher parts of the network will have to be learned. Chapter 14 of Géron discusses how to apply pre-trained models for transfer learning.
For this group project, the only allowed pre-trained networks are EfficientNetB0 and VGG16, which are smaller CNNs. The objective of this restriction is to avoid penalizing groups that do not have access to powerful machines and/or machines with GPUs. Groups are allowed to use Google Colab with GPUs to train the models, but be aware of resource usage limitations.
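A hedged sketch of that setup with one of the allowed backbones (VGG16); the 30-way output matches the landmark count above, while the input size, head layers, and optimizer are illustrative assumptions:

```python
import tensorflow as tf

# Reuse VGG16's pretrained convolutional base and learn only a new head.
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False  # keep the lower, pretrained layers fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(30, activation="softmax"),  # 30 individual landmarks
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```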
Data augmentation is another way to overcome the problem of small datasets. Keras/TensorFlow provides various image manipulation functions (https://www.tensorflow.org/api_docs/python/tf/image) that can be used to generate additional images. Refer to Lecture 9 slides and Chapter 14 of Géron.
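For example, a small tf.image sketch along those lines; the specific transforms and parameters are illustrative:

```python
import tensorflow as tf

def augment(image, label):
    # Random flip, brightness, and contrast, then a random crop back to the model's input size.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    image = tf.image.resize(image, (250, 250))
    image = tf.image.random_crop(image, size=(224, 224, 3))
    return image, label

# train_ds = train_ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
```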
Yet another way to overcome the small dataset problem is experimenting with various ways of combining the models for the two tasks. It is possible to train two distinct models, one for category classification and one for landmark classification. But would landmark classification benefit from knowing the output of category classification? Or vice versa?
License: MIT, https://opensource.org/licenses/MIT
License information was derived automatically
Using a unique dataset (the FGVL Dataset) collected from Sultana seedless grape vineyards in the Aegean Region of Turkey, an instance segmentation model has been developed to classify frost-damaged leaves and grape clusters at the pixel level. The dataset includes 418 frost-damaged grapes, 510 frost-damaged leaves, 395 healthy grapes, and 698 healthy leaves, collected after a severe frost event in April 2025 at a vineyard in Manisa. The images were captured in high resolution under natural lighting conditions and manually labeled by experts.
Participants must use the FGVL Dataset to develop deep learning models for instance segmentation of frost-damaged and healthy grape leaves and clusters.
You are free to use any image processing or deep learning framework (e.g., YOLOv11, PyTorch, TensorFlow) and apply data augmentation, model tuning, and evaluation techniques.
Submissions will be evaluated based on mAP@50 and mAP@50-95 metrics on the test set.
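As a hedged starting point using the Ultralytics framework mentioned above; the checkpoint name, dataset config file, and training settings are assumptions:

```python
from ultralytics import YOLO

# Train a segmentation model on the FGVL classes (frost-damaged / healthy
# grapes and leaves); "fgvl.yaml" is a hypothetical dataset config file.
model = YOLO("yolo11n-seg.pt")
model.train(data="fgvl.yaml", epochs=100, imgsz=640)

# Validation reports box/mask mAP@50 and mAP@50-95, matching the evaluation metrics.
metrics = model.val()
```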