https://cdla.io/sharing-1-0/
The complete set of images has been classified into two classes: healthy and diseased. First, the acquired images were classified and labeled according to plant, with the plants named P0 to P11. The entire dataset was then divided into 22 subject categories ranging from 0000 to 0022: classes 0000 to 0011 were marked as healthy, and classes 0012 to 0022 as diseased. The collection contains about 4503 images, of which 2278 show healthy leaves and 2225 show diseased leaves. Twelve plants were selected: Mango, Arjun, Alstonia Scholaris, Guava, Bael, Jamun, Jatropha, Pongamia Pinnata, Basil, Pomegranate, Lemon, and Chinar. Leaf images of these plants in healthy and diseased condition were acquired and divided between two separate modules.
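The category-to-class rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `label_from_category` is hypothetical, and it assumes the category codes appear as zero-padded strings such as `0007`.

```python
def label_from_category(code: str) -> str:
    """Map a category code such as '0007' to 'healthy' or 'diseased'.

    Assumption: codes are zero-padded digit strings, as in the description.
    """
    value = int(code)
    if 0 <= value <= 11:
        return "healthy"    # categories 0000 to 0011
    if 12 <= value <= 22:
        return "diseased"   # categories 0012 to 0022
    raise ValueError(f"category code out of range: {code!r}")
```

For example, `label_from_category("0012")` returns `"diseased"`.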
Dataset Title: Plant Leaf Image Dataset
Description:
The Plant Leaf Image Dataset is a collection of high-quality images focusing on various plant leaves, aimed at supporting research and development in plant health monitoring, disease detection, and species identification. This dataset contains images that capture different plant species under varying conditions, allowing for diverse applications in agriculture, botany, and AI-based plant recognition.
Key Features:
- Diversity: The dataset includes images from multiple plant species, providing a broad range for identifying and categorizing plant types.
- Image Quality: High-resolution images ensure that the leaf textures, colors, and unique patterns are clear, making the dataset suitable for machine learning tasks.
- Potential Use Cases: The dataset can be used for building and training models for plant species identification, disease detection, leaf classification, and agricultural monitoring tools.
Applications:
This dataset is particularly valuable for AI practitioners and researchers focused on agriculture-related projects, especially for those developing models in plant recognition, disease classification, and monitoring plant health. With the right preprocessing techniques, it can also serve as a base for projects aiming to improve crop management, sustainability, and yield predictions.
Format:
- Image files in standard formats (e.g., JPG or PNG).
- Organized into folders based on plant type or condition for easy access and utilization.
This dataset is ready for integration into machine learning pipelines for training and evaluation in various agriculture and plant-related AI applications.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pl@ntNet-300K is an image dataset aimed at evaluating set-valued classification. It was built from the database of the Pl@ntNet citizen observatory and consists of 306146 images covering 1081 species. We highlight two particular features of the dataset, inherent to the way the images are acquired and to the intrinsic diversity of plant morphology:
i) The dataset exhibits a strong class imbalance, meaning that a few species represent most of the images.
ii) Many species are visually similar, making identification difficult even for the expert eye.
These two characteristics make the present dataset a good candidate for the evaluation of set-valued classification methods and algorithms. Therefore, we recommend two set-valued evaluation metrics associated with the dataset (top-K and average-K) and we provide the results of a baseline approach based on a resnet50 trained with a cross-entropy loss. The full description of the dataset can be found in (to be provided soon).
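To make the two recommended metrics concrete, here is a minimal pure-Python sketch. It is not the authors' implementation (their utilities are in the repository linked in this description). Top-K accuracy checks whether the true label is among the K highest-scoring classes of each sample. Average-K accuracy is sketched under the assumption that a single score threshold is shared by all samples, chosen so that the predicted sets contain K classes on average; tied scores can make the sets slightly larger. The function names are hypothetical.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k best-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


def average_k_accuracy(scores, labels, k):
    """Set-valued accuracy with one global threshold.

    Assumption: the threshold is the (n * k)-th largest score over all
    samples and classes, so the predicted sets hold k classes on average.
    """
    n = len(labels)
    flat = sorted((s for row in scores for s in row), reverse=True)
    budget = max(1, min(int(n * k), len(flat)))
    threshold = flat[budget - 1]
    # A sample counts as a hit when its true class clears the threshold.
    hits = sum(1 for row, label in zip(scores, labels) if row[label] >= threshold)
    return hits / n
```

With a shared threshold, confident samples receive small sets and ambiguous samples receive larger ones, which suits a dataset where many species are visually similar.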
The scientific publication (NEURIPS 2022) describing the dataset and providing baseline results can be found here: https://openreview.net/forum?id=eLYinD0TtIt
Utilities to load the data and train models with PyTorch can be found here: https://github.com/plantnet/PlantNet-300K/
Fruit and vegetable plants are vulnerable to diseases that can negatively affect crop yield, causing planters to incur significant losses. These diseases can affect the plants at various stages of growth. Planters must be on constant watch to prevent them early, or infestation can spread and become severe and irrecoverable. There are many types of pest infestations of fruits and vegetables, and identifying them manually for appropriate preventive measures is difficult and time-consuming.
This pretrained model can be deployed to identify plant diseases efficiently for carrying out suitable pest control. The training data for the model primarily includes images of leaves of diseased and healthy fruit and vegetable plants. It can classify the multiple categories of plant infestation or healthy plants from the images of the leaves.
Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Fine-tuning the model
This model can be fine-tuned using the Train Deep Learning Model tool. Follow the guide to fine-tune this model.
Input
8-bit, 3-band (RGB) image. Recommended image size is 224 x 224 pixels. Note: Input images should have a grey or solid color background with one full leaf per image.
Output
Classified image of the leaf with any of the plant disease, healthy leaf, or background classes as in the Plant Leaf Diseases dataset.
Applicable geographies
This model is expected to work well in all regions globally. However, results can vary for images that are statistically dissimilar to training data.
Model architecture
This model uses the ResNet50 model architecture implemented in ArcGIS API for Python.
Accuracy metrics
This model has an overall accuracy of 97.88 percent. The confusion matrix below summarizes the performance of the model on the validation dataset.
Sample results
Here are a few results from the model:
Ground truth: Apple_black_rot / Prediction: Apple_black_rot
Ground truth: Potato_early_blight / Prediction: Potato_early_bight
Ground truth: Raspberry_healthy / Prediction: Raspberry_healthy
Ground truth: Strawberry_leaf_scorch / Prediction: Strawberry_leaf_scorch
Description: This dataset offers an extensive collection of images and corresponding labels representing a wide array of plant diseases. Carefully curated from publicly available sources, it serves as a valuable resource for developing and evaluating machine learning models, particularly in the realms of image classification and plant disease detection. Dataset Composition: • Images: The dataset comprises high-quality images organized by plant species and disease… See the full description on the dataset page: https://huggingface.co/datasets/gtsaidata/Plant-Disease-Image-Dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Two datasets and one table are uploaded under the title "MED117_Medicinal Plant Leaf Dataset & Name Table". The top-level folder, "MED 117 Leaf Species", contains two subfolders: " Raw leaf image set of medicinal plants_v2" and "Segmented leaf set using UNET segmentation". The raw leaf image set consists of leaf images of 117 medicinal plants found in Assam. All samples were collected by visiting government, public, and private medicinal gardens in different parts of Assam, as well as other places where the plants are commonly found. Videos of 10 to 15 seconds were recorded for two to three leaves of every species on a white background using a Canon SLR camera. The individual videos were split into frames, yielding around 77,700 JPG image frames. In the raw leaf image set, each folder is named with the scientific name followed by the common name in brackets. The second folder, "Segmented leaf set using UNET segmentation", contains segmented leaf image samples for 115 medicinal plant species, produced with the UNET segmentation technique. Two species from the original dataset were excluded because of the small, unpredictable size of their samples, which leaves 115 subfolders in the segmented folder. Finally, a table in doc format titled "Medicinal Plant Name Table" lists the scientific name, common name, and Assamese name of the plants in the same sequence as the folders. The whole contribution is original: the samples were collected from different sources and processed for segmentation, and the table was prepared in consultation with taxonomy experts from the Botany department of Gauhati University, Guwahati, Assam, India.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A dataset of 61,486 images of plant leaves and backgrounds, with each image labeled with the disease or pest that is present. The dataset was created by researchers at the University of Wisconsin-Madison and is used for research in machine learning and computer vision tasks such as plant disease detection and pest identification.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Egyptian Plant Leaf Image Dataset (EPLID) is a plant leaf image-based computer vision dataset developed for the purposes of monitoring plant health, detecting diseases, and identifying plant species. It consists of eight distinct classes (Apple, Berry, Fig, Guava, Orange, Palm, Persimmon, and Tomato), each organized into separate folders for convenient labeling and model training.
2) Data Utilization
(1) Characteristics of the Egyptian Plant Leaf Image Dataset (EPLID):
• The dataset contains images captured in real-world conditions, making it well-suited for the development of AI models that can be applied in practical agricultural environments.
• Each image clearly presents the leaf's texture, color, and venation, enabling high-precision applications in plant recognition and disease detection tasks.
(2) Applications of the Egyptian Plant Leaf Image Dataset (EPLID):
• Development of AI models for plant disease detection: The dataset can be used to train deep learning models that automatically identify plant diseases by learning abnormal leaf patterns such as spots, discoloration, and surface damage.
• Construction of crop classification and cultivar identification systems: The dataset can support the development of models that classify different crop types and identify plant varieties based on their leaf characteristics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a comprehensive version of the Eggplant Leaf Image Dataset, designed to support machine learning and deep learning research in agriculture, plant pathology, and computer vision. This dataset addresses class imbalance and model generalization challenges by including a significantly expanded collection of images through controlled data augmentation.
The dataset includes a total of 2,180 high-resolution images (6000×4000 pixels), categorized into six disease or health classes of Solanum melongena (eggplant) leaves:
| Class | Original Images | Augmented Images | Total Images |
|---|---|---|---|
| Healthy | 80 | 320 | 400 |
| Insect-Pest | 40 | 320 | 360 |
| Leaf-Spot | 50 | 300 | 350 |
| Mosaic-Virus | 15 | 345 | 360 |
| Small-Leaf | 20 | 340 | 360 |
| Wilt | 50 | 300 | 350 |
All original images were captured using a Canon EOS 1300D DSLR camera under consistent natural lighting conditions. Files are saved in JPG format, and image resolution is preserved within ±5% of the original dimensions to maintain visual fidelity.
To improve dataset usability for robust model training and generalization, controlled data augmentation was applied using the Albumentations library. The transformations include random rotation, horizontal flipping, brightness/contrast adjustments, slight color shifts, and padding to maintain aspect ratio. All augmentation procedures were consistently applied and seeded for reproducibility. Augmentation parameters are documented in detail in the metadata.
The metadata.csv file provides a class-wise summary including original image count, augmented image count, augmentation ratios, and the exact augmentation pipeline used.
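The class-wise summary can be reproduced from the counts in the table. The sketch below is illustrative only: it hard-codes the table values rather than reading metadata.csv (whose column names are not given here), and it assumes "augmentation ratio" means augmented images per original image.

```python
# (original, augmented) image counts per class, copied from the table.
COUNTS = {
    "Healthy": (80, 320),
    "Insect-Pest": (40, 320),
    "Leaf-Spot": (50, 300),
    "Mosaic-Virus": (15, 345),
    "Small-Leaf": (20, 340),
    "Wilt": (50, 300),
}


def summarize(counts):
    """Return per-class totals and augmentation ratios (augmented / original)."""
    summary = {}
    for name, (original, augmented) in counts.items():
        summary[name] = {
            "total": original + augmented,
            "ratio": augmented / original,  # assumed definition of the ratio
        }
    return summary


summary = summarize(COUNTS)
grand_total = sum(row["total"] for row in summary.values())
```

The per-class totals add up to the 2,180 images stated for the dataset, and the ratios show how strongly the rarest class (Mosaic-Virus, 15 originals) depends on augmentation.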
Note: The original and augmented images are stored in separate folders under the "Original" and "Augmented" directories, respectively. Each directory is organized into six class-specific subfolders: Healthy, Insect-Pest, Leaf-Spot, Mosaic-Virus, Small-Leaf, and Wilt. Augmented images are clearly distinguishable by the inclusion of the substring "_aug_" in their filenames. This clear separation ensures reproducibility, transparency in data provenance, and ease of use for researchers who may wish to train models using only original, only augmented, or both types of data.
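Given this layout, selecting only original, only augmented, or both kinds of images takes a short directory walk. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the files carry a `.jpg` or `.jpeg` extension.

```python
from pathlib import Path

CLASSES = ["Healthy", "Insect-Pest", "Leaf-Spot", "Mosaic-Virus", "Small-Leaf", "Wilt"]


def is_augmented(filename: str) -> bool:
    """Augmented files are marked by the substring '_aug_' in their name."""
    return "_aug_" in filename


def collect_images(root, use_original=True, use_augmented=True):
    """Return (path, class_name) pairs from the Original/Augmented folders."""
    folders = []
    if use_original:
        folders.append("Original")
    if use_augmented:
        folders.append("Augmented")
    selected = []
    for folder in folders:
        for cls in CLASSES:
            class_dir = Path(root) / folder / cls
            if not class_dir.is_dir():
                continue  # tolerate partially downloaded archives
            for path in sorted(class_dir.iterdir()):
                if path.suffix.lower() in {".jpg", ".jpeg"}:
                    selected.append((path, cls))
    return selected
```

Calling `collect_images(root, use_augmented=False)` yields a list built from the original photographs only, which is useful for a held-out test set that should not contain augmented copies.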
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This plant image dataset consists of 14,790 images categorized into 47 distinct plant species classes. The dataset was compiled by collecting images from Bing Images and manually curating them, although not by a professional biologist. I collected these images for a project aimed at classifying plant species as either toxic or safe for cats.
Key Features:
- Total Images: 14,790
- Number of Classes: 47
- Image Source: Collected from Bing Images
- Curation Method: Manual cleaning by non-expert
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/kakasher/house-plant-species.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 United States License.
# Data origins
The dataset was originally hosted at the PlantVillage Disease Classification Challenge. We use the modified version in this GitHub repository to do controlled experiments. We use only the raw color images dataset, and we deleted the unconventional characters in the class directory names and the `.csv` filenames.
# Directory explanation
The `80-20` directory has multiple `.txt` files that contain the filenames of the training (~80%), validation (~10%), and testing (~10%) instances and the corresponding label indexes. The validation dataset contains 5430 instances in every data separation. In our experiment code (not included in this archive), the validation and testing datasets are merged.
# Data usage
## Replicate our experiments
We used this dataset in writing our paper. The reference information can be seen at https://gitlab.com/huix/leaf-disease-plant-village.
### Steps
1. `cd` to the directory (e.g. `/home/usrname/plantvillage_deeplearning_paper_dataset`) that contains the `color` directory.
2. Run `python change_filename_prefix.py --prefix /home/usrname/plantvillage_deeplearning_paper_dataset` to modify the prefix path (which is `/home/h/plantvillage_deeplearning_paper_dataset` in our previously generated datasets).
3. Done. You can use our open source code repository to run the later experiments.
## Generate your own training/validation/testing datasets
The code that generates this data separation isn't included in the dataset archive; it is in our open source code. Please see our open source code repository for detailed information.
If you have any questions, you can contact the author through email. The email address is a QR code in the archive.
References:
- Hughes, D.P., Salathé, M.: An open access repository of images on plant health to enable the development of mobile disease diagnostics. ArXiv e-prints (2015). 1511.08060
- Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease detection. Frontiers in Plant Science 7, 1419 (2016). doi:10.3389/fpls.2016.01419
https://gitlab.com/huix/leaf-disease-plant-village
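The prefix rewrite in step 2 amounts to swapping the leading path on every entry of the split files. The sketch below only illustrates that idea and is not the `change_filename_prefix.py` script from the archive. It assumes each line of a `.txt` file starts with an absolute image path followed by its label index, and the function names and the example filename are hypothetical.

```python
# Prefix used when the split files were generated, as stated in the steps above.
OLD_PREFIX = "/home/h/plantvillage_deeplearning_paper_dataset"


def replace_prefix(line: str, new_prefix: str, old_prefix: str = OLD_PREFIX) -> str:
    """Swap the leading dataset path on one split-file line.

    Assumption: the line begins with the absolute path of the image.
    Lines that do not start with the old prefix are returned unchanged.
    """
    if line.startswith(old_prefix):
        return new_prefix.rstrip("/") + line[len(old_prefix):]
    return line


def rewrite_lines(lines, new_prefix: str):
    """Apply the prefix swap to every line of a split file."""
    return [replace_prefix(line, new_prefix) for line in lines]
```

For example, a line such as `/home/h/plantvillage_deeplearning_paper_dataset/color/example.JPG 3` would become `/home/usrname/plantvillage_deeplearning_paper_dataset/color/example.JPG 3`.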
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
PlantVillage dataset containing 38 classes of plants with around 54,303 images
The PlantVillage dataset is a public dataset of 54,305 images of diseased and healthy plant leaves collected under controlled conditions (PlantVillage Dataset). The images cover 14 species of crops, including: apple, blueberry, cherry, grape, orange, peach, pepper, potato, raspberry, soy, squash, strawberry and tomato. It contains images of 17 fungal diseases, 4 bacterial diseases, 2 diseases caused by mold (oomycete), 2 viral diseases and 1 disease caused by a mite. 12 crop species also have healthy leaf images that are not visibly affected by disease.
This dataset was obtained from spMohanty's GitHub repo.
The dataset was created for use in my Plants Disease Detection using TF
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was collected as part of a research project focused on detecting leaf diseases in pea plants using deep learning and computer vision techniques. It contains labeled images of healthy and diseased pea plant leaves collected under real-world conditions between May 2023 and August 2025 in Urmar Payan, near Peshawar, Khyber Pakhtunkhwa, Pakistan. The goal of this dataset is to support AI-based solutions in agriculture, including disease classification, yield improvement, and sustainable crop monitoring. The images are suitable for training and testing machine learning models, particularly convolutional neural networks (CNNs). This dataset was used in the author's final year undergraduate project in AI and plant health monitoring.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RoseLeafInsight is a meticulously curated high-resolution image dataset designed for the classification and recognition of various rose leaf conditions using machine learning and computer vision techniques. This dataset includes four categories of rose leaves: Healthy, Black Spot, Insect Hole, and Yellow Mosaic Virus, providing a diverse set of images for disease detection and automated plant health monitoring. Each category is well-represented, ensuring a balanced dataset suitable for developing deep learning models for classification, segmentation, and disease detection tasks.
Dataset Composition: The dataset consists of a total of 3,228 high-resolution images, distributed across the following categories:
1. Healthy Leaf: 1,686 images
2. Black Spot: 409 images
3. Insect Hole: 453 images
4. Yellow Mosaic Virus: 680 images
The extended dataset (by augmentation) consists of a total of 12,000 high-resolution images, distributed across the following categories:
1. Healthy Leaf: 3,000 images
2. Black Spot: 3,000 images
3. Insect Hole: 3,000 images
4. Yellow Mosaic Virus: 3,000 images
Geographical Location of Data Collection: The rose leaf images were collected from two distinct locations in Bangladesh, ensuring diversity in environmental conditions and plant health variations:
1. Zailla, Singair, Manikganj
- Latitude: 23°47'46.11"N
- Longitude: 90°13'15.73"E
2. Golap Gram, Sadullapur-Komolapur, Road Birulia Bridge, Dhaka 1216
- Latitude: 23°50'6.108"N
- Longitude: 90°18'31.5108"E
These locations are known for their extensive rose cultivation, making them ideal for collecting a dataset that captures real-world variations in rose leaf health and disease conditions.
Preprocessing Details: To enhance model performance and standardize input images, the following preprocessing steps were applied:
• Resizing: All images were resized to 3000 × 3000 pixels for uniformity.
• Background Removal: Unwanted backgrounds were eliminated to focus on leaf features.
• Brightness Enhancement: The brightness of each image was adjusted by a factor of 1.2 to improve visibility and contrast.
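As an illustration of the brightness step, the sketch below shows one common reading of "a factor of 1.2" on 8-bit images: each channel value is multiplied by the factor and clipped to 255. The tool the authors actually used is not stated, so this is an assumption, and the function names are hypothetical.

```python
def brighten_pixel(rgb, factor=1.2):
    """Scale each 8-bit channel by `factor`, clipping the result to 255."""
    return tuple(min(255, round(channel * factor)) for channel in rgb)


def brighten_image(pixels, factor=1.2):
    """Apply the same scaling to an image given as rows of (R, G, B) tuples."""
    return [[brighten_pixel(px, factor) for px in row] for row in pixels]
```

Under this reading, bright regions saturate at 255 while darker leaf tissue becomes proportionally lighter.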
Potential Applications: RoseLeafInsight is ideal for training and evaluating machine learning and deep learning models in various applications, including:
• Automated plant disease detection systems
• Smart agriculture and precision farming
• Image-based disease diagnosis for plant pathology research
• Transfer learning and fine-tuning deep learning models for plant health classification
This dataset provides a valuable resource for researchers, agronomists, and AI practitioners seeking to develop robust solutions for real-time rose leaf disease detection.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this research, we present a technique aimed at identifying the evolving sections of plants utilizing RGB-D data, with the aim of automating the detection of plant growth within an extraterrestrial experimental setting. As humanity entertains the prospect of inhabiting space in the future, the cultivation of plants in outer space becomes imperative for sustaining food supplies. However, the feasibility of growing plants in space akin to terrestrial methods remains uncertain, necessitating exploration through cultivation experiments conducted aboard international space stations and similar platforms. The observation of plant growth in space is constrained by human resources and available measurement space, further compounded by the exorbitant transportation costs, which escalate with weight. Consequently, there is a preference for lightweight equipment. Traditional automatic plant growth measurement techniques often rely on bulky equipment or require a significant amount of measurement space, rendering them impractical for space applications. In this investigation, we propose a methodology for identifying growing plant sections employing just one RGB-D camera. This approach enables the construction of a measurement system utilizing only a single camera and a laptop for image storage and connection, thereby ensuring lightweight portability. Moreover, the fixed positioning of the camera for plant capture minimizes spatial requirements and reduces the need for manpower. Our proposed technique entails leaf segmentation through depth data and the detection of growing sections via local feature matching. Experimental trials using a model plant corroborated the effectiveness of our method in leaf segmentation and growing part detection. Additionally, the experimental outcomes showcased the capability of the proposed approach in pinpointing the growing sections by refining the matching areas based on segmentation outcomes and appropriate observation intervals.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains images of widely grown crops in Bangladesh. This dataset contains images of leaf diseases and fresh leaves for 6 vegetables.
The vegetables are Bitter Gourd with 2223 images, Bottle Gourd with 1803 images, Tomatoes with 2449 images, Eggplants with a total of 2944 images, Cauliflowers with 1598 images, and Cucumbers with 1626 images.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of five subsets with annotated images in COCO format, designed for object detection and tracking plant growth:
1. Cucumber_Train Dataset (for Faster R-CNN)
- Includes training, validation, and test images of cucumbers from different angles.
- Annotations: Bounding boxes in COCO format for object detection tasks.
Annotations: Bounding boxes in COCO format.
Pepper Dataset
Contains images of pepper plants for 24 hours at hourly intervals from a fixed angle.
Annotations: Bounding boxes in COCO format.
Cannabis Dataset
Contains images of cannabis plants for 24 hours at hourly intervals from a fixed angle.
Annotations: Bounding boxes in COCO format.
Cucumber Dataset
Contains images of cucumber plants for 24 hours at hourly intervals from a fixed angle.
Annotations: Bounding boxes in COCO format.
This dataset supports training and evaluation of object detection models across diverse crops.
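Because every subset uses COCO-format bounding boxes, the annotations can be read with the standard library alone. The sketch below groups boxes by image file name and converts the COCO `[x, y, width, height]` box layout into corner coordinates. The function names are hypothetical, and the annotation file names in this dataset are not stated, so the path passed to `load_coco` is up to the reader.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO annotation file into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def boxes_per_image(coco):
    """Map each image file name to its boxes as (x_min, y_min, x_max, y_max)."""
    names = {image["id"]: image["file_name"] for image in coco["images"]}
    boxes = defaultdict(list)
    for annotation in coco["annotations"]:
        x, y, width, height = annotation["bbox"]  # COCO stores top-left corner plus size
        boxes[names[annotation["image_id"]]].append((x, y, x + width, y + height))
    return dict(boxes)
```

Corner coordinates are the form expected by many detection training loops, so this conversion is a typical first step before feeding the data to a model such as Faster R-CNN.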
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Brazil
Lophatherum gracile (233 images)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Invasive Plant Species Detection is a dataset for classification tasks - it contains Species annotations for 1,398 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).