Fruit and vegetable plants are vulnerable to diseases that can negatively affect crop yield, causing planters to incur significant losses. These diseases can affect the plants at various stages of growth. Planters must be on constant watch to prevent them early, or infestation can spread and become severe and irrecoverable. There are many types of pest infestations of fruits and vegetables, and identifying them manually for appropriate preventive measures is difficult and time-consuming.
This pretrained model can be deployed to identify plant diseases efficiently for carrying out suitable pest control. The training data for the model primarily includes images of leaves of diseased and healthy fruit and vegetable plants. It can classify the multiple categories of plant infestation or healthy plants from the images of the leaves.
Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Fine-tuning the model
This model can be fine-tuned using the Train Deep Learning Model tool. Follow the guide to fine-tune this model.
Input
8-bit, 3-band (RGB) image. Recommended image size is 224 x 224 pixels. Note: Input images should have a grey or solid color background with one full leaf per image.
Output
Classified image of the leaf with any of the plant disease, healthy leaf, or background classes as in the Plant Leaf Diseases dataset.
Applicable geographies
This model is expected to work well in all regions globally. However, results can vary for images that are statistically dissimilar to training data.
Model architecture
This model uses the ResNet50 model architecture implemented in ArcGIS API for Python.
Accuracy metrics
This model has an overall accuracy of 97.88 percent. The confusion matrix below summarizes the performance of the model on the validation dataset.
Sample results
Here are a few results from the model:
Ground truth: Apple_black_rot / Prediction: Apple_black_rot
Ground truth: Potato_early_blight / Prediction: Potato_early_blight
Ground truth: Raspberry_healthy / Prediction: Raspberry_healthy
Ground truth: Strawberry_leaf_scorch / Prediction: Strawberry_leaf_scorch
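An overall accuracy figure such as the one quoted for this model is normally the share of validation images on the diagonal of the confusion matrix. The following minimal sketch shows that calculation; the function name and the 3-class matrix are hypothetical and purely illustrative, not the model's actual validation counts.

```python
def overall_accuracy(confusion):
    """Return correct predictions (the diagonal) divided by all predictions."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class matrix (rows: ground truth, columns: prediction).
# These counts are illustrative only, not the model's validation results.
toy_matrix = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 1, 49],
]
toy_accuracy = overall_accuracy(toy_matrix)  # 144 correct out of 150
```

Overall accuracy can hide weak classes when class sizes differ, so per-class rows of the matrix are worth reading alongside the single number.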
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The PlantDoc dataset was originally published by researchers at the Indian Institute of Technology, and described in depth in their paper. One of the paper’s authors, Pratik Kayal, shared the object detection dataset available on GitHub.
PlantDoc is a dataset of 2,569 images across 13 plant species and 30 classes (diseased and healthy) for image classification and object detection. There are 8,851 labels. Read more about how the version available on Roboflow improves on the original version here.
And here's an example image:
Tomato Blight: https://i.imgur.com/fGlQ0kG.png
Fork this dataset (upper right-hand corner) to receive the raw images, or (to save space) grab the 416x416 export.
As the researchers from IIT stated in their paper, “plant diseases alone cost the global economy around US$220 billion annually.” Training models to recognize plant diseases earlier dramatically increases yield potential.
The dataset also serves as a useful open dataset for benchmarks. The researchers trained both object detection models like MobileNet and Faster-RCNN and image classification models like VGG16, InceptionV3, and InceptionResnet V2.
The dataset is useful for advancing general agricultural computer vision tasks, whether that be healthy crop classification, plant disease classification, or plant disease object detection.
This dataset follows the Creative Commons 4.0 license. You may use it commercially; the license provides no liability, trademark rights, patent rights, or warranty.
Provide the following citation for the original authors:
@misc{singh2019plantdoc,
title={PlantDoc: A Dataset for Visual Plant Disease Detection},
author={Davinder Singh and Naman Jain and Pranjali Jain and Pratik Kayal and Sudhakar Kumawat and Nipun Batra},
year={2019},
eprint={1911.10317},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.
Pl@ntNet-300K is an image dataset aimed at evaluating set-valued classification. It was built from the database of the Pl@ntNet citizen observatory and consists of 306,146 images covering 1,081 species. We highlight two particular features of the dataset, inherent to the way the images are acquired and to the intrinsic diversity of plant morphology: i) The dataset exhibits a strong class imbalance, meaning that a few species represent most of the images. ii) Many species are visually similar, making identification difficult even for the expert eye. These two characteristics make the present dataset a good candidate for the evaluation of set-valued classification methods and algorithms. Therefore, we recommend two set-valued evaluation metrics associated with the dataset (top-K and average-K) and we provide the results of a baseline approach based on a ResNet50 trained with a cross-entropy loss. The full description of the dataset can be found in (to be provided soon). The scientific publication (NEURIPS 2022) describing the dataset and providing baseline results can be found here: https://openreview.net/forum?id=eLYinD0TtIt Utilities to load the data and train models with PyTorch can be found here: https://github.com/plantnet/PlantNet-300K/
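To make the two recommended metrics concrete, here is a minimal pure-Python sketch under one common formulation: top-K returns exactly K classes per image, while average-K keeps every class whose score clears a single global threshold chosen so that prediction sets contain K classes on average. The function names and toy scores are hypothetical, and this is not the implementation from the PlantNet-300K repository.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


def average_k_accuracy(scores, labels, k):
    """Accuracy of thresholded prediction sets whose mean size is about k.

    One global threshold is chosen so that roughly k * n (sample, class)
    pairs are kept in total. Ties at the threshold can make the mean set
    size slightly larger than k.
    """
    n = len(labels)
    flat = sorted((s for row in scores for s in row), reverse=True)
    keep = min(len(flat), max(1, round(k * n)))
    threshold = flat[keep - 1]
    hits = sum(row[label] >= threshold for row, label in zip(scores, labels))
    return hits / n


# Hypothetical class scores for four images over three species.
toy_scores = [
    [0.90, 0.06, 0.04],
    [0.40, 0.35, 0.25],
    [0.36, 0.34, 0.30],
    [0.10, 0.85, 0.05],
]
toy_labels = [0, 1, 2, 1]

top1 = top_k_accuracy(toy_scores, toy_labels, 1)
top2 = top_k_accuracy(toy_scores, toy_labels, 2)
avg2 = average_k_accuracy(toy_scores, toy_labels, 2)
```

In this toy case average-K with K = 2 spends its budget on the two ambiguous images (three classes each) and returns a single class for the two confident ones, which is the behaviour that makes it attractive for visually similar species.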
Description: This dataset offers an extensive collection of images and corresponding labels representing a wide array of plant diseases. Carefully curated from publicly available sources, it serves as a valuable resource for developing and evaluating machine learning models, particularly in the realms of image classification and plant disease detection. Dataset Composition: • Images: The dataset comprises high-quality images organized by plant species and disease… See the full description on the dataset page: https://huggingface.co/datasets/gtsaidata/Plant-Disease-Image-Dataset.
This is the dataset that I used in my iOS and Android plant disease detection app, PlantifyDr. You can check out my full open-source project here: https://github.com/lavaman131/PlantifyDr
The dataset contains over 125,000 jpg images of 10 different plant types: Apple, Bell pepper, Cherry, Citrus, Corn, Grape, Peach, Potato, Strawberry, and Tomato. The total number of plant diseases is 37. Augmentations have already been applied to the data, but feel free to add your own augmentations if you like.
Special thanks to the following sources for the data: https://data.mendeley.com/datasets/tywbtsjrjv/1 https://www.kaggle.com/vipoooool/new-plant-diseases-dataset https://github.com/pratikkayal/PlantDoc-Dataset https://data.mendeley.com/datasets/3f83gxmv57/2
The Food and Agriculture Organization of the United Nations (FAO) estimates that between 20 and 40 percent of global crop production is lost annually. Each year, plant diseases cost the global economy around $220 billion. I hoped to use deep learning to address this problem and to better educate farmers and the public with the knowledge needed to treat their plants.
Automated leaf segmentation is a challenging area in computer vision. Recent advances in machine learning approaches have made it possible to achieve better results than traditional image processing techniques; however, training such systems often requires large annotated data sets. To contribute annotated data sets and help overcome this bottleneck in plant phenotyping research, here we provide a novel photometric stereo (PS) data set with annotated leaf masks. This data set forms part of the work done in the BBSRC Tools and Resources Development project BB/N02334X/1.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This plant image dataset consists of 14,790 images categorized into 47 distinct plant species classes. The dataset was compiled by collecting images from Bing Images and manually curating them, although not by a professional biologist. I collected these images for a project aimed at classifying plant species as either toxic or safe for cats. Key Features:
Total Images: 14,790
Number of Classes: 47
Image Source: Collected from Bing Images
Curation Method: Manual cleaning by non-expert
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/kakasher/house-plant-species.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset comprises 4,361 high-quality images with 4 different classes of turmeric leaves, each class giving an exact description of the unhealthy leaf and the healthy leaf.
Original Dataset
Aphids Disease: 221
Blotch: 238
Leaf Spot: 193
Healthy Leaf: 213
Total: 865
Augmented Dataset
Aphids Disease: 847
Blotch: 909
Leaf Spot: 919
Healthy Leaf: 821
Total: 3496
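As a quick consistency check, the per-class counts listed above can be summed to confirm the stated totals and the overall figure of 4,361 images. The sketch below uses only the numbers given in this description; the variable names are illustrative.

```python
# Per-class counts copied from the description above.
original = {"Aphids Disease": 221, "Blotch": 238, "Leaf Spot": 193, "Healthy Leaf": 213}
augmented = {"Aphids Disease": 847, "Blotch": 909, "Leaf Spot": 919, "Healthy Leaf": 821}

original_total = sum(original.values())
augmented_total = sum(augmented.values())
combined_total = original_total + augmented_total

# Share of each class in the combined (original + augmented) dataset.
class_share = {
    name: (original[name] + augmented[name]) / combined_total
    for name in original
}
```

The combined classes come out roughly balanced, each close to a quarter of the images.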
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this research, we present a technique aimed at identifying the evolving sections of plants utilizing RGB-D data, with the aim of automating the detection of plant growth within an extraterrestrial experimental setting. As humanity entertains the prospect of inhabiting space in the future, the cultivation of plants in outer space becomes imperative for sustaining food supplies. However, the feasibility of growing plants in space akin to terrestrial methods remains uncertain, necessitating exploration through cultivation experiments conducted aboard international space stations and similar platforms. The observation of plant growth in space is constrained by human resources and available measurement space, further compounded by the exorbitant transportation costs, which escalate with weight. Consequently, there is a preference for lightweight equipment. Traditional automatic plant growth measurement techniques often rely on bulky equipment or require a significant amount of measurement space, rendering them impractical for space applications. In this investigation, we propose a methodology for identifying growing plant sections employing just one RGB-D camera. This approach enables the construction of a measurement system utilizing only a single camera and a laptop for image storage and connection, thereby ensuring lightweight portability. Moreover, the fixed positioning of the camera for plant capture minimizes spatial requirements and reduces the need for manpower. Our proposed technique entails leaf segmentation through depth data and the detection of growing sections via local feature matching. Experimental trials using a model plant corroborated the effectiveness of our method in leaf segmentation and growing part detection. Additionally, the experimental outcomes showcased the capability of the proposed approach in pinpointing the growing sections by refining the matching areas based on segmentation outcomes and appropriate observation intervals.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is from this repository contributed by Pratik Kayal and Naman Jain. It's important to note that this dataset focuses on classification and does not include bounding boxes or other object recognition elements. File names have been reformatted.
The Cropped-PlantDoc dataset was used for benchmarking classification models in the paper titled "PlantDoc: A Dataset for Visual Plant Disease Detection" which was accepted in the Research Track at ACM India Joint International Conference on Data Science and Management of Data (CoDS-COMAD 2020).
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F9522896%2F8b0a4e5e91bb6e48ca447b0f18e964cd%2FPlantDoc_Examples.png?generation=1698555101222210&alt=media
India loses 35% of the annual crop yield due to plant diseases. Early detection of plant diseases remains difficult due to the lack of lab infrastructure and expertise. In this paper, we explore the possibility of computer vision approaches for scalable and early plant disease detection. The lack of availability of sufficiently large-scale non-lab data set remains a major challenge for enabling vision based plant disease detection. Against this background, we present PlantDoc: a dataset for visual plant disease detection. Our dataset contains 2,598 data points in total across 13 plant species and up to 17 classes of diseases, involving approximately 300 human hours of effort in annotating internet scraped images. To show the efficacy of our dataset, we learn 3 models for the task of plant disease classification. Our results show that modelling using our dataset can increase the classification accuracy by up to 31%. We believe that our dataset can help reduce the entry barrier of computer vision techniques in plant disease detection.
For the full paper, refer to arXiv and ACM.
Davinder Singh*, Naman Jain*, Pranjali Jain*, Pratik Kayal*, Sudhakar Kumawat and Nipun Batra
@inproceedings{10.1145/3371158.3371196,
author = {Singh, Davinder and Jain, Naman and Jain, Pranjali and Kayal, Pratik and Kumawat, Sudhakar and Batra, Nipun},
title = {PlantDoc: A Dataset for Visual Plant Disease Detection},
year = {2020},
isbn = {9781450377386},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3371158.3371196},
doi = {10.1145/3371158.3371196},
booktitle = {Proceedings of the 7th ACM IKDD CoDS and 25th COMAD},
pages = {249–253},
numpages = {5},
keywords = {Deep Learning, Object Detection, Image Classification},
location = {Hyderabad, India},
series = {CoDS COMAD 2020}
}
Creative Commons Attribution 4.0 International Link
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 United States License.
# Data origins
The dataset is originally hosted at PlantVillage Disease Classification Challenge.
We use the modified version from this GitHub repository to run controlled experiments.
We use only the raw color image dataset and remove unconventional characters from the class directory names and `.csv` filenames.
# Directory explanation
The `80-20` directory has multiple `.txt` files that list the instance filenames and corresponding label indexes for the training (~80%), validation (~10%), and testing (~10%) datasets. The validation set contains `5430` instances in every data split. In our experiment code (not included in this archive), the validation and testing datasets are merged.
# Data usage
## Replicate our experiments
We have used this dataset in writing our paper. The reference information can be seen at https://gitlab.com/huix/leaf-disease-plant-village.
### Steps
1. `cd` to the directory (e.g. `/home/usrname/plantvillage_deeplearning_paper_dataset`) that contains the `color` directory.
2. Run `python change_filename_prefix.py --prefix /home/usrname/plantvillage_deeplearning_paper_dataset` to modify the prefix path (which is `/home/h/plantvillage_deeplearning_paper_dataset` in our previously generated datasets).
3. Done. You can now use our open source code repository to run the subsequent experiments.
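For readers curious what the prefix change in step 2 amounts to, the sketch below shows the general idea of swapping a leading path in one line of a split file. It is not the archive's `change_filename_prefix.py`; the function name `replace_prefix` and the sample line layout (image path followed by a label index) are assumptions made for illustration.

```python
def replace_prefix(line, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix if the line starts with it."""
    if line.startswith(old_prefix):
        return new_prefix + line[len(old_prefix):]
    return line


OLD = "/home/h/plantvillage_deeplearning_paper_dataset"
NEW = "/home/usrname/plantvillage_deeplearning_paper_dataset"

# Hypothetical split-file line: an image path, then a label index.
sample = OLD + "/color/some_class/image_0001.JPG 3"
updated = replace_prefix(sample, OLD, NEW)
```

Lines that do not start with the old prefix are returned unchanged, so running the function twice is harmless.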
## Generate your own training/validation/testing datasets
The code that generates the data splits is not included in the dataset archive; it is part of our open source code. Please see our open source code repository for detailed information.
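Until you consult that repository, a split with the same approximate 80/10/10 proportions can be sketched as follows. This is a generic seeded shuffle, not the authors' generation code; the function name, the seed, and the file names are hypothetical.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items reproducibly and cut them into train/val/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical file names standing in for the images under `color`.
names = [f"img_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_80_10_10(names)
```

A plain shuffle does not preserve class proportions; a per-class (stratified) split may be preferable if some disease classes are small.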
If you have any questions, you can contact the author through email.
The email address is provided as a QR code in the archive.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Two datasets and one table are uploaded to this platform under the title "MED117_Medicinal Plant Leaf Dataset & Name Table". A folder titled "MED 117 Leaf Species" contains two subfolders: "Raw leaf image set of medicinal plants_v2" and "Segmented leaf set using UNET segmentation". The raw leaf image set consists of leaf images of 117 medicinal plants found in Assam. All samples were collected by visiting different (government, public, and private) medicinal gardens across Assam, as well as other general places where the plants are commonly found. Videos of 10 to 15 seconds were recorded for two to three leaves of every species on a white background using a Canon SLR camera. The individual videos were split into image frames, yielding around 77,700 JPG image frames. In the raw leaf image set, each folder is named with the scientific name followed by the common name in brackets. The second folder, "Segmented leaf set using UNET segmentation", contains segmented leaf image samples for 115 medicinal plant species, produced with the UNET segmentation technique. Two species were excluded from the original dataset because of the small, unpredictable size of their samples, leaving 115 subfolders in the segmented folder. Finally, a table in DOC format titled "Medicinal Plant Name Table" lists the scientific name, common name, and Assamese name of the plants, in the same sequence as the folders. The entire contribution is original: the samples were collected from different sources and processed for segmentation, and the table was prepared in consultation with taxonomy experts from the Botany Department of Gauhati University, Guwahati, Assam, India.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Brazil
PlantDoc is a dataset for visual plant disease detection. The dataset contains 2,598 data points in total across 13 plant species and up to 17 classes of diseases, involving approximately 300 human hours of effort in annotating internet scraped images.
Description:
This comprehensive dataset contains a total of 14,790 images, each carefully categorized into one of 47 plant species. Originally curated for a project focused on determining whether various plant species are toxic or safe for cats, the dataset serves as a valuable resource for plant classification models. While the images were sourced from Bing, the data has been manually curated by non-experts, ensuring a balanced and diverse range of species, although the dataset’s accuracy may benefit from further refinement by botanical experts.
Key Features:
Total Images: 14,790
Species Categories: 47 distinct plant types
Image Source: Bing Images
Curation: Non-expert manual cleaning
Dataset Breakdown:
Class sizes vary; popular species include Monstera Deliciosa (547 images), Dumb Cane (541 images), and Chinese Evergreen (514 images).
Species with fewer representations include Yucca (66 images), Kalanchoe (130 images), and Asparagus Fern (169 images).
Image Characteristics:
Quality & Resolution: Images in the dataset exhibit a range of resolutions, with some variance in quality due to the nature of the collection process. Despite these inconsistencies, the variety offers opportunities for models to learn from different image qualities.
Types of Images: The dataset includes both whole-plant images and detailed close-ups of plant parts, providing multi-dimensional perspectives of each species.
Environmental Diversity: Plants are photographed in various settings, both indoors and outdoors, to account for different visual contexts.
Organized Structure: Each species is neatly organized into its respective folder, allowing for easy access and classification.
This dataset is sourced from Kaggle.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Chili Plant is a dataset for instance segmentation tasks - it contains Plant annotations for 425 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
A collection of non-framed multi-spectral images of tomato plants infected with the Tuta Absoluta leafminer disease.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Images and data for the study "Plant image identification application demonstrates high accuracy in Northern Europe"
Details: Jaak Pärtel, Meelis Pärtel, Jana Wäldchen, Plant image identification application demonstrates high accuracy in Northern Europe, AoB PLANTS, Volume 13, Issue 4, August 2021, plab050, https://doi.org/10.1093/aobpla/plab050
The data table displays Flora Incognita's identification results together with species and observation characteristics. All 3,199 images used are included.
The study was conducted in two parts: database and field study.
Database study images have been taken from eBiodiversity database (https://elurikkus.ee/en) under Creative Commons Attribution 4.0 International (CC BY 4.0) licence (https://creativecommons.org/licenses/by/4.0/). Please cite the original source for the images as well when using the dataset.
Field study images were taken by Jaak Pärtel in 2020 in field conditions from different habitats across Estonia.
This dataset consists of high-resolution images of eggplant (Solanum melongena) leaves affected by various diseases and healthy specimens. It is designed for use in agricultural research, plant pathology, and machine learning applications. The dataset aims to support the development of disease detection and classification algorithms for sustainable agriculture and better crop management.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Together with the Agriculture University, we compiled a database of plant images and omics data. The dataset contains images of four distinct plant maladies (powdery mildew, rust, leaf spot, and blight) as well as gene expression and metabolite data. Using a high-resolution camera in a controlled environment at the facility of the Agriculture University of Peshawar, we captured 8,000 images of plants, with 2,000 images for each disease type. Each image was labeled with its corresponding disease type. The images were preprocessed by resizing them to 224x224 pixels and standardizing the pixel values. The dataset was divided into training, validation, and testing sets in a 70:15:15 ratio.
In addition to the images, we collected gene expression and metabolite data from the same plants. We extracted RNA from the plant leaves using a commercial reagent and sequenced it on an Illumina HiSeq 4000 platform, obtaining 100 million paired-end reads with an average length of 150 base pairs. The raw reads were trimmed with Trimmomatic and aligned with STAR against the reference genome. We counted the number of reads that mapped to each gene using featureCounts, and then identified differentially expressed genes between healthy and diseased plants using the DESeq2 package in R. For the metabolite data, we extracted metabolites from the plant leaves using a methanol-water extraction protocol and analyzed the extracts using gas chromatography-mass spectrometry (GC-MS). We obtained 500 metabolite features, including amino acids, organic acids, and sugars.
If you use the dataset mentioned here, please make sure to give credit to the researchers by citing their paper titled 'Deep Learning for Plant Bioinformatics: An Explainable Gradient-Based Approach for Disease Detection.'
Reference
Shoaib, M., Shah, B., Sayed, N., Ali, F., Ullah, R., & Hussain, I. (2023). Deep learning for plant bioinformatics: an explainable gradient-based approach for disease detection. Frontiers in Plant Science, 14(October), 1–17. https://doi.org/10.3389/fpls.2023.1283235
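The image preprocessing and split described for this dataset can be illustrated with a small sketch. The description does not say whether pixel values were standardized per image or across the whole dataset, so the per-list zero-mean, unit-variance scaling below is an assumption, and both function names are hypothetical. Only the 8,000-image total and the 70:15:15 ratio come from the description.

```python
from statistics import fmean, pstdev


def standardize(pixels):
    """Scale values to zero mean and unit variance (assumed per-image)."""
    mean = fmean(pixels)
    std = pstdev(pixels)
    if std == 0:
        return [0.0 for _ in pixels]  # constant image: nothing to scale
    return [(p - mean) / std for p in pixels]


def split_counts(total, ratios=(70, 15, 15)):
    """Integer split sizes for the given percentage ratios."""
    train = total * ratios[0] // 100
    val = total * ratios[1] // 100
    return train, val, total - train - val


toy_pixels = [0, 64, 128, 192, 255]  # hypothetical 8-bit intensities
z = standardize(toy_pixels)
counts = split_counts(8000)  # (train, validation, test) image counts
```

With 8,000 images, a 70:15:15 split corresponds to 5,600 training, 1,200 validation, and 1,200 testing images.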