Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wheat is the basis of the diet of a large part of humanity. Therefore, this cereal is widely studied by scientists to ensure food security. A tedious, yet important part of this research is the measurement of different characteristics of the plants, also known as Plant Phenotyping.
Monitoring plant architectural characteristics allows breeders to grow better varieties and farmers to make better decisions, but this critical step is still done manually. The emergence of UAVs, cameras and smartphones makes in-field RGB images more available and could be a solution to manual measurement. For instance, wheat heads can be counted with Deep Learning. However, this task can be visually challenging. Dense wheat plants often overlap, and the wind can blur the photographs, making it difficult to identify single heads. Additionally, appearances vary due to maturity, color, genotype, and head orientation. Finally, because wheat is grown worldwide, different varieties, planting densities, patterns, and field conditions must be considered.
To end manual counting, a robust algorithm must be created to address all these issues. The task is to localize the wheat heads contained in each image. The goal is to obtain a model that is robust to variation in shape, illumination, sensor and location.
~ Excerpts from the dataset source webpage
This dataset contains 6515 png wheat images. There are more than 300k wheat heads and associated bounding boxes.
The images are from 12 countries: Switzerland, UK, Belgium, Norway, France, Canada, US, Mexico, Japan, China, Australia and Sudan
This dataset is an expanded version of the GWHD_2020 dataset that was used in the Kaggle Global Wheat Detection competition:
- GWHD_2021 is bigger, less noisy and more diverse
- There are new countries, additional images and additional wheat heads
- The sub-datasets have been further broken down by wheat development stage
- Poor quality images have been removed
Example image with bounding boxes: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1086574%2F5ac61982a61672c6f90350128cb63d4b%2Fimage_w_bboxes.png?generation=1673247086554846&alt=media
The BoxesString column contains the bounding boxes. Each row holds all bounding boxes that appear on one image, stored as a single string. Bounding boxes are separated by semicolons, and the four coordinates within a box are separated by spaces, e.g.
'99 692 160 764;641 27 697 115;935 978 1012 1020'
The format is: [x_min,y_min, x_max,y_max]
If there is no bounding box, BoxesString is set to "no_box".
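The format described above can be parsed with a few lines of plain Python. The following is a minimal sketch, not the code from the linked notebook: the function name `parse_boxes_string` is hypothetical, and it assumes the coordinates are integers, as in the example string.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def parse_boxes_string(boxes_string: str) -> List[Box]:
    """Turn one BoxesString entry into a list of corner-format boxes."""
    text = boxes_string.strip()
    # "no_box" marks an image without any wheat head.
    if text == "no_box" or not text:
        return []
    boxes = []
    for chunk in text.split(";"):
        # Assumption: coordinates are integers separated by whitespace.
        x_min, y_min, x_max, y_max = (int(v) for v in chunk.split())
        boxes.append((x_min, y_min, x_max, y_max))
    return boxes


example = "99 692 160 764;641 27 697 115;935 978 1012 1020"
print(parse_boxes_string(example))
print(parse_boxes_string("no_box"))
```

Returning an empty list for "no_box" keeps downstream code free of special cases: an image without wheat heads is simply an image with zero boxes.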
This notebook shows how to parse the data: https://www.kaggle.com/code/vbookshelf/gwhd-how-to-parse-the-data
The original dataset can also be downloaded from here: https://zenodo.org/record/5092309#.Y7ksF-xBzUL
Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods https://arxiv.org/abs/2105.07660
@article{david2020global,
  title={Global Wheat Head Detection (GWHD) dataset: a large and diverse dataset of high-resolution RGB-labelled images to develop and benchmark wheat head detection methods},
  author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul A and others},
  journal={Plant Phenomics},
  volume={2020},
  year={2020},
  publisher={Science Partner Journal}
}
2020 Kaggle competition https://www.kaggle.com/competitions/global-wheat-detection/overview
Tutorials and more info https://www.aicrowd.com/challenges/global-wheat-challenge-2021
Header image by 652234 on Pixabay
https://pixabay.com/photos/nature-spike-grain-field-plant-3450440/
https://creativecommons.org/publicdomain/zero/1.0/
The dataset comprises wheat kernels belonging to three different varieties of wheat: Kama, Rosa and Canadian, 70 elements each. It can be used for classification and cluster analysis tasks.
To construct the data, seven geometric parameters of wheat kernels were measured. All of these parameters are real-valued and continuous.
This dataset was created by Sandeep Bansode
Wheat is the basis of the diet of a large part of humanity. Therefore, this cereal is widely studied by scientists to ensure food security. A tedious, yet important part of this research is the measurement of different characteristics of the plants, also known as Plant Phenotyping. Monitoring plant architectural characteristics allows breeders to grow better varieties and farmers to make better decisions, but this critical step is still done manually. The emergence of UAVs, cameras and smartphones makes in-field RGB images more available and could be a solution to manual measurement. For instance, wheat heads can be counted with Deep Learning. However, this task can be visually challenging. Dense wheat plants often overlap, and the wind can blur the photographs, making it difficult to identify single heads. Additionally, appearances vary due to maturity, colour, genotype, and head orientation. Finally, because wheat is grown worldwide, different varieties, planting densities, patterns, and field conditions must be considered. To end manual counting, a robust algorithm must be created to address all these issues.
Current detection methods involve one-stage and two-stage detectors (Yolo-V3 and Faster-RCNN), but even when they are trained with a large dataset, a bias toward the training region remains. The goal of the competition is to understand such bias and build a robust solution. This is to be done using train and test datasets that cover different regions, such as the Global Wheat Dataset. If successful, researchers can accurately estimate the density and size of wheat heads in different varieties. With improved detection, farmers can better assess their crops, ultimately bringing cereal, toast, and other favorite dishes to your table.
The dataset is composed of more than 6000 images of 1024x1024 pixels containing 300k+ unique wheat heads, with the corresponding bounding boxes. The images come from 11 countries and cover 44 unique measurement sessions. A measurement session is a set of images acquired at the same location, during a coherent timestamp (usually a few hours), with a specific sensor. In comparison to the 2020 competition on Kaggle, it represents 4 new countries, 22 new measurement sessions, 1200 new images and 120k new wheat heads. This number of new situations will help to reinforce the quality of the test dataset. The 2020 dataset was labelled by researchers and students from 9 institutions across 7 countries. The additional data have been labelled by Human in the Loop, an ethical AI labelling company. We hope these changes will help in finding the most robust algorithms possible!
The task is to localize the wheat heads contained in each image. The goal is to obtain a model that is robust to variation in shape, illumination, sensor and location. A set of box coordinates is provided for each image.
The training dataset consists of the images acquired in Europe and Canada, approximately 4000 images, and the test dataset consists of the images from North America (except Canada), Asia, Oceania and Africa, approximately 2000 images. This represents 7 new measurement sessions available for training but 17 new measurement sessions for the test!
The metric used to evaluate the task is the Average Domain Accuracy.
Accuracy for one image
Accuracy is calculated for each image as Accuracy = TP / (TP + FP + FN)
where:
TP, a True Positive, is a ground truth box matched with one predicted box
FP, a False Positive, is a predicted box that matches no ground truth box
FN, a False Negative, is a ground truth box that matches no predicted box.
Two boxes are matched if their Intersection over Union (IoU) is higher than a threshold of 0.5.
The accuracy of all images from one domain is averaged to give the domain accuracy.
The final score, called Average Domain Accuracy, is the average of all domain accuracies.
If the ground truth contains no bounding box, accuracy is equal to 0 when at least one box is predicted and equal to 1 when no box is predicted.
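The scoring rules above can be sketched in plain Python. This is an illustrative sketch, not the official evaluation code: it assumes the per-image accuracy is TP / (TP + FP + FN), and the greedy one-to-one matching (highest IoU pairs first) is an assumption, since the text does not say how competing matches are resolved. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def image_accuracy(truth: Sequence[Box], pred: Sequence[Box], thr: float = 0.5) -> float:
    """Per-image accuracy, assumed to be TP / (TP + FP + FN)."""
    if not truth:
        # Empty ground truth: 1 if nothing was predicted, else 0.
        return 0.0 if pred else 1.0
    # Assumption: greedy one-to-one matching, best IoU pairs first.
    pairs = sorted(
        ((iou(t, p), ti, pi) for ti, t in enumerate(truth) for pi, p in enumerate(pred)),
        reverse=True,
    )
    used_truth, used_pred = set(), set()
    for score, ti, pi in pairs:
        if score <= thr:  # a match needs IoU strictly above the threshold
            break
        if ti not in used_truth and pi not in used_pred:
            used_truth.add(ti)
            used_pred.add(pi)
    tp = len(used_truth)
    fp = len(pred) - tp
    fn = len(truth) - tp
    return tp / (tp + fp + fn)


def average_domain_accuracy(records: Iterable[Tuple[str, Sequence[Box], Sequence[Box]]]) -> float:
    """records holds (domain, ground_truth_boxes, predicted_boxes) per image."""
    per_domain = defaultdict(list)
    for domain, truth, pred in records:
        per_domain[domain].append(image_accuracy(truth, pred))
    domain_scores = [sum(v) / len(v) for v in per_domain.values()]
    return sum(domain_scores) / len(domain_scores)


# Made-up toy data: two domains, three images.
records = [
    ("A", [(0, 0, 10, 10)], [(0, 0, 10, 10)]),                    # perfect match
    ("A", [], [(0, 0, 5, 5)]),                                    # spurious box
    ("B", [(0, 0, 10, 10), (20, 20, 30, 30)], [(0, 0, 10, 10)]),  # one head missed
]
score = average_domain_accuracy(records)
print(score)
```

Averaging per domain first means a domain with few images weighs as much as a domain with many, which is what rewards robustness across regions rather than performance on the largest sessions.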
train.zip - This zip contains the training dataset with a csv file containing the bounding boxes of the train images.
test.zip - This zip will be used for the actual evaluation for the leaderboard; it contains the images for which bounding boxes need to be predicted.
image_name, BoxesString and domain
image_name is the name of the image, without the suffix. All images have a .png extension
BoxesString is a string containing all predicted boxes with the format `[x_min,y_min, x_max,y_m...
Synthetic Images of Wheat with BBox using Style Transfer and Pix2Pix. This data was generated during my participation in the Global Wheat Detection competition.
1. Style Transfer Images: This was created using 25 different styles.
CSV: style_transfer_images.csv
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1231059%2F3e9aee54932b22f9e9fa88ae36c812c4%2Fimage1.jpg?generation=1595935399995664&alt=media
2. Pix2Pix:
i. Single Generation:
CSV: pix2pix_2_synthetic.csv
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1231059%2Ffe8ea4ed6e11882aaa7d607242dd45a4%2Fc1.jpg?generation=1595936643201876&alt=media
ii. Mosaic Generation:
CSV: pix2pix_1_synthetic.csv
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1231059%2F00badf1d8e729bd00e6dc71e14606c5c%2Fc2.jpg?generation=1595936725841363&alt=media
3. Corrected box CSV of the Global Wheat Detection train.csv:
CSV: corrected_train.csv
images: also contains the original images from the Global Wheat Detection data
Images used to train Style Transfer and Pix2Pix: Global Wheat Detection
https://creativecommons.org/publicdomain/zero/1.0/
The list contains details on the historical annual prices from 1960-2022.
File name: wheat_prices.csv
Photo by Melissa Askew on Unsplash
This dataset provides data on crop yields, harvested areas, and production quantities for wheat, maize, rice, and soybeans. Crop yields are the harvested production per unit of harvested area for crop products. In most cases yield data are not recorded but are obtained by dividing the production data by the data on the area harvested. The actual yield that is captured on a farm depends on several factors such as the crop's genetic potential, the amount of sunlight, water, and nutrients absorbed by the crop, and the presence of weeds and pests. This indicator is presented for wheat, maize, rice, and soybean. Crop yield is measured in tonnes per hectare.
This dataset includes information on crop production from 2010-2016
https://www.kaggle.com/usda/crop-production
Crop production is an important economic activity that affects commodity prices and macroeconomic uncertainty. This dataset provides data on crop yields, harvested areas, and production quantities for wheat, maize, rice, and soybeans. The data are presented in tonnes per hectare, in thousand hectares, and in thousand tonnes.
This dataset can be used to examine the effect of different crops on climate change and to compare yields between different climates
License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.
File: crop_production.csv
| Column name | Description |
|:---|:---|
| LOCATION | The country or region where the crop is grown. (String) |
| INDICATOR | The indicator used to measure the crop production. (String) |
| SUBJECT | The subject of the indicator. (String) |
| MEASURE | The measure of the indicator. (String) |
| FREQUENCY | The frequency of the data. (String) |
| TIME | The time period of the data. (String) |
| Value | The value of the indicator. (Float) |
| Flag Codes | The flag codes of the data. (String) |
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is the Wheat-Ears-Detection-Dataset converted to the format of the dataset provided by the Global Wheat Detection competition (https://www.kaggle.com/c/global-wheat-detection). It can easily be used in any kernel for Global Wheat Detection as an external data source.
https://github.com/simonMadec/Wheat-Ears-Detection-Dataset Madec, S., Jin, X., Lu, H., De Solan, B., Liu, S., Duyme, F., et al. (2019). Ear density estimation from high resolution RGB imagery using deep learning technique. Agric. For. Meteorol. 264, 225–234. doi:10.1016/j.agrformet.2018.10.013.
The original dataset was posted by the authors above. It has 236 images of 6000px*4000px. We cut each image into 6 crops of 3000px*3000px with strides of 1500px for width and 1000px for height, then resize them to 1024px*1024px. This dataset is compatible with the dataset provided by the Global Wheat Detection competition (https://www.kaggle.com/c/global-wheat-detection). The sizes of the bounding boxes are similar as well due to the resize operation.
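The cropping arithmetic described above can be checked with a short sketch. The sizes and strides come from the text; the function names are hypothetical, and the way boxes at crop edges are treated (clipped to the crop, dropped when fully outside) is an assumption, because the description does not say how the author handled them.

```python
from typing import Iterator, Optional, Tuple

IMG_W, IMG_H = 6000, 4000        # original image size
TILE = 3000                      # side of each square crop
STRIDE_X, STRIDE_Y = 1500, 1000  # horizontal and vertical strides
OUT = 1024                       # side of the crop after resizing


def tile_origins() -> Iterator[Tuple[int, int]]:
    """Yield the top-left corner (x, y) of every crop window."""
    for y in range(0, IMG_H - TILE + 1, STRIDE_Y):
        for x in range(0, IMG_W - TILE + 1, STRIDE_X):
            yield x, y


def box_to_tile(box, origin) -> Optional[Tuple[float, float, float, float]]:
    """Map an [xmin, ymin, width, height] box into resized-crop coordinates.

    Assumption: the box is clipped to the crop window and dropped (None)
    when it lies entirely outside of it.
    """
    xmin, ymin, width, height = box
    ox, oy = origin
    x0, y0 = max(xmin, ox), max(ymin, oy)
    x1, y1 = min(xmin + width, ox + TILE), min(ymin + height, oy + TILE)
    if x1 <= x0 or y1 <= y0:
        return None
    # Shift into the crop, then scale from 3000px to 1024px.
    return (
        (x0 - ox) * OUT / TILE,
        (y0 - oy) * OUT / TILE,
        (x1 - x0) * OUT / TILE,
        (y1 - y0) * OUT / TILE,
    )


origins = list(tile_origins())
print(len(origins), origins)
print(box_to_tile((3000, 1000, 300, 150), (1500, 1000)))
```

Three horizontal positions (0, 1500, 3000) times two vertical positions (0, 1000) give the six crops mentioned in the text, and the 1024/3000 scale factor is what brings the box sizes in line with the competition data.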
train.csv - the training data
train - (folder) training images
image_id - the unique image ID
width, height - the width and height of the images
bbox - a bounding box, formatted as a Python-style list of [xmin, ymin, width, height]
etc.
See https://www.kaggle.com/c/global-wheat-detection/data for more information.
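Because the bbox field is a Python-style list stored as text, it can be read with the standard library. The sketch below is illustrative only: the function name and the sample value are hypothetical, and it converts the [xmin, ymin, width, height] layout into corner coordinates, which many detection libraries expect.

```python
import ast
from typing import Tuple


def bbox_to_corners(bbox_field: str) -> Tuple[float, float, float, float]:
    """Parse "[xmin, ymin, width, height]" and return (xmin, ymin, xmax, ymax)."""
    # literal_eval only accepts Python literals, so it is safe on untrusted text.
    xmin, ymin, width, height = ast.literal_eval(bbox_field)
    return (xmin, ymin, xmin + width, ymin + height)


# Hypothetical sample value in the documented format.
print(bbox_to_corners("[834.0, 222.0, 56.0, 36.0]"))
```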
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
A crop recommendation system helps farmers select the best crops to grow based on the specific properties of their soil. This system uses soil characteristics and environmental factors to determine the crops that are most likely to thrive. Recommendations are provided to improve crop yield, optimize resource use, and ensure sustainable farming practices.
The system should consider the following soil parameters and external factors to make accurate recommendations:
Soil Nutrients:
Soil pH:
Organic Matter:
Moisture Level:
Temperature:
Rainfall:
Geographical Factors:
Dynamic Soil Profiles:
Crop Rotation Insights:
Fertilizer Suggestions:
Weather and Climate Integration:
Regional Crop Suitability:
Based on soil and environmental data:
- Soil Parameters:
  - pH: 6.8 (neutral)
  - Nitrogen: Medium
  - Phosphorus: Low
  - Potassium: High
  - Moisture: Moderate
- Recommendations:
  - Primary Crops: Wheat, Maize, Barley.
  - Secondary Crops (Improving Soil Health): Lentils, Chickpeas (for nitrogen fixation).
  - Fertilizer Recommendation: Use phosphorus-rich fertilizers (e.g., DAP).
https://creativecommons.org/publicdomain/zero/1.0/
The ‘Images’ folder contains 14,253 images, each with a unique ID. Train.csv contains Image_IDs and the associated label. The labels indicate the growth stage of the crop in the image. Growth stages indicate the maturity of the wheat plants and are represented as a number from 1 to 7 (mature crop). The sample submission file contains the Image_IDs of the test set - you must predict the growth stage for the crops in each of these images.
It is important to note that some of the labels have been determined by experts, and may be more reliable than the other labels which have been indicated by the farmers themselves. All the test images have reliable labels. The ‘label_quality’ column in Train.csv indicates whether a label is high quality (2) or potentially less reliable (1).
Background to the challenge
The images were collected as part of field trials focusing on the Rabi (winter) growing season in two states of India: Punjab (with data collection in Fatehgarh, Ludhiana and Patiala districts) and Haryana (Fatehabad, Sirsa and Yamunanagar districts). Most villages in the field trials were located in a hot arid steppe climate. Punjab and Haryana fields are typically double-cropped with rice (or cotton) planted during the Kharif monsoon (June - October), and wheat planted in the Rabi season (October - March). Smallholder agriculture in this area is largely mechanized and is heavily reliant on irrigation
Over two growing seasons a total of 1685 farmers agreed to participate in the PBI studies. For these farmers, the study team listed all plots on which the farmer was planning to grow wheat, and randomly selected one field per farmer to be included in the study. Farmers were asked to take repeat pictures throughout the season, always from approximately the same location as an initial northward oriented picture, and with approximately the same view angle.
Image acquisitions were facilitated using a custom Android application (WheatCam). The farmer set up an observation site by taking an initial geo-referenced image of a field. Subsequent images were referenced relative to the initial “ghosted” image (a mildly transparent image of the initial picture). The application allowed the farmer to frame nearly identical repeat pictures relative to landscape features (or one or two installed reference poles in the first year). A fixed white balance between images was used to minimize in-camera adjustment of illumination and RGB ratios. All pictures were uploaded to a server for further processing.
Before further processing we manually screened all images to ensure that no people were present in the image scenes, to guarantee their privacy. In addition, we removed images which were mistakenly taken indoors, or other accidental acquisitions. We further screened for images which were excessively blurred or discoloured, covered by a finger, contained little vegetation, or were taken during crop cutting or application development. We anonymized the dataset by masking most non-vegetation details which might provide clues to the exact position of a farmer's field, while selecting the vegetation of interest for processing (see below).
The Region-of-Interest (ROI) was delineated automatically on an image-by-image basis using a horizon detection algorithm. The algorithm first resizes the image to 640 pixels along the x-axis, scaling the y-axis proportionally. It then finds change points in the blue channel along the vertical axis of the image using the Pruned Exact Linear Time (PELT) method, approximating the location of the horizon. We then define a trapezoid ROI whose two top corner points are given by the median horizon locations for the left and right half of the image, padded by 15% of the image height and 10% of the image width along the y- and x-axis directions respectively. Similarly, the two bottom corner points were defined by padding the bottom and sides of the image by 10% of the image width and height.
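The geometry of the trapezoid can be sketched as follows. This is one possible reading of the description, not the authors' code: horizon detection itself (the PELT step) is not shown, the per-column horizon positions are taken as given, and the padding is assumed to move the corners inward, that is, down from the horizon, up from the bottom edge and in from both sides, with y growing downward. The function name `trapezoid_roi` is hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def trapezoid_roi(horizon_y: Sequence[float], width: int, height: int) -> List[Point]:
    """Return the ROI corners: top-left, top-right, bottom-right, bottom-left.

    horizon_y holds one estimated horizon row per image column.
    """
    half = width // 2
    left_y = median(horizon_y[:half])    # median horizon, left half
    right_y = median(horizon_y[half:])   # median horizon, right half
    pad_x = 0.10 * width                 # side padding (10% of width)
    top_pad = 0.15 * height              # padding below the horizon (15% of height)
    bottom_y = height - 0.10 * height    # bottom padding (assumed 10% of height)
    return [
        (pad_x, left_y + top_pad),
        (width - pad_x, right_y + top_pad),
        (width - pad_x, bottom_y),
        (pad_x, bottom_y),
    ]


# Made-up horizon: row 100 on the left half, row 120 on the right half.
roi = trapezoid_roi([100] * 320 + [120] * 320, width=640, height=480)
print(roi)
```

Using separate medians for the two halves lets the top edge of the ROI follow a tilted horizon, which is why the result is a trapezoid rather than a rectangle.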
We use this ROI to exclude most other features from the original image which do not pertain to the area evaluated. Areas of no interest are set to black and the image is saved to disk. In addition, we manually screened all processed images and made manual corrections to guarantee the privacy of volunteer farmers where necessary.
Growth Stage definitions:
| Growth Phase | Common Name |
|---|---|
| 1 | Crown Root |
| 2 | Tillering |
| 3 | Mid Vegetative Phase |
| 4 | Booting |
| 5 | Heading |
| 6 | Anthesis |
| 7 | Milking |
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is extracted from https://en.wikipedia.org/wiki/International_wheat_production_statistics.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Climate change has a profound impact on global agriculture, affecting crop yields, soil health, and farming sustainability. This synthetic dataset is designed to simulate real-world agricultural data, enabling researchers, data scientists, and policymakers to explore how climate variations influence food production across different regions.
🔍 Key Features:
✔️ Climate Variables – Simulated data on temperature changes, precipitation levels, and extreme weather events
✔️ Crop Productivity – Modeled impact of climate shifts on yields of key crops like wheat, rice, and corn
✔️ Regional Insights – Includes various geographic regions to analyze diverse climate-agriculture interactions
✔️ Ideal for Predictive Modeling – Supports climate risk assessment, food security studies, and sustainability research
📊 Dataset Overview: This dataset has been synthetically generated and does not contain real-world agricultural records. It is intended for academic learning, climate impact analysis, and machine learning applications in environmental studies.
📖 Columns Description:
Region – Simulated geographic region
Year – Modeled year of data collection
Average_Temperature – Simulated temperature levels (°C)
Precipitation – Modeled annual rainfall (mm)
Crop_Yield – Synthetic yield data for selected crops (tons/hectare)
Extreme_Weather_Events – Number of modeled extreme weather occurrences per year
⚠️ Disclaimer: This dataset is completely synthetic and should not be used for real-world climate policy decisions or agricultural forecasting. It is meant for educational purposes, research, and data science applications.
🔹 Use this dataset to analyze climate trends, build predictive models, and explore solutions for sustainable agriculture! 🌱📊
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset provides a comprehensive and up-to-date collection of futures related to corn, oat, and other grains. Futures are financial contracts obligating the buyer to purchase and the seller to sell a specified amount of a particular grain at a predetermined price on a future date.
Use Cases:
1. Crop Yield Predictions: Use machine learning models to correlate grain futures prices with historical data, predicting potential harvest yields.
2. Impact Analysis of Weather Events: Implement deep learning techniques to understand the relationship between grain price movements and significant weather patterns.
3. Grain Price Forecasting: Develop time-series forecasting models to predict future grain prices, assisting traders and stakeholders in decision-making.
Dataset Image Source: Photo by Pixabay: https://www.pexels.com/photo/agriculture-arable-barley-bread-265242/
Column Descriptions:
1. Date: The date when the data was recorded. Format: YYYY-MM-DD.
2. Open: Market's opening price for the day.
3. High: Maximum price reached during the trading session.
4. Low: Minimum traded price during the day.
5. Close: Market's closing price.
6. Volume: Number of contracts traded during the session.
7. Ticker: Unique market quotation symbol for the grain future.
8. Commodity: Specifies the type of grain the future contract represents (e.g., corn, oat).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is extracted from https://en.wikipedia.org/wiki/List_of_countries_by_wheat_exports.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides a comprehensive overview of agricultural production trends in the fictional region of Oceania, with a primary focus on wheat production in **Samoa, Papua New Guinea, Fiji, New Zealand, Australia** from 1961 onward. Each row represents a yearly record with production figures, estimation flags, units, and year-over-year (YoY) changes.
Value: Production quantity
Unit: Measurement unit (e.g., tonnes)
Flag: Whether the value is official or estimated
YoY Change: Year-over-year percentage change
Metric & Domain: Contextual categorization of the data

| Column | Description |
|---|---|
| Year | The calendar year of record |
| Value | Quantity of wheat produced |
| Unit | Unit of measurement (e.g., tonnes) |
| Flag | Indicates if the value is Estimated or Official |
| Country | Country or region represented (Samoa in this case) |
| Item | The crop or product being measured (Wheat) |
| Domain | Sector classification (e.g., Production) |
| Metric | Metric type (e.g., Production quantity) |
| YoY Change | Year-over-year change in production (%) |
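The YoY Change column can be reproduced from the Value column. The sketch below assumes the usual definition, (current - previous) / previous expressed in percent, with rows already sorted by year for a single country; the dataset does not state its exact formula, and the function name and sample figures are made up.

```python
from typing import List, Optional


def yoy_change(values: List[float]) -> List[Optional[float]]:
    """Year-over-year change in percent; None where it is undefined."""
    if not values:
        return []
    changes: List[Optional[float]] = [None]  # the first year has no predecessor
    for prev, curr in zip(values, values[1:]):
        # A zero previous value would divide by zero, so mark it undefined.
        changes.append(None if prev == 0 else (curr - prev) * 100.0 / prev)
    return changes


# Made-up production figures for three consecutive years.
print(yoy_change([200.0, 250.0, 225.0]))  # [None, 25.0, -10.0]
```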
https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
This dataset contains the total annual production of different types of food in each country from 1961 to 2023.
The foods are:
1. Maize
2. Rice
3. Yams
4. Wheat
5. Tomatoes
6. Tea
7. Sweet potatoes
8. Sunflower seed
9. Sugar cane
10. Soybeans
11. Rye
12. Potatoes
13. Oranges
14. Peas dry
15. Palm oil
16. Grapes
17. Coffee green
18. Cocoa beans
19. Meat chicken
20. Bananas
21. Avocados
22. Apples
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains detailed information on the prices of various Indian crops, including rice, wheat, corn, lentils, and more. The data spans a specific period and provides a comprehensive view of market trends, supply, and demand for each crop. The dataset is structured for ease of use and includes features such as crop prices, dates, locations, and other relevant metrics.
https://www.usa.gov/government-works/
The Global Hyperspectral Imaging Spectral-library of Agricultural crops (GHISA) is a comprehensive compilation, collation, harmonization, and standardization of hyperspectral signatures of agricultural crops of the world. This hyperspectral library of agricultural crops is developed for all major world crops and was collected by United States Geological Survey (USGS) and partnering volunteer agencies from around the world. Crops include wheat, rice, barley, corn, soybeans, cotton, sugarcane, potatoes, chickpeas, lentils, and pigeon peas, which together occupy about 65% of all global cropland areas. The GHISA spectral libraries were collected and collated using spaceborne, airborne (e.g., aircraft and drones), and ground based hyperspectral imaging spectroscopy.
The GHISA for the Conterminous United States (GHISACONUS) Version 1 product provides dominant crop data in different growth stages for various agroecological zones (AEZs) of the United States. The GHISA hyperspectral library of the five major agricultural crops (e.g., winter wheat, rice, corn, soybeans, and cotton) for CONUS was developed using Earth Observing-1 (EO-1) Hyperion hyperspectral data acquired from 2008 through 2015 from different AEZs of CONUS using the United States Department of Agriculture (USDA) Cropland Data Layer (CDL) as reference data.
GHISACONUS is comprised of seven AEZs throughout the United States covering the major agricultural crops in six different growth stages: emergence/very early vegetative (Emerge VEarly), early and mid vegetative (Early Mid), late vegetative (Late), critical, maturing/senescence (Mature Senesc), and harvest. The crop growth stage data were derived using crop calendars generated by the Center for Sustainability and the Global Environment (SAGE), University of Wisconsin-Madison.
Provided in the CSV file is the spectral library including image information, geographic coordinates, corresponding agroecological zone, crop type labels, and crop growth stage labels for the United States.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset simulates real-world smart farming operations powered by IoT sensors and satellite data. It captures environmental and operational variables that affect crop yield across 500 farms located in regions like India, the USA, and Africa.
Designed to reflect modern agritech systems, the data is ideal for:
- Predictive modeling using ML/AI
- Time-series analysis
- Sensor-based optimization
- Environmental data visualizations
- Crop health analytics
| Column Name | Description |
|---|---|
| farm_id | Unique ID for each smart farm (e.g., FARM0001) |
| region | Geographic region (e.g., North India, South USA) |
| crop_type | Crop grown: Wheat, Rice, Maize, Cotton, Soybean |
| soil_moisture_% | Soil moisture content in percentage |
| soil_pH | Soil pH level (5.5–7.5 typical range) |
| temperature_C | Average temperature during crop cycle (in °C) |
| rainfall_mm | Total rainfall received in mm |
| humidity_% | Average humidity level in percentage |
| sunlight_hours | Average sunlight hours received per day |
| irrigation_type | Type of irrigation: Drip, Sprinkler, Manual, None |
| fertilizer_type | Fertilizer used: Organic, Inorganic, Mixed |
| pesticide_usage_ml | Daily pesticide usage in milliliters |
| sowing_date | Date when crop was sown |
| harvest_date | Date when crop was harvested |
| total_days | Crop growth duration (harvest - sowing) |
| yield_kg_per_hectare | 🌾 Target variable: Crop yield in kilograms per hectare |
| sensor_id | ID of the IoT sensor reporting the data |
| timestamp | Random in-cycle timestamp when the data snapshot was recorded |
| latitude | Farm location latitude (10.0 - 35.0 range) |
| longitude | Farm location longitude (70.0 - 90.0 range) |
| NDVI_index | Normalized Difference Vegetation Index (0.3 - 0.9) |
| crop_disease_status | Crop disease status: None, Mild, Moderate, Severe |
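The total_days column is described as the difference between harvest_date and sowing_date, which makes it easy to recompute as a consistency check. The sketch below assumes both dates are stored as ISO strings (YYYY-MM-DD); the table does not state the date format, and the function name and sample dates are made up.

```python
from datetime import date


def growth_days(sowing_date: str, harvest_date: str) -> int:
    """Crop growth duration in days (harvest - sowing)."""
    # Assumption: both dates use the ISO format YYYY-MM-DD.
    return (date.fromisoformat(harvest_date) - date.fromisoformat(sowing_date)).days


# Made-up dates for illustration.
print(growth_days("2024-01-15", "2024-05-20"))  # 126
```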
If you build a notebook, model, or dashboard using this dataset — feel free to tag me or leave a comment. Happy growing! 🌱🚜
This dataset offers a detailed look into brewery operations, capturing the intricate process of beer production from fermentation to bottling. It includes comprehensive metrics on brewing parameters, quality scores, sales performance, and operational efficiency across multiple locations. Originally compiled for a data science Capstone Project, this dataset is ideal for enthusiasts and analysts interested in exploring the intersection of manufacturing, quality control, and market trends in the craft beer industry.
| Column Name | Description | Data Type | Units | Notes |
|---|---|---|---|---|
| Batch_ID | Unique identifier for each brewing batch. | Numeric/String | - | Could be numeric or alphanumeric depending on the system's ID generation method. |
| Brew_Date | Date and time when the brewing process started for the batch. | Date/Time | - | Format appears to be "MM/DD/YYYY HH:MM". May require consistent parsing if formats vary in the larger dataset. Time component seems to be granular down to minutes. |
| Beer_Style | Style or type of beer being brewed (e.g., Wheat Beer, Sour, Ale, Stout, Lager, Pilsner, IPA, Porter). | Text/Categorical | - | Categorical values representing different beer styles. |
| SKU | Stock Keeping Unit. Potentially a code representing the packaging type. (e.g., Kegs, Cans, Pints, Bottles). | Text/Categorical | - | Seems to describe packaging form, similar to 'Form' in the previous data dictionary but potentially more product-centric. |
| Location | Location associated with the brewing process, potentially the brewery location or intended sales region (e.g., Whitefield, Malleswaram, Rajajinagar, Marathahalli, Electronic City, Indiranagar, Koramangala). | Text/Categorical | - | Likely refers to brewery or distribution location. Needs context to understand if it represents production site or intended market. |
| Fermentation_Time | Duration of the fermentation process. | Numeric (Integer) | Hours | Integer values representing time in hours. This is the time the wort ferments. |
| Temperature | Temperature during fermentation. | Numeric (Float) | Degrees Celsius (°C) | Float values likely in degrees Celsius. Crucial parameter for fermentation control. |
| pH_Level | pH level of the brew during fermentation. | Numeric (Float) | pH Units | Float values representing pH, a measure of acidity/alkalinity. Important for yeast activity and beer quality. |
| Gravity | Specific Gravity of the wort before fermentation (Original Gravity - OG). | Numeric (Float) | SG Units | Float values representing Specific Gravity, a measure of sugar concentration in the wort. Used to estimate potential alcohol content. Typically represented as values like 1.xxx. |
| Alcohol_Content | Final Alcohol content of the beer. | Numeric (Float) | Percentage (%) | Float values representing alcohol by volume (ABV) as a percentage. |
| Bitterness | Perceived bitterness of the beer, measured in International Bitterness Units (IBUs). | Numeric (Integer) | IBUs | Integer values representing International Bitterness Units. Higher IBU means more bitter beer. |
| Color | Color of the beer, often measured on the Standard Reference Method (SRM) scale or similar color scale. | Numeric (Integer) | SRM (or similar) | Integer values representing beer color intensity. Higher value means darker beer. Scale might be SRM or EBC - need context to confirm, but SRM is common in US brewing. |
| Ingredient_Ratio | Ratio of key ingredients used in the brew. Format appears to be "1:X:Y", possibly representing Malt:Hops:Yeast ratios or similar key ingredient proportions. | Text/Categorical | Ratio | Text-based ratio. Needs further decoding to understand what 'X' and 'Y' represent in the ratio. Common ratios in brewing might involve malt types, hop varieties, yeast strains, or water-to-grain ratios. |
| Volume_Produced | Total volume of beer produced in this batch. | Numeric (Integer) | Liters/Gallons | Integer values. Units are likely Liters or Gallons. Context needed to determine which volume unit is used. Could also be in Barrels (US or UK). |
| Total_Sales | Total sales revenue generated from this batch of beer. | Numeric (Float) | Currency Units | Float values representing revenue. Currency units will depend on the context (e.g., USD, EUR, INR). |
| Quality_Score | Overall quality score of the beer batch, possibly based on sensory evaluation, lab tests, or a combination. | Numeric (Float) | Score/Points | Float values representing a quality score. Scale and meaning of the score (higher is better? range?) needs to be defined by the quality assessment process used. |
| Brewhouse_Efficiency | Efficiency of the brewhouse operation, indicating how effectively sugars are extracted from grains during mashing and lautering. | Numeric (Float) | Percentage (%) | Float values in percentage. Higher efficiency is generally better, indicating less sugar loss during the mashing process. |
| Loss_During_Brewing | Percentage of volume lost during the brewing process (pre-fermentation). | Numeric (Float) | Percentage (%) | Float values in percentage. Represents losses during wort production - e.g., evapora... |
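
The data dictionary notes that `Brew_Date` appears to use "MM/DD/YYYY HH:MM" and that `Ingredient_Ratio` appears to be a "1:X:Y" text ratio. A minimal sketch of parsing both fields follows, assuming those formats hold for every row. The helper names and the sample values are hypothetical and are not taken from the dataset.

```python
from datetime import datetime


def parse_brew_date(value: str) -> datetime:
    """Parse a Brew_Date string, assuming the 'MM/DD/YYYY HH:MM' layout."""
    return datetime.strptime(value.strip(), "%m/%d/%Y %H:%M")


def parse_ingredient_ratio(value: str) -> tuple:
    """Split a '1:X:Y' ratio string into a tuple of floats.

    What each position means (e.g. malt, hops, yeast) is not
    confirmed by the data dictionary, so the parts stay unnamed.
    """
    return tuple(float(part) for part in value.strip().split(":"))


# Hypothetical sample values for illustration only.
brewed_at = parse_brew_date("01/15/2024 14:30")
ratio = parse_ingredient_ratio("1:0.32:0.16")
print(brewed_at.isoformat(), ratio)
```

If some rows use a different date layout, `strptime` raises `ValueError`, which makes inconsistent formats easy to detect.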
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wheat is the basis of the diet of a large part of humanity. Therefore, this cereal is widely studied by scientists to ensure food security. A tedious, yet important part of this research is the measurement of different characteristics of the plants, also known as Plant Phenotyping.
Monitoring plant architectural characteristics allows breeders to grow better varieties and farmers to make better decisions, but this critical step is still done manually. The emergence of UAVs, cameras and smartphones makes in-field RGB images more available and could be a solution to manual measurement. For instance, the counting of wheat heads can be done with Deep Learning. However, this task can be visually challenging. There is often an overlap of dense wheat plants, and the wind can blur the photographs, making identifying single heads difficult. Additionally, appearances vary due to maturity, color, genotype, and head orientation. Finally, because wheat is grown worldwide, different varieties, planting densities, patterns, and field conditions must be considered.
To end manual counting, a robust algorithm must be created to address all these issues. The task is to localize the wheat head contained in each image. The goal is to obtain a model which is robust to variation in shape, illumination, sensor and locations.
~ Excerpts from the dataset source webpage
This dataset contains 6515 png wheat images. There are more than 300k wheat heads and associated bounding boxes.
The images are from 12 countries: Switzerland, UK, Belgium, Norway, France, Canada, US, Mexico, Japan, China, Australia and Sudan
This dataset is an expanded version of the GWHD_2020 dataset that was used in the Kaggle Global Wheat Detection competition:
- GWHD_2021 is bigger, less noisy and more diverse
- There are new countries, additional images and additional wheat heads
- The sub-datasets have been further broken down by wheat development stage
- Poor quality images have been removed
Example image with bounding boxes: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1086574%2F5ac61982a61672c6f90350128cb63d4b%2Fimage_w_bboxes.png?generation=1673247086554846&alt=media
The BoxesString column contains the bounding boxes. Each row contains all bounding boxes that appear on one image, stored as a single string. Bounding boxes are separated by semicolons, and the four coordinates within each box are separated by spaces, e.g.
'99 692 160 764;641 27 697 115;935 978 1012 1020'
The format of each box is: [x_min, y_min, x_max, y_max]
If there is no bounding box, BoxesString is set to "no_box".
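
The format described above can be parsed with a few lines of plain Python. This is a minimal sketch, not the approach used in the linked notebook. The function name `parse_boxes_string` is hypothetical, and it assumes coordinates are integers, as in the example string.

```python
def parse_boxes_string(boxes_string: str) -> list:
    """Convert a BoxesString entry into a list of
    (x_min, y_min, x_max, y_max) integer tuples.

    Returns an empty list for the "no_box" marker.
    """
    boxes_string = boxes_string.strip()
    if boxes_string == "no_box":
        return []

    boxes = []
    for box in boxes_string.split(";"):
        box = box.strip()
        if not box:
            continue  # tolerate a trailing semicolon
        # Assumes exactly four integer coordinates per box.
        x_min, y_min, x_max, y_max = (int(v) for v in box.split())
        boxes.append((x_min, y_min, x_max, y_max))
    return boxes


example = "99 692 160 764;641 27 697 115;935 978 1012 1020"
boxes = parse_boxes_string(example)
print(len(boxes), boxes[0])
```

Returning an empty list for "no_box" lets downstream code treat images without wheat heads the same way as any other image, for example when counting heads with `len(boxes)`.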
This notebook shows how to parse the data: https://www.kaggle.com/code/vbookshelf/gwhd-how-to-parse-the-data
The original dataset can also be downloaded from here: https://zenodo.org/record/5092309#.Y7ksF-xBzUL
Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methods https://arxiv.org/abs/2105.07660
@article{david2020global,
  title={Global Wheat Head Detection (GWHD) dataset: a large and diverse dataset of high-resolution RGB-labelled images to develop and benchmark wheat head detection methods},
  author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul A and others},
  journal={Plant Phenomics},
  volume={2020},
  year={2020},
  publisher={Science Partner Journal}
}
2021 Kaggle competition https://www.kaggle.com/competitions/global-wheat-detection/overview
Tutorials and more info https://www.aicrowd.com/challenges/global-wheat-challenge-2021
Header image by 652234 on Pixabay
https://pixabay.com/photos/nature-spike-grain-field-plant-3450440/