Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Please cite the paper when using the dataset in a research study. Refer to the paper link or BibTeX provided below.
This repository contains comprehensive datasets for soil classification and recognition research. The Original Dataset comprises soil images sourced from various online repositories, which have been meticulously cleaned and preprocessed to ensure data quality and consistency. To enhance the dataset's size and diversity, we employed Generative Adversarial Networks (GANs), specifically the CycleGAN architecture, to generate synthetic soil images. This augmented collection is referred to as the CyAUG Dataset. Both datasets are specifically designed to advance research in soil classification and recognition using state-of-the-art deep learning methodologies.
This dataset was curated as part of the research study titled "An advanced artificial intelligence framework integrating ensembled convolutional neural networks and Vision Transformers for precise soil classification with adaptive fuzzy logic-based crop recommendations" by Farhan Sheth, Priya Mathur, Amit Kumar Gupta, and Sandeep Chaurasia, published in Engineering Applications of Artificial Intelligence.
Application produced by this research is available at:
Note: If you use any part of this project (dataset, code, or application), please cite the work as described in the Citation section below.
Both datasets consist of images of 7 different soil types.
The Soil Classification Dataset is structured to facilitate the classification of various soil types based on images. The dataset includes images of the following soil types:
The dataset is organized into folders, each named after a specific soil type, containing images of that soil type. The images vary in resolution and quality, providing a diverse set of examples for training and testing classification models.
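The folder-per-class layout described above can be indexed with the standard library alone. The following is a minimal sketch, not the authors' loading code: the function names and the set of accepted file extensions are assumptions made for illustration.

```python
from collections import Counter
from pathlib import Path

# Assumed image file types; adjust to whatever the downloaded folders contain.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def index_soil_images(root):
    """Return (path, label) pairs, taking each label from the parent folder name."""
    samples = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((image_path, class_dir.name))
    return samples


def class_counts(samples):
    """Count images per soil type, useful for spotting class imbalance."""
    return Counter(label for _, label in samples)
```

Because the images vary in resolution, a training pipeline would typically resize them after this indexing step; that choice is left to the model being trained.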
If you use any of the derived datasets, please cite the following paper:
@article{SHETH2025111425,
title = {An advanced artificial intelligence framework integrating ensembled convolutional neural networks and Vision Transformers for precise soil classification with adaptive fuzzy logic-based crop recommendations},
journal = {Engineering Applications of Artificial Intelligence},
volume = {158},
pages = {111425},
year = {2025},
issn = {0952-1976},
doi = {https://doi.org/10.1016/j.engappai.2025.111425},
url = {https://www.sciencedirect.com/science/article/pii/S0952197625014277},
author = {Farhan Sheth and Priya Mathur and Amit Kumar Gupta and Sandeep Chaurasia},
keywords = {Soil classification, Crop recommendation, Vision transformers, Convolutional neural network, Transfer learning, Fuzzy logic}
}
We developed a comprehensive, gridded Global Soil Dataset for use in Earth System Models (GSDE) and other applications. GSDE provides soil information, including soil particle-size distribution, organic carbon, and nutrients, along with quality control information expressed as confidence levels. GSDE is based on the Soil Map of the World and various regional and national soil databases, including soil attribute data and soil maps. We used a standardized data structure and data processing procedures to harmonize the data collected from various sources. We then used a soil type linkage method (i.e. taxotransfer rules) and the polygon linkage method to derive the spatial distribution of soil properties. To aggregate the attributes of the different components of a mapping unit, we used three mapping approaches: the area-weighting method, the dominant soil type method, and the dominant binned soil attribute method. The released gridded dataset uses the area-weighting method, as it meets the demands of most applications. The resolution is 30 arc-seconds (about 1 km at the equator), and the dataset can also be aggregated to a lower resolution. The vertical variation of soil properties is captured by eight layers to a depth of 2.3 m (i.e. 0-0.045, 0.045-0.091, 0.091-0.166, 0.166-0.289, 0.289-0.493, 0.493-0.829, 0.829-1.383 and 1.383-2.296 m).
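To make the difference between two of the aggregation approaches named above concrete, here is a minimal sketch. It is not the GSDE processing code: the function names and the representation of a mapping unit as a list of (area share, attribute value) pairs are assumptions, and the dominant binned soil attribute method is not shown.

```python
def area_weighted_mean(components):
    """Average an attribute over a mapping unit, weighting each component by its area share.

    components: list of (area_share, value) pairs; shares need not sum to 1 or 100.
    """
    total_share = sum(share for share, _ in components)
    if total_share <= 0:
        raise ValueError("mapping unit has no area")
    return sum(share * value for share, value in components) / total_share


def dominant_type_value(components):
    """Take the attribute value of the component covering the largest area."""
    return max(components, key=lambda component: component[0])[1]


# Hypothetical mapping unit: area share in percent, clay content in percent.
unit = [(60, 20.0), (30, 35.0), (10, 50.0)]
print(area_weighted_mean(unit))   # 27.5
print(dominant_type_value(unit))  # 20.0
```

The two results differ whenever minor components have values far from the dominant one, which is why the choice of method matters for downstream models.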
This data set is a digital soil survey and generally is the most detailed level of soil geographic data developed by the National Cooperative Soil Survey. The information was prepared by digitizing maps, by compiling information onto a planimetrically correct base and digitizing, or by revising digitized maps using remotely sensed and other information. This data set consists of georeferenced digital map data and computerized attribute data. The map data are in a soil survey area extent format and include a detailed, field-verified inventory of soils and miscellaneous areas that normally occur in a repeatable pattern on the landscape and that can be cartographically shown at the scale mapped. A special soil features layer (point and line features) is optional. This layer displays the location of features that are too small to delineate at the mapping scale but are large enough and contrasting enough to significantly influence use and management. The soil map units are linked to attributes in the National Soil Information System relational database, which gives the proportionate extent of the component soils and their properties.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains soil analyses covering properties such as pH, organic matter (OM), and salinity (EC), major elements (N, P, K, Mg), and some microelements (Fe, Zn, Mn, Cu, B) that have a significant impact on plant nutrition.
Agricultural Soil
Panagiotis Tziachris
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Soil Type is a dataset for classification tasks - it contains Soil annotations for 158 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The National Cooperative Soil Survey - Soil Characterization Database (NCSS-SCD) contains laboratory data for more than 65,000 locations (i.e. xy coordinates) throughout the United States and its Territories, and about 2,100 locations from other countries. It is a compilation of data from the Kellogg Soil Survey Laboratory (KSSL) and several cooperating laboratories. The data steward and distributor is the National Soil Survey Center (NSSC). Information contained within the database includes physical, chemical, biological, mineralogical, morphological, and mid-infrared reflectance (MIR) soil measurements, as well as a collection of calculated values. The intended use of the data is to support interpretations related to soil use and management.
Data Usage: Access to the data is provided via the following user interfaces: an Interactive Web Map; the Lab Data Mart (LDM) for querying data and generating reports; Soil Data Access (SDA) web services for querying data; and direct download of the entire database in several formats. Data at each location include measurements at multiple depths (e.g. soil horizons). However, not all analyses have been conducted for each location and depth. Typically, a suite of measurements was collected based upon assumed or known conditions regarding the soil being analyzed. For example, soils of arid environments are routinely analyzed for salts and carbonates as part of the standard analysis suite. Standard morphological soil descriptions are available for about 60,000 of these locations. Mid-infrared (MIR) spectroscopy is available for about 7,000 locations. Soil fertility measurements, such as those made by Agricultural Experiment Stations, were not made. Most of the data were obtained over the last 40 years, with about 4,000 locations before 1960, 25,000 from 1960-1990, 27,000 from 1990-2010, and 13,000 from 2010 to 2021. Generally, the number of measurements recorded per location has increased over time.
Typically, the data were collected to represent a soil series or map unit component concept. They may also have been sampled to determine the range of variation within a given landscape. Although strict quality-control measures are applied, the NSSC does not warrant that the data are error free. Also, in some cases the measurements are not within the applicability range of the laboratory methods. For example, dispersion of clay is incomplete in some soils by the standard method used for determining particle-size distribution. Soils producing incomplete dispersion include those that are derived from volcanic materials or that have a high content of iron oxides, gypsum, carbonates, or other cementing materials. Also note that determination of clay minerals by x-ray diffraction is relative. Measurements of very high or very low quantities by any method are not very precise. Other measurements have other limitations in some kinds of soils. Such data are retained in the database for research purposes. Also, some of the data were obtained from cooperating laboratories within the NCSS. The accuracy of the location coordinates has not been quantified but can be inferred from the precision of their decimal degrees and the presence of a map datum. Some older records may correspond to a county centroid. When the map datum is missing, it can be assumed that data recorded before 1990 used NAD27 and data recorded after 1995 used WGS84. For detailed information about methods used in the KSSL and other laboratories, refer to "Soil Survey Investigation Report No. 42". For information on the application of laboratory data, refer to "Soil Survey Investigation Report No. 45". If you are unfamiliar with any terms or methods, feel free to consult your NRCS State Soil Scientist.
Terms of Use: This dataset is not designed for use as a primary regulatory tool in permitting or siting decisions but may be used as a reference source.
This is public information and may be interpreted by organizations, agencies, units of government, or others based on needs; however, they are responsible for the appropriate application. Federal, State, or local regulatory bodies are not to reassign to the Natural Resources Conservation Service or the National Cooperative Soil Survey any authority for the decisions that they make. The Natural Resources Conservation Service will not perform any evaluations of these data for purposes related solely to State or local regulatory programs.
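The NCSS-SCD description above gives a fallback rule for records with a missing map datum and notes that coordinate accuracy can be inferred from decimal-degree precision. A minimal sketch of both checks follows. The function names are hypothetical, and returning `None` for 1990 through 1995 is an assumption that reflects the fact that the description states no default for those years.

```python
def assumed_datum(year_recorded):
    """Apply the stated fallback for records that lack a map datum."""
    if year_recorded < 1990:
        return "NAD27"
    if year_recorded > 1995:
        return "WGS84"
    return None  # 1990-1995: the description gives no default, so leave it undecided


def decimal_places(coordinate_text):
    """Count the digits after the decimal point in a coordinate as written in the record."""
    _, _, fraction = coordinate_text.strip().partition(".")
    return len(fraction)


print(assumed_datum(1985))        # NAD27
print(assumed_datum(1992))        # None
print(decimal_places("40.8125"))  # 4
```

The coordinate is handled as text rather than as a float because parsing to a number would discard trailing zeros, which carry information about the recorded precision.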
https://creativecommons.org/publicdomain/zero/1.0/
This is a synthetic dataset designed for training and evaluating machine learning models that classify whether certain soil and climate conditions are compatible for growing a given crop. The data includes environmental and soil features for a variety of crops commonly grown in India.
This dataset simulates the relationship between crop types, soil properties, and climate conditions. The goal is to predict whether a given combination of soil, weather, and crop factors is compatible for cultivation, which makes it suited to binary classification tasks.
It is especially useful for:
- Crop recommendation systems
- Soil-climate compatibility prediction
- Educational ML applications in agriculture
soil_climate_crop_data.csv: Main dataset file with synthetic records

| Column Name | Description |
|---|---|
| Crop_Type | Crop name (19 commonly grown crops) |
| Soil_Type | Soil classification (4 common Indian soil types) |
| Farm_Size_Acres | Size of the farm (in acres) |
| Irrigation_Available | Boolean (Yes/No) representing irrigation access |
| Soil_pH | Soil pH level |
| Soil_Nitrogen | Nitrogen level in soil (ppm) |
| Soil_Organic_Matter | Organic matter content (%) |
| Temperature | Average temperature (°C) |
| Rainfall | Annual rainfall (mm) |
| Humidity | Relative humidity (%) |
| Compatible | Binary label: 1 = Compatible, 0 = Not Compatible |
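The column layout above can be read with the standard `csv` module. The following is a minimal sketch under stated assumptions: the three sample rows are made up for illustration (including the soil type names, which the description does not list), and the helper names are hypothetical. For the real file, pass an open handle to `soil_climate_crop_data.csv` instead of the in-memory sample.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """Crop_Type,Soil_Type,Farm_Size_Acres,Irrigation_Available,Soil_pH,Soil_Nitrogen,Soil_Organic_Matter,Temperature,Rainfall,Humidity,Compatible
Wheat,Loamy,12.5,Yes,6.8,42,2.1,22.0,650,55,1
Wheat,Sandy,3.0,No,5.1,18,0.9,31.0,300,40,0
Rice,Clay,8.0,Yes,6.2,50,2.8,27.0,1400,80,1
"""

NUMERIC_COLUMNS = [
    "Farm_Size_Acres", "Soil_pH", "Soil_Nitrogen", "Soil_Organic_Matter",
    "Temperature", "Rainfall", "Humidity",
]


def parse_row(row):
    """Convert one CSV row of strings into typed values."""
    record = {"Crop_Type": row["Crop_Type"], "Soil_Type": row["Soil_Type"]}
    for column in NUMERIC_COLUMNS:
        record[column] = float(row[column])
    # The Yes/No column becomes a real boolean.
    record["Irrigation_Available"] = row["Irrigation_Available"].strip().lower() == "yes"
    record["Compatible"] = int(row["Compatible"])
    return record


def load_records(handle):
    """Read every row from an open CSV handle."""
    return [parse_row(row) for row in csv.DictReader(handle)]


def compatibility_rate_by_crop(records):
    """Share of rows labelled compatible, per crop."""
    totals = defaultdict(lambda: [0, 0])  # crop -> [compatible rows, all rows]
    for record in records:
        totals[record["Crop_Type"]][0] += record["Compatible"]
        totals[record["Crop_Type"]][1] += 1
    return {crop: positive / count for crop, (positive, count) in totals.items()}


records = load_records(io.StringIO(SAMPLE_CSV))
print(compatibility_rate_by_crop(records))  # {'Wheat': 0.5, 'Rice': 1.0}
```

Checking the label balance per crop in this way is a reasonable first step before training a binary classifier, since a heavily skewed crop can dominate accuracy figures.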
The 19 crops are: Wheat, Rice, Maize, Soybean, Niger, Urd, Summer Paddy, Gram, Tiwra, Millets, Arhar, Mustard, Jwar, Moong, Kulthi, Groundnut, Masoor, Til, Pea
This dataset is synthetically generated using domain-inspired rules and randomized inputs.
It is not based on real-world data but reflects typical patterns observed in Indian agricultural settings. The labels for compatibility are assigned based on plausible thresholds and combinations of soil, climate, and crop requirements.
License: CC0 1.0 Universal (Public Domain Dedication)
You are free to use, modify, and share this dataset for any purpose without restrictions.
Created by: Rajeev
Open to feedback and collaboration!
This data set is a digital soil survey and generally is the most detailed level of soil geographic data developed by the National Cooperative Soil Survey. The information was prepared by digitizing maps, by compiling information onto a planimetrically correct base and digitizing, or by revising digitized maps using remotely sensed and other information. This data set consists of georeferenced digital map data and computerized attribute data. The map data are in a soil survey area extent format and include a detailed, field-verified inventory of soils and miscellaneous areas that normally occur in a repeatable pattern on the landscape and that can be cartographically shown at the scale mapped. A special soil features layer (point and line features) is optional. This layer displays the location of features that are too small to delineate at the mapping scale but are large enough and contrasting enough to significantly influence use and management. The soil map units are linked to attributes in the National Soil Information System relational database, which gives the proportionate extent of the component soils and their properties. Survey Dates - https://www.nrcs.usda.gov/wps/portal/nrcs/surveylist/soils/survey/state/?stateId=VT
The International Soil Reference and Information Centre-World Inventory of Soil Emission Potentials (ISRIC-WISE) international soil profile data set consists of a homogenized, global set of 1,125 soil profiles for use by global modelers. These profiles provided the basis for the Global Pedon Database (GPDB) of the International Geosphere-Biosphere Programme (IGBP) - Data and Information System (DIS). The data set consists of a selection of 665 profiles originating from the Natural Resources Conservation Service (NRCS, Lincoln), 250 profiles obtained from the Food and Agriculture Organization (FAO, Rome), and 210 profiles from the reference collection of the International Soil Reference and Information Centre (ISRIC, Wageningen). All profiles are georeferenced and classified according to the 1974 Legend of the FAO-UNESCO Soil Map of the World (FAO-UNESCO, 1974), as well as the 1988 Revised Legend of FAO-UNESCO (FAO, 1990). The data set includes information on soil classification, site data, soil horizon data, source of data, and methods used for determining analytical data. The data files are in a comma-delimited format. Data Citation: The data set should be cited as follows: Batjes, N. H. (ed). 2000. Global Soil Profile Data (ISRIC-WISE). Available on-line from the ORNL Distributed Active Archive Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee, U.S.A.
This data set provides gridded data for selected soil parameters derived from data and methods developed by the Global Soil Data Task, an international collaborative project with the objective of making accurate and appropriate data relating to soil properties accessible to the global change research community. The task was coordinated by the International Geosphere-Biosphere Programme (IGBP-DIS). The data in this data set were produced by the International Satellite Land-Surface Climatology Project, Initiative II (ISLSCP II) staff from data obtained from the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC, http://daac.ornl.gov/). See the related data sets section below. Two-dimensional gridded maps of selected soil parameters, including soil texture, at a 1.0 by 1.0 degree spatial resolution and for two soil depths are provided. All data layers have been adjusted to match the ISLSCP II land/water mask. There are 36 data files with this data set.
A global data set of soil types is available at 0.5-degree latitude by 0.5-degree longitude resolution. There are 106 soil units, based on Zobler's (1986) assessment of the FAO/UNESCO Soil Map of the World. This data set is a conversion of the Zobler 1-degree resolution version to a 0.5-degree resolution. The resolution of the data set was not actually increased. Rather, the 1-degree squares were divided into four 0.5-degree squares with the necessary adjustment of continental boundaries and islands. The computer code used to convert the original 1-degree data to 0.5-degree is provided as a companion file. A JPG image of the data is provided in this document. The Zobler data (1-degree resolution) as distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] contains two columns, one column for continent and one column for soil type. The Soil Map of the World consists of 9 maps that represent parts of the world. The texture data that Webb et al. (1993) provided allowed for the fact that a soil type in one part of the world may have different properties than the same soil in a different part of the world. This continent-specific information is retained in this 0.5-degree resolution data set, as well as the soil type information which is the second column. A code was written (one2half.c) to take the file CONTIZOB.LER distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] and simply divide the 1-degree cells into quarters. This code also reads in a land/water file (land.wave) that specifies the cells that are land at 0.5 degrees. The code checks for consistency between the newly quartered map and the land/water map to which the quartered map is to be registered. If there is a discrepancy between the two, an attempt was made to make the two consistent using the following logic. If the cell is supposed to be water, it is forced to be water.
If it is supposed to be land but was resolved to water at 1 degree, the code looks at the surrounding 8 cells and picks the most frequent soil type and assigns it to the cell. If there are no surrounding land cells then it is kept as water in the hopes that on the next pass one or more of the surrounding cells might be converted from water to a soil type. The whole map is iterated 5 times. The remaining cells that should be land but couldn't be determined from surrounding cells (mostly islands that are resolved at 0.5 degree but not at 1 degree) are printed out with coordinate information. A temporary map is output with -9 indicating where data is required. This is repeated for the continent code in CONTIZOB.LER as well. A separate map of the temporary continent codes is produced with -9 indicating required data. A nearly identical code (one2half.c) does the same for the continent codes. The printout allows one to consult the printed versions of the soil map and look up the soil type with the largest coverage in the 0.5-degree cell. The program manfix.c then will go through the temporary map and prompt for input to correct both the soil codes and the continent codes for the map. This can be done manually or by preparing a file of changes (new_fix.dat) and redirecting stdin. A new complete version of the map is output. This is in the form of the original CONTIZOB.LER file (contizob.half) but four times larger. Original documentation and computer codes prepared by Post et al. (1996) are provided as companion files with this data set. Image of 106 global soil types available at 0.5-degree by 0.5-degree resolution. Additional documentation from Zobler's assessment of FAO soil units is available from the NASA Center for Scientific Information.
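The quartering and land/water registration logic described above can be sketched in a few lines of Python. This is an illustration of the described steps, not a port of one2half.c: the water code `0`, the function names, and the choice to compute each pass from a snapshot of the previous pass are all assumptions, and longitude wrap-around at the map edges is not handled.

```python
from collections import Counter

WATER = 0      # assumed code for water cells; the original files may use another value
UNRESOLVED = -9  # marker the description gives for cells that still need data


def quarter_grid(coarse):
    """Split every 1-degree cell into four 0.5-degree cells carrying the same value."""
    fine = []
    for row in coarse:
        doubled = [value for value in row for _ in range(2)]
        fine.append(list(doubled))
        fine.append(list(doubled))
    return fine


def register_to_mask(soil, is_land, passes=5):
    """Make a quartered soil grid consistent with a 0.5-degree land/water mask.

    Returns the adjusted grid and the list of (row, col) cells left unresolved.
    """
    rows, cols = len(soil), len(soil[0])
    # Cells the mask calls water are forced to water.
    grid = [[soil[r][c] if is_land[r][c] else WATER for c in range(cols)]
            for r in range(rows)]
    for _ in range(passes):
        updated = [row[:] for row in grid]
        for r in range(rows):
            for c in range(cols):
                if not is_land[r][c] or grid[r][c] != WATER:
                    continue
                # Soil types among the (up to) 8 surrounding cells.
                neighbors = [grid[rr][cc]
                             for rr in range(max(r - 1, 0), min(r + 2, rows))
                             for cc in range(max(c - 1, 0), min(c + 2, cols))
                             if (rr, cc) != (r, c) and grid[rr][cc] != WATER]
                if neighbors:
                    updated[r][c] = Counter(neighbors).most_common(1)[0][0]
        grid = updated
    unresolved = [(r, c) for r in range(rows) for c in range(cols)
                  if is_land[r][c] and grid[r][c] == WATER]
    for r, c in unresolved:
        grid[r][c] = UNRESOLVED
    return grid, unresolved
```

Cells returned in `unresolved` correspond to the ones the description says are printed out for manual lookup, typically islands that appear at 0.5 degrees but not at 1 degree.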
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Hydric soils are defined as those soils that are sufficiently wet in the upper part to develop anaerobic conditions during the growing season. The Hydric Soils section presents the most current information about hydric soils. It updates information that was previously published in Hydric Soils of the United States and coordinates it with information that has been published in the Federal Register. It also includes the most recent set of field indicators of hydric soils. The lists of hydric soils were created by using National Soil Information System (NASIS) database selection criteria that were developed by the National Technical Committee for Hydric Soils. These criteria are selected soil properties that are documented in Soil Taxonomy (Soil Survey Staff, 1999) and were designed primarily to generate a list of potentially hydric soils from the NASIS database. Only criteria 1, 3, and 4 can be used in the field to determine hydric soils; however, proof of anaerobic conditions must also be obtained for criteria 1, 3, and 4 either through data or best professional judgment (from Tech Note 1). The primary purpose of these selection criteria is to generate a list of soil map unit components that are likely to meet the hydric soil definition. Caution must be used when comparing the list of hydric components to soil survey maps. Many of the soils on the list have ranges in water table depths that allow the soil component to range from hydric to nonhydric depending on the location of the soil within the landscape as described in the map unit.
Lists of hydric soils along with soil survey maps are good off-site ancillary tools to assist in wetland determinations, but they are not a substitute for observations made during on-site investigations. The list of field indicators of hydric soils: The field indicators are morphological properties known to be associated with soils that meet the definition of a hydric soil. Presence of one or more field indicators suggests that the processes associated with hydric soil formation have taken place on the site being observed. The field indicators are essential for hydric soil identification because once formed, they persist in the soil during both wet and dry seasonal periods. The Hydric Soil Technical Notes: These contain National Technical Committee for Hydric Soils (NTCHS) updates, insights, standards, and clarifications. Users can query the database by State or by Soil Survey Area. Resources in this dataset: Resource Title: Website Pointer to Hydric Soils. File Name: Web Page, url: https://www.nrcs.usda.gov/wps/portal/nrcs/main/soils/use/hydric/ Includes description of Criteria, Query by State or Soil Survey Area, National Technical Committee for Hydric Soils, Technical Notes, and Related Links. Report Metadata:
Criteria:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sixty-one soils (soil types) represent the range of soils found across South Australia's agricultural lands. Mapping shows the most common soil within each map unit, while more detailed proportion data are supplied for calculating the respective areas of each soil type (spatial data statistics).
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The Harmonized World Soil Database version 2.0 (HWSD v2.0) is a unique global soil inventory providing information on the morphological, chemical and physical properties of soils at approximately 1 km resolution. Its main objective is to serve as a basis for prospective studies on agro-ecological zoning, food security and climate change. The Harmonized World Soil Database (HWSD) was established in 2008 by the International Institute for Applied Systems Analysis (IIASA) and FAO, in partnership with the International Soil Reference and Information Centre (ISRIC), the European Soil Bureau Network (ESBN) and the Institute for Soil Sciences, Chinese Academy of Sciences (CAS). The data entry and harmonization within a Geographic Information System (GIS) was carried out at IIASA, with verification of the database undertaken by all partners. HWSD was then updated in 2013 (HWSD v1.2) and in 2023 (HWSD v2.0). This updated version (HWSD v2.0) builds on the previous versions of HWSD with several improvements: (i) the data source now includes several national soil databases, (ii) an enhanced number of soil attributes is available for seven soil depth layers, instead of two in HWSD v1.2, and (iii) a common soil reference is used for all soil units (FAO1990 and the World Reference Base for Soil Resources). This contributes to a further harmonization of the database. The GIS raster image file is linked to the soil attribute database. The HWSD v2.0 soil attribute database provides information on the soil unit composition for each of the nearly 30,000 soil association mapping units. The HWSD v2.0 Viewer, provided with the database, creates this link automatically and provides direct access to the soil attribute data and the soil association information.
Note:
- A tutorial for accessing HWSD v2.0 using R (prepared by David Rossiter, June 2023) has been added as an 'associated resource' (NOTE: Needs the SQLite version of HWSD v2 as provided below).
- Soil property estimates in HWSDv2 were derived from Batjes (2016), Geoderma (https://doi.org/10.1016/j.geoderma.2016.01.034).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The National Soil Database has produced a national database of soil geochemistry including point and spatial distribution maps of major nutrients, major elements, essential trace elements, trace elements of special interest and minor elements. In addition, this study has generated a National Soil Archive, comprising bulk soil samples and a nucleic acids archive each of which represent a valuable resource for future soils research in Ireland. The geographical coherence of the geochemical results was considered to be predominantly underpinned by underlying parent material and glacial geology. Other factors such as soil type, land use, anthropogenic effects and climatic effects were also evident. The coherence between elements, as displayed by multivariate analyses, was evident in this study. Examples included strong relationships between Co, Fe, As, Mn and Cu. This study applied large-scale microbiological analysis of soils for the first time in Ireland and in doing so also investigated microbial community structure in a range of soil types in order to determine the relationship between soil microbiology and chemistry. The results of the microbiological analyses were consistent with geochemical analyses and demonstrated that bacterial community populations appeared to be predominantly determined by soil parent material and soil type.
A workbook of all the soils data collected near Holton, Kansas, in agricultural fields. Laboratory analysis of soil properties was completed by Ward Labs in Kearney, Nebraska. Isotope analysis of soils was completed at the Integrated Stable Isotope Research Facility operated by the US Environmental Protection Agency. The goal of this project was to evaluate whether Soil Health Principles can reduce the risk of nitrate leaching from agricultural fields. This effort was a collaborative project between EPA Region 7, the EPA Office of Research and Development, and the Kansas Department of Health and Environment (KDHE). Discussion of the project generating these data is available on the KDHE website: https://storymaps.arcgis.com/stories/1efcfe1924fc4daf85a7958c0a41fa5a It can also be found on the KDHE Watershed Management Section page, at the end of the What We Do section: https://www.kdhe.ks.gov/974/Watershed-Management-Section
Web Soil Survey (WSS) provides soil data and information produced by the National Cooperative Soil Survey. It is operated by the USDA Natural Resources Conservation Service (NRCS) and provides access to the largest natural resource information system in the world. NRCS has soil maps and data available online for more than 95 percent of the nation's counties and anticipates having 100 percent in the near future. The site is updated and maintained online as the single authoritative source of soil survey information.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset consists of general soil association units. It was developed by the National Cooperative Soil Survey and supersedes the State Soil Geographic (STATSGO) dataset published in 1994. It consists of a broad-based inventory of soils and nonsoil areas that occur in a repeatable pattern on the landscape and that can be cartographically shown at the scale mapped: 1:250,000 in the continental U.S., Hawaii, Puerto Rico, and the Virgin Islands and 1:1,000,000 in Alaska. The dataset was created by generalizing more detailed soil survey maps. Where more detailed soil survey maps were not available, data on geology, topography, vegetation, and climate were assembled, together with Land Remote Sensing Satellite (LANDSAT) images. Soils of like areas were studied, and the probable classification and extent of the soils were determined.
Map unit composition was determined by transecting or sampling areas on the more detailed maps and expanding the data statistically to characterize the entire map unit.
This dataset consists of georeferenced vector digital data and tabular digital data. The map data were collected in 1- by 2-degree topographic quadrangle units. The soil map units are linked to attributes in the National Soil Information System relational database, which gives the proportionate extent of the component soils and their properties.
These data provide information about soil features on or near the surface of the Earth. Data were collected as part of the National Cooperative Soil Survey. These data are intended for geographic display and analysis at the state, regional, and national level. The data should be displayed and analyzed at scales appropriate for 1:250,000-scale data. This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog. Data for this record includes the following resources: STATSGO2-State. For complete information, please visit https://data.gov.
This map depicts soils data from the USDA NRCS SSURGO dataset. The soil type is indicated in the MUSYM field. The data was downloaded from the NRCS website. The full Kansas geospatial catalog is administered by the Kansas Data Access & Support Center (DASC) and can be found at the following URL: https://hub.kansasgis.org/