This raster file represents land within the Raft River Study Area classified as either “irrigated” with a cell value of 1 or “non-irrigated” with a cell value of 0 at a 30-meter spatial resolution. These classifications were determined at the pixel level by a Random Forest supervised machine learning methodology. Random Forest models are often used to classify large datasets accurately and efficiently by assigning each pixel to one of a pre-determined set of labels or groups. The model works by using decision trees that split the data based on characteristics that make the resulting groups as different from each other as possible. The model “learns” the characteristics that correlate to each label based on manually classified data points, also known as training data. A variety of data can be supplied as input to the Random Forest model in making its classification determinations. Irrigation produces distinct signals in observational data that can be identified by machine learning algorithms. Additionally, datasets that provide the model with information on landscape characteristics that often influence whether irrigation is present are also useful. This dataset was classified by the Random Forest model using top-of-atmosphere reflectance data from Landsat 5, Mapping Evapotranspiration with Internalized Calibration (METRIC) data, United States Geological Survey National Elevation Dataset (USGS NED) data, and Height Above Nearest Drainage (HAND) data. Landsat 5, METRIC, and HAND data are at a 30-meter spatial resolution, and the USGS NED data are at a 10-meter spatial resolution. The National Land Cover Dataset (NLCD) from USGS, Bureau of Reclamation (BOR) Land Use and Land Cover data, as well as Digital Ortho Photo Quadrangle (DOQQ) data were also used in determining irrigation status for the manually classified training data points but were not used for the machine learning model predictions.
The final model results were manually reviewed prior to release; however, no extensive ground-truthing process was implemented. “Speckling”, or small areas of incorrectly classified pixels, in the mountain areas was reduced by masking all pixels with a slope value of 15% or greater as “non-irrigated”, regardless of the status they were assigned by the Random Forest model. Speckling within irrigated areas was reduced by a majority filter smoothing technique using a kernel of 8 nearest neighbors.
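The two post-processing steps described above can be illustrated with a minimal pure-Python sketch. It is not the producers' actual workflow: the function names, the tiny example grids, and the tie and edge handling are assumptions made for illustration, and the description does not state which software or exact majority rule was used.

```python
def mask_steep_slopes(classified, slope_percent, threshold=15.0):
    """Force every pixel at or above the slope threshold to non-irrigated (0)."""
    return [
        [0 if s >= threshold else c for c, s in zip(c_row, s_row)]
        for c_row, s_row in zip(classified, slope_percent)
    ]


def majority_filter(grid):
    """Set each cell to the majority value of its (up to) 8 neighbors.

    Assumption: a cell changes only on a strict majority; ties keep the
    original value, and edge cells simply use the neighbors they have.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            ones = sum(neighbors)
            if 2 * ones > len(neighbors):
                out[r][c] = 1
            elif 2 * ones < len(neighbors):
                out[r][c] = 0
    return out


# Hypothetical 4x4 model output (1 = irrigated) and slope grid (percent).
classified = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
slope_percent = [
    [2, 2, 2, 2],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
    [2, 2, 2, 20],
]

masked = mask_steep_slopes(classified, slope_percent)
smoothed = majority_filter(masked)
```

In this example the isolated irrigated pixel on the 20% slope is forced to 0 by the mask, and the single non-irrigated pixel surrounded by irrigated neighbors is filled in by the filter.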
This raster file represents land within the Raft River Study Area classified as either “irrigated” with a cell value of 1 or “non-irrigated” with a cell value of 0 at a 30-meter spatial resolution. These classifications were determined at the pixel level by a Random Forest supervised machine learning methodology. Random Forest models are often used to classify large datasets accurately and efficiently by assigning each pixel to one of a pre-determined set of labels or groups. The model works by using decision trees that split the data based on characteristics that make the resulting groups as different from each other as possible. The model “learns” the characteristics that correlate to each label based on manually classified data points, also known as training data. A variety of data can be supplied as input to the Random Forest model for it to use in making its classification determinations. Irrigation produces distinct signals in observational data that can be identified by machine learning algorithms. Additionally, datasets that provide the model with information on landscape characteristics that often influence whether irrigation is present are also useful. This dataset was classified by the Random Forest model using Level 2 (surface reflectance), Collection 2, Tier 1 data from Landsat 7 and Landsat 8, Mapping Evapotranspiration with Internalized Calibration (METRIC) data produced by IDWR, United States Geological Survey National Elevation Dataset (USGS NED) data, and Height Above Nearest Drainage (HAND) data. Landsat 7, Landsat 8, METRIC, and HAND data are at a 30-meter spatial resolution, and the USGS NED data are at a 10-meter spatial resolution. 
The Cropland Data Layer (CDL) from the United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS), National Agriculture Imagery Program (NAIP) data from the USDA Farm Service Agency (FSA), Utah Water Related Land Use data from the Utah Division of Water Resources, and water rights data from IDWR were also used in determining irrigation status for the manually classified training data points but were not used for the machine learning model predictions. The final model results were manually reviewed prior to release; however, no extensive ground-truthing process was implemented. “Speckling”, or small areas of incorrectly classified pixels, was reduced by masking all pixels with a slope value of 10% or greater as “non-irrigated”, regardless of the status they were assigned by the Random Forest model. Speckling within irrigated areas was reduced by a majority filter smoothing technique using a kernel of 8 nearest neighbors. A limited number of manual corrections were also made to the final results.
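The idea that a decision tree splits the data so that the resulting groups are as different from each other as possible can be illustrated with a single split on one feature. The sketch below scores candidate thresholds by Gini impurity, a criterion commonly used in Random Forest implementations. The description does not say which criterion or software was used, so the criterion, the function names, and the feature values are all assumptions for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) of the best single split."""
    best = (None, float("inf"))
    candidates = sorted(set(values))
    # Try the midpoint between each pair of adjacent distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical training points: one feature value per point and its
# manually assigned label (1 = irrigated, 0 = non-irrigated).
feature = [0.15, 0.20, 0.30, 0.65, 0.70, 0.80]
label = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(feature, label)
```

A Random Forest repeats this kind of split search over many features and many trees, each trained on a resampled subset of the training points, and combines the trees' votes for each pixel.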
This raster file represents land within the Raft River Study Area classified as either “irrigated” with a cell value of 1 or “non-irrigated” with a cell value of 0 at a 10-meter spatial resolution. These classifications were determined at the pixel level by a Random Forest supervised machine learning methodology. Random Forest models are often used to classify large datasets accurately and efficiently by assigning each pixel to one of a pre-determined set of labels or groups. The model works by using decision trees that split the data based on characteristics that make the resulting groups as different from each other as possible. The model “learns” the characteristics that correlate to each label based on manually classified data points, also known as training data. A variety of data can be supplied as input to the Random Forest model for it to use in making its classification determinations. Irrigation produces distinct signals in observational data that can be identified by machine learning algorithms. Additionally, datasets that provide the model with information on landscape characteristics that often influence whether irrigation is present are also useful. This dataset was classified by the Random Forest model using United States Geological Survey (USGS) Landsat 8 and 9 Level 2, Collection 2, Tier 1 data, Harmonized Sentinel-2 Multispectral Instrument Level-2A data, USGS 3D Elevation Program (USGS 3DEP) data, and Height Above Nearest Drainage (HAND) data. Landsat 8, Landsat 9, and HAND data are at a 30-meter spatial resolution, and the Sentinel-2 and USGS 3DEP data are at a 10-meter spatial resolution. Sentinel-2 Normalized Difference Vegetation Index (NDVI) values and National Agriculture Imagery Program (NAIP) imagery from 2021 (the most recent available) were used to determine irrigation status for the manually classified training data points. Irrigated training point locations were first identified by the NAIP 2021 imagery.
Those point locations were then used to sample all available Sentinel-2 NDVI images for the 2022 growing season, and the time series at each point location was reviewed. Only points whose NDVI values remained at or above 0.6 for the majority of the growing season retained their irrigation classification. All non-irrigated training points were reviewed with Sentinel-2 NDVI and false-color imagery to ensure no new crop fields had been established in those locations during the previous year. The final model results were manually reviewed prior to release; however, no extensive ground-truthing process was implemented. A wetlands mask was applied using the U.S. Fish and Wildlife Service’s National Wetlands Inventory (FWS NWI) data for areas without overlapping irrigation POUs or locations manually determined to have potential irrigation. “Speckling”, or small areas of incorrectly classified pixels, was reduced by using the Boundary Clean smoothing tool in ArcGIS with a descending sorting type.
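The NDVI screening of irrigated training points described above can be sketched as follows. This is only an illustration: the description says the time series were reviewed but does not define “majority of the growing season” precisely, so treating it as “more than half of the available images” is an assumption, as are the function name, point names, and NDVI values.

```python
def stays_green(ndvi_series, threshold=0.6):
    """True if NDVI is at or above the threshold in more than half of the images.

    Assumption: "majority of the growing season" is approximated by a
    strict majority of the available image dates.
    """
    if not ndvi_series:
        return False
    above = sum(1 for v in ndvi_series if v >= threshold)
    return 2 * above > len(ndvi_series)


# Hypothetical NDVI samples across one growing season for two candidate points.
candidate_points = {
    "pt_a": [0.35, 0.62, 0.78, 0.81, 0.74, 0.66],  # stays green most of the season
    "pt_b": [0.22, 0.31, 0.64, 0.40, 0.28, 0.25],  # brief green-up only
}

retained = sorted(
    name for name, series in candidate_points.items() if stays_green(series)
)
```

Here only `pt_a` would keep its irrigated label; `pt_b` exceeds 0.6 on a single date and would be dropped from the irrigated training set.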