This raster file represents land within the Mountain Home study boundary classified as either “irrigated” with a cell value of 1 or “non-irrigated” with a cell value of 0 at a 10-meter spatial resolution. These classifications were determined at the pixel level by use of Random Forest, a supervised machine learning algorithm. Classification models often employ Random Forest due to its accuracy and efficiency at labeling large spatial datasets. To build a Random Forest model and supervise the learning process, IDWR staff create pre-labeled data, or training points, which are used by the algorithm to construct decision trees that will be later used on unseen data. Model accuracy is determined using a subset of the training points, otherwise known as a validation dataset. Several satellite-based input datasets are made available to the Random Forest model, which aid in distinguishing characteristics of irrigated lands. These characteristics allow patterns to be established by the model, e.g., high NDVI during summer months for cultivated crops, or consistently low ET for dryland areas. Mountain Home Irrigated Lands 2023 employed the following input datasets: US Geological Survey (USGS) products, including Landsat 8/9 and 10-meter 3DEP DEM, and European Space Agency (ESA) Copernicus products, including Harmonized Sentinel-2 and Global 30m Height Above Nearest Drainage (HAND). For the creation of manually labeled training points, IDWR staff accessed the following datasets: NDVI derived from Landsat 8/9, Sentinel-2 CIR imagery, US Department of Agriculture National Agricultural Statistics Service (USDA NASS) Cropland Data Layer, Active Water Rights Place of Use data from IDWR, and USDA’s National Agriculture Imagery Program (NAIP) imagery. All datasets were available for the current year of interest (2023). 
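As an illustration of the NDVI signal mentioned above, the following minimal Python sketch computes NDVI from near-infrared and red reflectance using the standard definition (NIR - Red) / (NIR + Red). The function name and the sample reflectance values are hypothetical and are not taken from IDWR's processing code.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel, or None when both bands are zero.

    Standard definition: (NIR - Red) / (NIR + Red).
    """
    denom = nir + red
    if denom == 0:
        return None  # no usable signal; avoid division by zero
    return (nir - red) / denom


# Hypothetical reflectance values, for illustration only.
irrigated_pixel = ndvi(nir=0.50, red=0.10)   # dense green vegetation
dryland_pixel = ndvi(nir=0.25, red=0.20)     # sparse or dry cover
print(irrigated_pixel > dryland_pixel)
```

Values near 1 indicate dense green vegetation, while values near 0 indicate bare soil or dry cover, which is why summer NDVI helps separate irrigated crops from dryland.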
The published Mountain Home Irrigated Lands 2023 land classification raster was generated after four model runs; at each iteration, IDWR staff added or removed training points to improve results. Early model runs showed poor results in riparian areas near the Snake River, concentrated animal feeding operations (CAFOs), and non-irrigated areas at higher elevations. These issues were resolved after several model runs in combination with post-processing masks. Masks used include Fish and Wildlife Service’s National Wetlands Inventory (FWS NWI) data. These data were amended to exclude polygons overlying irrigated areas, and to expand riparian area in specific locations. A manually created mask was primarily used to fill in areas around the Snake River that the model did not uniformly designate as irrigated. Ground-truthing and a thorough review of IDWR’s water rights database provided further insight for class assignments near the town of Mayfield. Lastly, the Majority Filter tool in ArcGIS was applied using a kernel of 8 nearest neighbors to smooth out “speckling” within irrigated fields. The masking datasets and the final iteration of training points are available on request.
Information regarding Sentinel and Landsat imagery: All satellite data products used within the Random Forest model were accessed via the Google Earth Engine API. To find more information on the Sentinel data used, query the Earth Engine Data Catalog (https://developers.google.com/earth-engine/datasets) using “COPERNICUS/S2_SR_HARMONIZED”. Information on the Landsat datasets used can be found by querying “LANDSAT/LC08/C02/T1_L2” (for Landsat 8) and “LANDSAT/LC09/C02/T1_L2” (for Landsat 9). Each satellite product has several bands of available data. For our purposes, shortwave infrared 2 (SWIR2), blue, Normalized Difference Vegetation Index (NDVI), and near infrared (NIR) were extracted from both Sentinel and Landsat images.
These images were later interpolated to the following dates: 2023-04-15, 2023-05-15, 2023-06-14, 2023-07-14, 2023-08-13, 2023-09-12. Interpolated values were taken from up to 45 days before and after each interpolated date. April-June interpolated Landsat images, as well as the April interpolated Sentinel image, were not used in the model given the extent of cloud cover overlying irrigated area. For more information on the pre-processing of satellite data used in the Random Forest model, please reach out to IDWR at gisinfo@idwr.idaho.gov.
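The interpolation dates listed above are spaced 30 days apart, and each draws on observations within 45 days on either side. The Python sketch below shows one way such a step could work. The description does not state the interpolation method, so linear interpolation between the nearest observation before and after the target date is an assumption, as is the one-sided fallback; all function names and sample values are hypothetical.

```python
from datetime import date, timedelta

WINDOW_DAYS = 45  # from the description: up to 45 days before and after


def target_dates(start, count, step_days=30):
    """Generate evenly spaced interpolation dates."""
    return [start + timedelta(days=step_days * i) for i in range(count)]


def interpolate_at(target, observations, window_days=WINDOW_DAYS):
    """Estimate a band value at `target` from (date, value) observations.

    Assumption: linear interpolation between the nearest observation on
    each side of the target, restricted to the +/- window. If only one
    side has data, that nearest value is used; if neither, return None.
    """
    before = [(d, v) for d, v in observations
              if 0 <= (target - d).days <= window_days]
    after = [(d, v) for d, v in observations
             if 0 <= (d - target).days <= window_days]
    if not before and not after:
        return None
    if not before:
        return min(after, key=lambda p: p[0])[1]
    if not after:
        return max(before, key=lambda p: p[0])[1]
    d0, v0 = max(before, key=lambda p: p[0])  # latest observation at or before target
    d1, v1 = min(after, key=lambda p: p[0])   # earliest observation at or after target
    if d0 == d1:
        return v0  # an observation falls exactly on the target date
    weight = (target - d0).days / (d1 - d0).days
    return v0 + weight * (v1 - v0)


dates_2023 = target_dates(date(2023, 4, 15), 6)
# Hypothetical cloud-free NDVI observations for a single pixel.
obs = [(date(2023, 7, 4), 0.6), (date(2023, 7, 24), 0.8)]
print(dates_2023[3], interpolate_at(dates_2023[3], obs))
```

Returning None when no observation falls inside the window leaves the gap explicit, so a mostly cloud-covered period shows up as missing data instead of a value carried over from a distant date.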
This raster file represents land within the Mountain Home Study Area classified as either “irrigated” with a cell value of 1 or “non-irrigated” with a cell value of 0 at a 30-meter spatial resolution. These classifications were determined at the pixel level by a Random Forest supervised machine learning methodology. Random Forest models are often used to classify large datasets accurately and efficiently by assigning each pixel to one of a pre-determined set of labels or groups. The model works by using decision trees that split the data based on characteristics that make the resulting groups as different from each other as possible. The model “learns” the characteristics that correlate to each label based on manually classified data points, also known as training data.
A variety of data can be supplied as input to the Random Forest model for it to use in making its classification determinations. Irrigation produces distinct signals in observational data that can be identified by machine learning algorithms. Additionally, datasets that provide the model with information on landscape characteristics that often influence whether irrigation is present are also useful. This dataset was classified by the Random Forest model using Collection 1 Tier 1 top-of-atmosphere reflectance data from Landsat 5 and Landsat 7, United States Geological Survey National Elevation Dataset (USGS NED) data, and Height Above Nearest Drainage (HAND) data. Landsat 5, Landsat 7, and HAND data are at a 30-meter spatial resolution, and the USGS NED data are at a 10-meter spatial resolution.
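To make the idea of a decision-tree split concrete, the Python sketch below searches one input feature for the threshold that separates two classes most cleanly. The description does not say which split criterion was used; Gini impurity is assumed here because it is a common choice for classification trees. The function names and the sample NDVI values are hypothetical.

```python
def gini(labels):
    """Gini impurity for binary labels (0 or 1): 2 * p * (1 - p)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the cleanest split.

    Assumption: candidate thresholds are midpoints between consecutive
    distinct feature values, scored by size-weighted Gini impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical summer NDVI values for four training points
# (1 = irrigated, 0 = non-irrigated).
threshold, impurity = best_threshold([0.1, 0.2, 0.7, 0.8], [0, 0, 1, 1])
print(threshold, impurity)
```

A Random Forest repeats this kind of search across many trees, each built from a random sample of the training points and input features, and assigns each pixel the label most trees agree on.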
The Cropland Data Layer (CDL) from the United States Department of Agriculture National Agricultural Statistics Service (USDA NASS), Active Water Rights Place of Use (POU) data from IDWR, and National Agriculture Imagery Program (NAIP) data from the USDA Farm Service Agency (FSA) were also used in determining irrigation status for the manually classified training data points but were not used for the machine learning model predictions. The final model results were manually reviewed prior to release; however, no extensive ground-truthing process was implemented. A wetlands mask was applied using Fish and Wildlife Service’s National Wetlands Inventory (FWS NWI) data for areas without overlapping irrigation place of use areas or locations manually determined to have potential irrigation. “Speckling”, or small areas of incorrectly classified pixels, was reduced by a majority filter smoothing technique using a kernel of 8 nearest neighbors. A limited number of manual corrections were made to correct for missing data due to Landsat 7 ETM+ Scan Line Corrector gaps (https://www.usgs.gov/faqs/what-landsat-7-etm-slc-data). These data have also been snapped to the same grid used with IDWR’s Mapping EvapoTranspiration using high Resolution and Internalized Calibration (METRIC) evapotranspiration data.
Information regarding Landsat imagery: Landsat 5 and Landsat 7 Collection 1 Tier 1 top-of-atmosphere reflectance images that overlapped the area of interest were used in this analysis. Images were filtered to exclude those that were more than 70% cloud covered, resulting in 35 Landsat 5 and 35 Landsat 7 images for the analysis period of 2010-03-01 to 2010-10-27. Normalized Difference Vegetation Index (NDVI), Band 1 (Blue), and Band 7 (SWIR2) values were interpolated for the following dates: 2010-04-15, 2010-05-15, 2010-06-14, 2010-07-14, 2010-08-13, and 2010-09-12 using image values from up to 45 days before and after each interpolation date.
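The majority filter smoothing mentioned above can be illustrated with a simplified Python sketch on a binary grid. This is not the GIS tool's implementation: it assumes a cell is reassigned only when at least 5 of its 8 neighbors share the same value, it ignores any contiguity requirement, and it leaves border cells unchanged. The function name and the sample grid are hypothetical.

```python
def majority_filter(grid, min_votes=5):
    """Smooth a binary raster (lists of 0/1) with an 8-neighbor majority rule.

    Assumption: an interior cell takes the value held by at least
    `min_votes` of its 8 neighbors; ties and border cells stay unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so votes use original values only
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            ones = sum(
                grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            if ones >= min_votes:
                out[r][c] = 1
            elif 8 - ones >= min_votes:
                out[r][c] = 0
    return out


# Hypothetical 3x3 patch: one "speckle" pixel inside an irrigated field.
field = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(majority_filter(field))
```

Reading neighbor values from the original grid while writing to a copy keeps the result independent of the order in which cells are visited.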