100+ datasets found

ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture...
researchdata.tuwien.ac.at
zip
Updated Jun 6, 2025
Cite
Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo (2025). ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture from merged multi-satellite observations [Dataset]. http://doi.org/10.48436/3fcxr-cde10
Available download formats: zip
Unique identifier
https://doi.org/10.48436/3fcxr-cde10
Dataset updated
Jun 6, 2025
Dataset provided by
TU Wien
Authors
Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/

This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.

Dataset paper (public preprint)

A description of this dataset, including the methodology and validation results, is available at:

Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.

Abstract

ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
Since the requirement of a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such gap-filling methods is that they rely only on the original observational record, without the need for ancillary variables or model-based information. Due to the intrinsic challenge, no global, long-term univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments to fill satellite-like gaps introduced to GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gapfilling performance.

Summary

Gap-filled global estimates of volumetric surface soil moisture from 1991-2023 at 0.25° sampling

Fields of application (partial): climate variability and change, land-atmosphere interactions, global biogeochemical cycles and ecology, hydrological and land surface modelling, drought applications, and meteorology

Method: Modified version of DCT-PLS (Garcia, 2010) interpolation/smoothing algorithm, linear interpolation over periods of frozen soils. Uncertainty estimates are provided for all data points.

More information: See Preimesberger et al. (2025) and the ESA CCI SM Algorithm Theoretical Baseline Document [Chapter 7.2.9] (Dorigo et al., 2023), https://doi.org/10.5281/zenodo.8320869

Programmatic Download

You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete data set to the local directory ~/Downloads on Linux or macOS systems.

#!/bin/bash

# Set download directory and create it if it does not exist yet
DOWNLOAD_DIR=~/Downloads
mkdir -p "$DOWNLOAD_DIR"

base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

# Loop through years 1991 to 2023; download, extract, then delete each archive
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
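
For users who prefer to check the download list before fetching anything, the same per-year URLs can be built in Python. This is only a sketch: it reuses the base URL and year range from the script above, the function name `yearly_zip_urls` is made up for illustration, and no network request is made.

```python
# Base URL copied from the shell script above.
BASE_URL = "https://researchdata.tuwien.at/records/3fcxr-cde10/files"


def yearly_zip_urls(first_year=1991, last_year=2023, base_url=BASE_URL):
    """Return one archive URL per year, mirroring the loop in the shell script."""
    return [f"{base_url}/{year}.zip" for year in range(first_year, last_year + 1)]


urls = yearly_zip_urls()
print(len(urls), "archives, first:", urls[0])
```

The resulting list can be passed to any download tool; restricting `first_year` and `last_year` limits the download to a sub-period.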

Data details

The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped into one subdirectory per year (YYYY). Each subdirectory contains one netCDF image file per day (DD) and month (MM) on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). The file name has the following convention:

ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
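
As an illustration of this convention, the following Python sketch builds the expected file path for a given date. The template string is taken from the convention above; the helper name `daily_file_path` and the assumption that the extracted archives sit in YYYY subdirectories directly below a chosen root directory are illustrative and should be checked against the actual download.

```python
from datetime import date
from pathlib import PurePosixPath

# File name template from the convention above; {stamp} stands for YYYYMMDD.
FILE_TEMPLATE = (
    "ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-{stamp}000000-fv09.1r1.nc"
)


def daily_file_path(day, root="."):
    """Return <root>/<YYYY>/<file name> for one day (the layout is an assumption)."""
    stamp = day.strftime("%Y%m%d")
    return PurePosixPath(root) / f"{day.year:04d}" / FILE_TEMPLATE.format(stamp=stamp)


print(daily_file_path(date(2020, 7, 15)))
```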

Data Variables

Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:

sm: (float) The Soil Moisture variable reflects estimates of daily average volumetric soil moisture content (m3/m3) in the soil surface layer (~0-5 cm) over a whole grid cell (0.25 degree).

sm_uncertainty: (float) The Soil Moisture Uncertainty variable reflects the uncertainty (random error) of the original satellite observations and of the predictions used to fill observation data gaps.

sm_anomaly: Soil moisture anomalies (reference period 1991-2020) derived from the gap-filled values (`sm`)

sm_smoothed: Contains DCT-PLS predictions used to fill data gaps in the original soil moisture field. These values are also provided for cases where an observation was initially available (compare `gapmask`). In this case, they provide a smoothed version of the original data.

gapmask: (0 | 1) Indicates grid cells where a satellite observation is available (1), and where the interpolated (smoothed) values are used instead (0) in the 'sm' field.

frozenmask: (0 | 1) Indicates grid cells where ERA5 soil temperature is <0 °C. In this case, a linear interpolation over time is applied.

Additional information for each variable is given in the netCDF attributes.
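
The relationship between `sm` and `gapmask` described above can be sketched in plain Python. The values below are made up for illustration, and the helper names are hypothetical; in practice the arrays would be read from a netCDF file with one of the packages listed under "Software to open netCDF files".

```python
import math


def observed_only(sm, gapmask):
    """Keep sm where gapmask == 1 (satellite observation); NaN where gap-filled."""
    return [value if flag == 1 else math.nan for value, flag in zip(sm, gapmask)]


def filled_fraction(gapmask):
    """Fraction of points where interpolated values are used (gapmask == 0)."""
    return sum(1 for flag in gapmask if flag == 0) / len(gapmask)


# Made-up example values (m3/m3) for four grid cells.
sm = [0.21, 0.23, 0.22, 0.25]
gapmask = [1, 0, 0, 1]

print(observed_only(sm, gapmask))
print(filled_fraction(gapmask))
```

Masking with `gapmask` in this way recovers an observation-only field from the gap-filled `sm` variable, which can be useful when comparing against the original, non-gap-filled record.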

Version Changelog

Changes in v09.1r1 (previous version was v09.1):

This version uses a novel uncertainty estimation scheme as described in Preimesberger et al. (2025).

Software to open netCDF files

These data can be read by any software that supports netCDF files following the Climate and Forecast (CF) metadata conventions, such as:

Xarray (Python): https://github.com/pydata/xarray

netCDF4 (Python): https://unidata.github.io/netcdf4-python/

esa_cci_sm (Python): https://github.com/TUW-GEO/esa_cci_sm

Similar tools exist for other programming languages (Matlab, R, etc.).

Other software packages and GIS tools can also open netCDF files, e.g. CDO, NCO, QGIS, ArcGIS.

You can also use the GUI software Panoply to view the contents of each file.

References

Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.

Dorigo, W., Preimesberger, W., Stradiotti, P., Kidd, R., van der Schalie, R., van der Vliet, M., Rodriguez-Fernandez, N., Madelon, R., & Baghdadi, N. (2023). ESA Climate Change Initiative Plus - Soil Moisture Algorithm Theoretical Baseline Document (ATBD) Supporting Product Version 08.1 (version 1.1). Zenodo. https://doi.org/10.5281/zenodo.8320869

Garcia, D., 2010. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 54(4), pp.1167-1178. Available at: https://doi.org/10.1016/j.csda.2009.09.020

Rodell, M., Houser, P. R., Jambor, U., Gottschalck, J., Mitchell, K., Meng, C.-J., Arsenault, K., Cosgrove, B., Radakovich, J., Bosilovich, M., Entin, J. K., Walker, J. P., Lohmann, D., and Toll, D.: The Global Land Data Assimilation System, Bulletin of the American Meteorological Society, 85, 381 – 394, https://doi.org/10.1175/BAMS-85-3-381, 2004.

Related Records

The following records are all part of the Soil Moisture Climate Data Records from satellites community

1. ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77
National Weather Service Coded Surface Bulletins, 2003- (netCDF format)
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Biard, James C (2020). National Weather Service Coded Surface Bulletins, 2003- (netCDF format) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2651360
Dataset updated
Jan 24, 2020
Dataset authored and provided by
Biard, James C
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset contains the Coded Surface Bulletin (CSB) dataset reformatted as netCDF-4 files. The CSB dataset is a collection of ASCII files containing the locations of weather fronts, troughs, high pressure centers, and low pressure centers as determined by National Weather Service meteorologists at the Weather Prediction Center (WPC) during the surface analysis they do every three hours. Each bulletin is broadcast on the NOAAPort service, and has been available since 2003.

Each netCDF file contains one year of CSB fronts data represented as spatial map data grids. The times and geospatial locations for the data grid cells are also included. The front data is stored in a netCDF variable with dimensions (time, front type, y, x), where x and y are geospatial dimensions. There is a 2D geospatial data grid for each time step for each of the 4 front types—cold, warm, stationary, and occluded. The front polylines from the CSB dataset are rasterized into the appropriate data grids. Each file conforms to the Climate and Forecast Metadata Conventions.

There are two large groupings of the CSB netCDF files. One group uses a data grid based on the North American Regional Reanalysis (NARR) grid, which is a Lambert Conformal Conic projection coordinate reference system (CRS) centered over North America. The NARR grid is quite close to the spatial range of data displayed on the WPC workstations used to perform surface analysis and identify front locations. The native NARR grid has grid cells which are 32 km on each side. Our grid covers the same extents with cells that are 96 km on each side.

The other group uses a 1° latitude/longitude data grid centered over North America with extents 171W – 31W / 10N – 77 N. The files in this group are identified by the name MERRA2, because they were used with data from the NASA MERRA-2 dataset, which uses a latitude/longitude data grid.

There are a number of files within each group. The files all follow the naming convention codsus_[masked]_.nc, where [masked] indicates that the presence of the word masked is optional and is either merra2-1deg or narr-96km. The element is either the word mask or the sequence wide_, where is the front width and is the year for the data stored in the file.

The codsus_mask.nc file contains a single data grid that delineates the envelope of the geospatial region where there are, on average, 40 or more front crossings of any type per year. The WPC meteorologists don't attempt to provide equal levels of attention to every grid cell displayed on their workstations. The files of the form codsus_masked_wide_.nc have all had the mask described above applied to exclude parts of fronts that extend past the envelope. The files of the form codsus_wide_.nc have no masking applied.

The wide portion of the file names takes two forms: 1wide and 3wide. The fronts in the 1wide files were rasterized by drawing the front polylines with a width of one grid cell. The fronts in the 3wide files were rasterized by drawing the front polylines with a width of 3 grid cells.

Within each grid group, there are five subsets of files:

codsus_masked_1wide_.nc

codsus_masked_3wide_.nc

codsus_1wide_.nc

codsus_3wide_.nc

codsus_mask.nc

The primary source for this dataset is an internal archive maintained by personnel at the WPC and provided to the author. It is also provided at DOI 10.5281/zenodo.2642801. Some bulletins missing from the WPC archive were filled in with data acquired from the Iowa Environmental Mesonet.
TIGER/Line Shapefile, 2022, County, Robeson County, NC, Feature Names...
catalog.data.gov
datasets.ai
Updated Jan 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Point of Contact) (2024). TIGER/Line Shapefile, 2022, County, Robeson County, NC, Feature Names Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2022-county-robeson-county-nc-feature-names-relationship-file
Dataset updated
Jan 28, 2024
Dataset provided by
United States Census Bureau: http://census.gov/
Area covered
Robeson County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Feature Names Relationship File (FEATNAMES.dbf) contains a record for each feature name and any attributes associated with it. Each feature name can be linked to the corresponding edges that make up that feature in the All Lines Shapefile (EDGES.shp), where applicable to the corresponding address range or ranges in the Address Ranges Relationship File (ADDR.dbf), or to both files. Although this file includes feature names for all linear features, not just road features, the primary purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute, which can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature is identified by the linear feature identifier (LINEARID) attribute, which can be used to relate the address range back to the name attributes of the feature in the Feature Names Relationship File or to the feature record in the Primary Roads, Primary and Secondary Roads, or All Roads Shapefiles. The edge to which a feature name applies can be determined by linking the feature name record to the All Lines Shapefile (EDGES.shp) using the permanent edge identifier (TLID) attribute. 
The address range identifier(s) (ARID) for a specific linear feature can be found by using the linear feature identifier (LINEARID) from the Feature Names Relationship File (FEATNAMES.dbf) through the Address Range / Feature Name Relationship File (ADDRFN.dbf).
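
The linkage described above can be sketched with in-memory records. All rows below are hypothetical stand-ins for FEATNAMES.dbf and ADDRFN.dbf records; only the TLID, LINEARID and ARID attributes come from the description, while the FULLNAME column and the helper names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for FEATNAMES.dbf records.
featnames = [
    {"TLID": 1001, "LINEARID": "L1", "FULLNAME": "Main St"},
    {"TLID": 1001, "LINEARID": "L2", "FULLNAME": "State Hwy 7"},
    {"TLID": 1002, "LINEARID": "L1", "FULLNAME": "Main St"},
]

# Hypothetical rows standing in for ADDRFN.dbf records.
addrfn = [
    {"ARID": "A1", "LINEARID": "L1"},
    {"ARID": "A1", "LINEARID": "L2"},
    {"ARID": "A2", "LINEARID": "L1"},
]


def names_by_edge(rows):
    """Group feature names by permanent edge identifier (TLID)."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["TLID"]].add(row["FULLNAME"])
    return dict(grouped)


def arids_for_linearid(rows, linearid):
    """Return the address range identifiers linked to one linear feature."""
    return sorted({row["ARID"] for row in rows if row["LINEARID"] == linearid})


print(names_by_edge(featnames))
print(arids_for_linearid(addrfn, "L1"))
```

The example shows the two many-to-many relationships: one edge can carry several feature names, and one address range can be linked to several linear features.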
GRACE MONTHLY LAND WATER MASS GRIDS NETCDF RELEASE 5.0
podaac.jpl.nasa.gov
data.globalchange.gov
html
Updated Aug 23, 2024
Cite
PO.DAAC (2024). GRACE MONTHLY LAND WATER MASS GRIDS NETCDF RELEASE 5.0 [Dataset]. http://doi.org/10.5067/TELND-NC005
Available download formats: html
Unique identifier
https://doi.org/10.5067/TELND-NC005
Dataset updated
Aug 23, 2024
Dataset provided by
PO.DAAC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Apr 1, 2002 - Present
Variables measured
GRAVITY
Description
The twin satellites of the Gravity Recovery and Climate Experiment (GRACE), launched in March of 2002, are making detailed monthly measurements of Earth's gravity field changes. These observations can detect regional mass changes of Earth's water reservoirs over land, ice and oceans. GRACE measures gravity variations by relating them to the distance variations between the two satellites, which fly in the same orbit, separated by about 240 km at an altitude of ~450 km. The monthly land mass grids contain terrestrial water storage anomalies (in aquifers, river basins, etc.) from GRACE time-variable gravity data relative to a time-mean. The storage anomalies are given in 'equivalent water thickness' (in NetCDF format). The time coverage of the monthly grids is determined by GRACE months. For the list of GRACE month dates visit http://grace.jpl.nasa.gov/data/grace-months/ . For more information please visit http://grace.jpl.nasa.gov/data/get-data/monthly-mass-grids-land/ .
NOAA Global Forecast System (GFS) netCDF Formatted Data
registry.opendata.aws
Updated Mar 5, 2025
Cite
NOAA (2025). NOAA Global Forecast System (GFS) netCDF Formatted Data [Dataset]. https://registry.opendata.aws/noaa-oar-arl-nacc-pds/
Dataset updated
Mar 5, 2025
Dataset provided by
National Oceanic and Atmospheric Administration: http://www.noaa.gov/
Description
The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). Dozens of atmospheric and land-soil variables are available through this dataset, from temperatures, winds, and precipitation to soil moisture and atmospheric ozone concentration. The GFS data files stored here can be immediately used for OAR/ARL’s NOAA-EPA Atmosphere-Chemistry Coupler Cloud (NACC-Cloud) tool, and are in Network Common Data Form (netCDF), which is a very common format used across the scientific community. These particular GFS files contain a comprehensive number of global atmosphere/land variables at a relatively high spatiotemporal resolution (approximately 13x13 km horizontal, vertical resolution of 127 levels, and hourly). They are not only necessary for the NACC-Cloud tool to adequately drive community air quality applications (e.g., U.S. EPA’s Community Multiscale Air Quality model; https://www.epa.gov/cmaq), but can also be very useful for a myriad of other applications in the Earth system modeling communities (e.g., atmosphere, hydrosphere, pedosphere, etc.). While many other data file and record formats are indeed available for Earth system and climate research (e.g., GRIB, HDF, GeoTIFF), the netCDF files here are advantageous to the larger community because of the comprehensive, high spatiotemporal information they contain, and because they are more scalable, appendable, shareable, self-describing, and community-friendly (i.e., many tools available to the community of users). Out of the four operational GFS forecast cycles per day (at 00Z, 06Z, 12Z and 18Z), this particular netCDF dataset is updated daily (/inputs/yyyymmdd/) for the 12Z cycle and includes 24-hr output for both 2D (gfs.t12z.sfcf$0hh.nc) and 3D variables (gfs.t12z.atmf$0hh.nc).

Also available are netCDF formatted Global Land Surface Datasets (GLSDs) developed by Hung et al. (2024). The GLSDs are based on numerous satellite products, and have been gridded to match the GFS spatial resolution (~13x13 km). These GLSDs contain vegetation canopy data (e.g., land surface type, vegetation clumping index, leaf area index, vegetative canopy height, and green vegetation fraction) that are supplemental to and can be combined with the GFS meteorological netCDF data for various applications, including NOAA-ARL's canopy-app. The canopy data variables are climatological, based on satellite data from the year 2020, combined with GFS meteorology for the year 2022, and are created at a daily temporal resolution (/inputs/geo-files/gfs.canopy.t12z.2022mmdd.sfcf000.global.nc)
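
The file name patterns quoted above can be expanded with a short Python sketch. Two assumptions are made here and are not confirmed by the description: `$0hh` is read as a zero-padded three-digit forecast hour (consistent with the `sfcf000` canopy file name), and the "24-hr output" is taken to span forecast hours 0 through 24. The function names are hypothetical.

```python
from datetime import date


def gfs_forecast_keys(run_day, hours=range(0, 25)):
    """List 2D (sfcf) and 3D (atmf) file keys for one 12Z cycle.

    The hour range 0..24 and the three-digit padding are assumptions.
    """
    prefix = f"inputs/{run_day:%Y%m%d}"
    keys = []
    for hour in hours:
        keys.append(f"{prefix}/gfs.t12z.sfcf{hour:03d}.nc")
        keys.append(f"{prefix}/gfs.t12z.atmf{hour:03d}.nc")
    return keys


def canopy_key(month, day):
    """Key of the daily climatological canopy file (GFS meteorology year 2022)."""
    return f"inputs/geo-files/gfs.canopy.t12z.2022{month:02d}{day:02d}.sfcf000.global.nc"


print(gfs_forecast_keys(date(2025, 3, 5))[:2])
print(canopy_key(7, 4))
```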
CMAQ Grid Mask Files for 12km CONUS - US States and NOAA Climate Regions
dataverse-staging.rdmc.unc.edu
datasearch.gesis.org
Updated Dec 12, 2019
Cite
UNC Dataverse (2019). CMAQ Grid Mask Files for 12km CONUS - US States and NOAA Climate Regions [Dataset]. http://doi.org/10.15139/S3/XDYYB9
Unique identifier
https://doi.org/10.15139/S3/XDYYB9
Dataset updated
Dec 12, 2019
Dataset provided by
UNC Dataverse
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
United States
Description
Data Summary: US states grid mask file and NOAA climate regions grid mask file, both compatible with the 12US1 modeling grid domain. Note: The datasets are on a Google Drive. The metadata associated with this DOI contain the link to the Google Drive folder and instructions for downloading the data.

These files can be used with CMAQ-ISAMv5.3 to track state- or region-specific emissions. See Chapter 11 and Appendix B.4 in the CMAQ User's Guide for further information on how to use the ISAM control file with GRIDMASK files. The files can also be used for state- or region-specific scaling of emissions using the CMAQv5.3 DESID module. See the DESID Tutorial and Appendix B.4 in the CMAQ User's Guide for further information on how to use the Emission Control File to scale emissions in predetermined geographical areas.

File Location and Download Instructions: Link to GRIDMASK files. Link to README text file with information on how these files were created.

File Format: The grid masks are stored as netCDF formatted files using I/O API data structures (https://www.cmascenter.org/ioapi/). Information on the model projection and grid structure is contained in the header information of the netCDF file. The output files can be opened and manipulated using I/O API utilities (e.g. M3XTRACT, M3WNDW) or other software programs that can read and write netCDF formatted files (e.g. Fortran, R, Python).

File descriptions: These GRIDMASK files can be used with the 12US1 modeling grid domain (grid origin x = -2556000 m, y = -1728000 m; N columns = 459, N rows = 299).

GRIDMASK_STATES_12US1.nc - This file contains 49 variables for the 48 states in the conterminous U.S. plus DC. Each state variable (e.g., AL, AZ, AR, etc.) is a 2D array (299 x 459) providing the fractional area of each grid cell that falls within that state.

GRIDMASK_CLIMATE_REGIONS_12US1.nc - This file contains 9 variables for 9 NOAA climate regions based on the Karl and Koss (1984) definition of climate regions. Each climate region variable (e.g., CLIMATE_REGION_1, CLIMATE_REGION_2, etc.) is a 2D array (299 x 459) providing the fractional area of each grid cell that falls within that climate region.

NOAA Climate regions:
CLIMATE_REGION_1: Northwest (OR, WA, ID)
CLIMATE_REGION_2: West (CA, NV)
CLIMATE_REGION_3: West North Central (MT, WY, ND, SD, NE)
CLIMATE_REGION_4: Southwest (UT, AZ, NM, CO)
CLIMATE_REGION_5: South (KS, OK, TX, LA, AR, MS)
CLIMATE_REGION_6: Central (MO, IL, IN, KY, TN, OH, WV)
CLIMATE_REGION_7: East North Central (MN, IA, WI, MI)
CLIMATE_REGION_8: Northeast (MD, DE, NJ, PA, NY, CT, RI, MA, VT, NH, ME) + Washington, D.C.*
CLIMATE_REGION_9: Southeast (VA, NC, SC, GA, AL, FL)

*Note that Washington, D.C. is not included in any of the climate regions on the website but was included with the “Northeast” region for the generation of this GRIDMASK file.
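
As a worked example of the grid definition and the fractional-area masks, the Python sketch below computes the extent of the 12US1 domain and a mask-weighted total. It assumes a 12 km cell size (taken from the "12km" in the dataset title) and that the stated grid origin is the lower-left corner of the domain; the 2 x 2 mask and emission values are made up, and the function names are hypothetical.

```python
# Grid definition quoted in the description (projected coordinates, metres).
X_ORIG_M = -2_556_000
Y_ORIG_M = -1_728_000
N_COLS = 459
N_ROWS = 299
CELL_M = 12_000  # assumption: 12 km cells, as suggested by the dataset title


def grid_extent(x0=X_ORIG_M, y0=Y_ORIG_M, ncols=N_COLS, nrows=N_ROWS, cell=CELL_M):
    """Return (x_min, y_min, x_max, y_max), treating the origin as the lower-left corner."""
    return (x0, y0, x0 + ncols * cell, y0 + nrows * cell)


def weighted_total(mask, values):
    """Sum values weighted by the fractional area of each cell inside a region."""
    return sum(
        fraction * value
        for mask_row, value_row in zip(mask, values)
        for fraction, value in zip(mask_row, value_row)
    )


print(grid_extent())

# Made-up 2 x 2 example: one cell fully inside the region, one half inside.
mask = [[1.0, 0.5], [0.0, 0.0]]
emissions = [[10.0, 20.0], [30.0, 40.0]]
print(weighted_total(mask, emissions))
```

Under these assumptions the domain spans 5508 km in x and 3588 km in y. The weighted sum illustrates why the masks hold fractional areas rather than 0/1 flags: cells that straddle a boundary contribute in proportion to the area that falls inside the state or region.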
TIGER/Line Shapefile, 2022, County, Wake County, NC, Feature Names...
datasets.ai
s.cnmilf.com
55, 57
Updated Jan 27, 2024
Cite
U.S. Census Bureau, Department of Commerce (2024). TIGER/Line Shapefile, 2022, County, Wake County, NC, Feature Names Relationship File [Dataset]. https://datasets.ai/datasets/tiger-line-shapefile-2022-county-wake-county-nc-feature-names-relationship-file
Available download formats: 55, 57
Dataset updated
Jan 27, 2024
Dataset provided by
United States Census Bureau: http://census.gov/
Authors
U.S. Census Bureau, Department of Commerce
Area covered
North Carolina, Wake County
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Feature Names Relationship File (FEATNAMES.dbf) contains a record for each feature name and any attributes associated with it. Each feature name can be linked to the corresponding edges that make up that feature in the All Lines Shapefile (EDGES.shp), where applicable to the corresponding address range or ranges in the Address Ranges Relationship File (ADDR.dbf), or to both files. Although this file includes feature names for all linear features, not just road features, the primary purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute, which can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature is identified by the linear feature identifier (LINEARID) attribute, which can be used to relate the address range back to the name attributes of the feature in the Feature Names Relationship File or to the feature record in the Primary Roads, Primary and Secondary Roads, or All Roads Shapefiles. The edge to which a feature name applies can be determined by linking the feature name record to the All Lines Shapefile (EDGES.shp) using the permanent edge identifier (TLID) attribute. 
The address range identifier(s) (ARID) for a specific linear feature can be found by using the linear feature identifier (LINEARID) from the Feature Names Relationship File (FEATNAMES.dbf) through the Address Range / Feature Name Relationship File (ADDRFN.dbf).
Outputs from a Regional Ocean Modeling System (ROMS) data assimilative...
seanoe.org
nc
Updated Nov 30, 2021
Cite
John Wilkin; Julia Levin (2021). Outputs from a Regional Ocean Modeling System (ROMS) data assimilative reanalysis (version DopAnV2R3-ini2007) of ocean circulation in the Mid-Atlantic Bight and Gulf of Maine for 2007-2020 [Dataset]. http://doi.org/10.17882/86286
Available download formats: nc
Unique identifier
https://doi.org/10.17882/86286
Dataset updated
Nov 30, 2021
Dataset provided by
SEANOE
Authors
John Wilkin; Julia Levin
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Time period covered
Jan 1, 2007 - Aug 30, 2021
Description
A hindcast reanalysis of ocean circulation in the Mid-Atlantic Bight and Gulf of Maine has been computed using the Regional Ocean Modeling System (ROMS) with 4-dimensional variational (4D-Var) assimilation of data from satellites, land-based ocean surface current measuring radar, and all available in situ observations from the MARACOOS (maracoos.org) and NERACOOS (neracoos.org) regional associations of the U.S. Integrated Ocean Observing System (IOOS). This reanalysis is version DopAnV2R3-ini2007 (version 2, release 3, initialized January 2007).

The analysis covers the period 2-Jan-2007 to 30-Aug-2021 on a 7-km horizontal grid with 40 vertical terrain-following s-coordinate levels. Ocean state variables computed are sea level, velocity, temperature, and salinity. Air-sea fluxes of heat and momentum, and surface and bottom stresses, are included.

Results are provided on the ROMS model native 3-dimensional grid as (i) 1-hourly interval snapshots (ROMS “history” files), (ii) 1-day averages, (iii) monthly averages, (iv) yearly averages, and (v) ensemble monthly averages (i.e., the mean of all days in the same month from all years). The output files are in netCDF format and data and metadata follow CF-1.4 conventions for the description of coordinates and variables.

The files uploaded here are examples of one time record from each of these 5 collections. Outputs for the full reanalysis, which comprises 6.8 terabytes of data, are made available for download via a THREDDS (Thematic Real-time Environmental Distributed Data Services) web service to facilitate user geospatial or temporal sub-setting.

The THREDDS catalog URLs and example filenames available here, for the respective collections, are:

1-hourly history snapshots, 2007-01-02 01:00 through 2021-08-31 00:00: https://tds.marine.rutgers.edu/thredds/roms/doppio/catalog.html?dataset=dopanv2r3-ini2007_da_history
Example file uploaded here is his_dopanv2r3_20140516t0100.nc for 2014-05-06 01:00.

24-hour averages, 2007-01-02 12:00 through 2021-08-30 12:00: https://tds.marine.rutgers.edu/thredds/roms/doppio/catalog.html?dataset=dopanv2r3-ini2007_da_average
Example file uploaded here is avg_dopanv2r3_20140516t1200.nc for 2014-05-06.

Monthly averages, 2007-01-17 through 2020-12-16: https://tds.marine.rutgers.edu/thredds/roms/doppio/catalog.html?dataset=dopanv2r3-ini2007_da_monthly_averages
Example file uploaded here is mon_dopanv2r3_201405.nc for 2014-05.

Yearly averages, 2007 through 2020: https://tds.marine.rutgers.edu/thredds/roms/doppio/catalog.html?dataset=dopanv2r3-ini2007_da_yearly_averages
Example file uploaded here is year_dopanv2r3_2014.nc for 2014.

Monthly ensemble averages: https://tds.marine.rutgers.edu/thredds/roms/doppio/catalog.html?dataset=dopanv2r3-ini2007_da_monthly_ensemble_means
Example file uploaded here is ensmon_dopanv2r3_05.nc for May.

The underlying ocean circulation model configuration is described by López et al. (2020). The observations that are assimilated and the error hypotheses and other aspects of the 4D-Var assimilation implementation are described by Levin et al. (2020; 2021).

López, A. G., J. L. Wilkin and J. C. Levin (2020), Doppio – a ROMS (v3.6)-based circulation model for the Mid-Atlantic Bight and Gulf of Maine: configuration and comparison to integrated coastal observing network observations, Geosci. Model Dev., 13, 3709–3729, doi: 10.5194/gmd-13-3709-2020

Levin, J., H. Arango, B. Laughlin, E. Hunter, J. Wilkin and A. Moore (2020), Observation impacts on the Mid-Atlantic Bight front and cross-shelf transport in 4D-Var ocean state estimates, Part I – Multiplatform analysis, Ocean Modelling, 156, 101721, doi: 10.1016/j.ocemod.2020.101721

Levin, J., H. G. Arango, B. Laughlin, J. Wilkin and A. M. Moore (2021), The impact of remote sensing observations on cross-shelf transport estimates from 4D-Var analyses of the Mid-Atlantic Bight, Advances in Space Research, 68, 553-570, doi: 10.1016/j.asr.2019.09.012
2023 Cartographic Boundary File (KML), Unified School District for North...
catalog.data.gov
gimi9.com
Updated May 16, 2024
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division (Point of Contact) (2024). 2023 Cartographic Boundary File (KML), Unified School District for North Carolina, 1:500,000 [Dataset]. https://catalog.data.gov/dataset/2023-cartographic-boundary-file-kml-unified-school-district-for-north-carolina-1-500000
Explore at:
Dataset updated
May 16, 2024
Dataset provided by
United States Census Bureau (http://census.gov/)
Description
The 2023 cartographic boundary KMLs are simplified representations of selected geographic areas from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). These boundary files are specifically designed for small-scale thematic mapping. When possible, generalization is performed with the intent to maintain the hierarchical relationships among geographies and to maintain the alignment of geographies within a file set for a given year. Geographic areas may not align with the same areas from another year. Some geographies are available as nation-based files while others are available only as state-based files. School Districts are single-purpose administrative units within which local officials provide public educational services for the area's residents. The Census Bureau obtains the boundaries, names, local education agency codes, grade ranges, and school district levels for school districts from state officials for the primary purpose of providing the U.S. Department of Education with estimates of the number of children in poverty within each school district. This information serves as the basis for the Department of Education to determine the annual allocation of Title I funding to states and school districts. The cartographic boundary files include separate files for elementary, secondary and unified school districts. The generalized school district boundaries in this file are based on those in effect for the 2022-2023 school year, i.e., in operation as of January 1, 2023.
NORTH CAROLINA
redivis.com
Updated May 8, 2024
Cite
UCLA Library (2024). NORTH CAROLINA [Dataset]. https://redivis.com/datasets/ey62-9t0gpyvbg/usage
Explore at:
Dataset updated
May 8, 2024
Dataset authored and provided by
UCLA Library
Description
The table NORTH CAROLINA is part of the dataset L2 Voter File, available at https://redivis.com/datasets/ey62-9t0gpyvbg. It contains 169832561 rows across 38 variables.
TIGER/Line Shapefile, 2023, County, Clay County, NC, Address Range-Feature...
s.cnmilf.com
datasets.ai
Updated Dec 15, 2023
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, 2023, County, Clay County, NC, Address Range-Feature Name Relationship File [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/tiger-line-shapefile-2023-county-clay-county-nc-address-range-feature-name-relationship-file
Explore at:
Dataset updated
Dec 15, 2023
Dataset provided by
United States Census Bureau (http://census.gov/)
United States Department of Commerce (http://www.commerce.gov/)
Area covered
Clay County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Range / Feature Name Relationship File (ADDRFN.dbf) contains a record for each address range / linear feature name relationship. The purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute that can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature name is identified by the linear feature identifier (LINEARID) attribute that can be used to link to the Feature Names Relationship File (FEATNAMES.dbf).
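The linkage described above can be illustrated with a small join. This is only a sketch: the records are hypothetical stand-ins for rows read from ADDRFN.dbf and FEATNAMES.dbf, the helper name is made up for illustration, and the FULLNAME field is assumed to hold the feature name (it is not mentioned in the description above).

```python
from collections import defaultdict

# Hypothetical sample rows standing in for records read from the .dbf files.
addrfn = [  # ADDRFN.dbf: one row per address range / feature name relationship
    {"ARID": "A1", "LINEARID": "L10"},
    {"ARID": "A1", "LINEARID": "L11"},
    {"ARID": "A2", "LINEARID": "L10"},
]
featnames = [  # FEATNAMES.dbf: names keyed by LINEARID (FULLNAME is assumed)
    {"LINEARID": "L10", "FULLNAME": "Main St"},
    {"LINEARID": "L11", "FULLNAME": "State Hwy 64"},
]


def names_by_address_range(addrfn_rows, featnames_rows):
    """Map each ARID to the sorted list of feature names linked to it."""
    name_of = defaultdict(set)
    for row in featnames_rows:
        name_of[row["LINEARID"]].add(row["FULLNAME"])
    result = defaultdict(set)
    for row in addrfn_rows:
        # An address range may be linked to several linear feature names.
        result[row["ARID"]].update(name_of.get(row["LINEARID"], set()))
    return {arid: sorted(names) for arid, names in result.items()}


street_names = names_by_address_range(addrfn, featnames)
print(street_names)
```

The same ARID values could then be used to look up the matching records in ADDR.dbf.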
Datasets for Sentiment Analysis
zenodo.org
csv
Updated Dec 10, 2023
Cite
Julie R. Repository creator - Campos Arias; Julie R. Repository creator - Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.10157504
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Julie R. Repository creator - Campos Arias; Julie R. Repository creator - Campos Arias
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.

----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains ratings and reviews for 1K+ Amazon products, as listed on the official Amazon website. The data was scraped from the official Amazon website in January 2023.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
product_id - Product ID
product_name - Name of the Product
category - Category of the Product
discounted_price - Discounted Price of the Product
actual_price - Actual Price of the Product
discount_percentage - Percentage of Discount for the Product
rating - Rating of the Product
rating_count - Number of people who voted for the Amazon rating
about_product - Description about the Product
user_id - ID of the user who wrote review for the Product
user_name - Name of the user who wrote review for the Product
review_id - ID of the user review
review_title - Short review
review_content - Long review
img_link - Image Link of the Product
product_link - Official Website Link of the Product
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset, containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5331 rows contain only negative samples and the last 5331 rows contain only positive samples, so the data should be shuffled before use.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
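Because the rows are ordered by class, a reproducible shuffle is worth doing before any train/test split. The following is a minimal sketch using only the Python standard library; the helper name and the tiny inline sample (standing in for data_rt.csv) are hypothetical.

```python
import csv
import io
import random


def shuffle_rows(csv_text, seed=0):
    """Return (header, rows) with the data rows shuffled reproducibly."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # keep the header line out of the shuffle
    rows = list(reader)
    random.Random(seed).shuffle(rows)  # seeded so the order can be reproduced
    return header, rows


# Tiny stand-in for data_rt.csv: negative samples first, positive samples last.
sample = (
    "reviews,labels\n"
    "bad a,0\nbad b,0\nbad c,0\n"
    "good a,1\ngood b,1\ngood c,1\n"
)
header, rows = shuffle_rows(sample, seed=42)
print(header, rows)
```

For the real file, the same function could be applied to the text read from data_rt.csv; fixing the seed keeps experiments comparable across runs.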
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data of Gen3EcoDot (Alexa), scraped entirely from amazon.in
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9930 Amazon reviews, with star ratings, for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw) and division (manually added - categorical label generated using overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
TIGER/Line Shapefile, 2023, County, Bertie County, NC, Address Ranges...
datasets.ai
catalog.data.gov
55, 57
Cite
U.S. Census Bureau, Department of Commerce, TIGER/Line Shapefile, 2023, County, Bertie County, NC, Address Ranges Relationship File [Dataset]. https://datasets.ai/datasets/tiger-line-shapefile-2023-county-bertie-county-nc-address-ranges-relationship-file
Explore at:
55, 57Available download formats
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
U.S. Census Bureau, Department of Commerce
Area covered
Bertie County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Relationship File (ADDR.dbf) contains the attributes of each address range. Each address range applies to a single edge and has a unique address range identifier (ARID) value. The edge to which an address range applies can be determined by linking the address range to the All Lines Shapefile (EDGES.shp) using the permanent topological edge identifier (TLID) attribute. Multiple address ranges can apply to the same edge since an edge can have multiple address ranges. Note that the most inclusive address range associated with each side of a street edge already appears in the All Lines Shapefile (EDGES.shp). The TIGER/Line Files contain potential address ranges, not individual addresses. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
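The notion of a potential address range can be made concrete with a short sketch. It assumes the two range endpoints are plain integers of the same parity (real house numbers may contain non-numeric characters), and the function name is hypothetical.

```python
def potential_numbers(from_hn, to_hn):
    """List every potential structure number in an address range.

    Numbers run from the first to the last structure number and keep the
    parity of the first number. A range may be coded in descending order,
    depending on the direction in which the edge is coded.
    """
    if (to_hn - from_hn) % 2 != 0:
        raise ValueError("range endpoints must share the same parity")
    step = 2 if to_hn >= from_hn else -2
    stop = to_hn + (1 if step > 0 else -1)  # make the last number inclusive
    return list(range(from_hn, stop, step))


print(potential_numbers(101, 109))  # [101, 103, 105, 107, 109]
print(potential_numbers(208, 200))  # [208, 206, 204, 202, 200]
```

These are potential numbers only: as noted above, a structure need not exist at each of them.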
TIGER/Line Shapefile, 2023, County, Halifax County, NC, Feature Names...
catalog.data.gov
datasets.ai
Updated Dec 15, 2023
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, 2023, County, Halifax County, NC, Feature Names Relationship File [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2023-county-halifax-county-nc-feature-names-relationship-file
Explore at:
Dataset updated
Dec 15, 2023
Dataset provided by
United States Census Bureau (http://census.gov/)
Area covered
Halifax County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Feature Names Relationship File (FEATNAMES.dbf) contains a record for each feature name and any attributes associated with it. Each feature name can be linked to the corresponding edges that make up that feature in the All Lines Shapefile (EDGES.shp), where applicable to the corresponding address range or ranges in the Address Ranges Relationship File (ADDR.dbf), or to both files. Although this file includes feature names for all linear features, not just road features, the primary purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute, which can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature is identified by the linear feature identifier (LINEARID) attribute, which can be used to relate the address range back to the name attributes of the feature in the Feature Names Relationship File or to the feature record in the Primary Roads, Primary and Secondary Roads, or All Roads Shapefiles. The edge to which a feature name applies can be determined by linking the feature name record to the All Lines Shapefile (EDGES.shp) using the permanent edge identifier (TLID) attribute. 
The address range identifier(s) (ARID) for a specific linear feature can be found by using the linear feature identifier (LINEARID) from the Feature Names Relationship File (FEATNAMES.dbf) through the Address Range / Feature Name Relationship File (ADDRFN.dbf).
TIGER/Line Shapefile, 2023, County, Cabarrus County, NC, Address Ranges...
s.cnmilf.com
catalog.data.gov
Updated Dec 15, 2023
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, 2023, County, Cabarrus County, NC, Address Ranges Relationship File [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/tiger-line-shapefile-2023-county-cabarrus-county-nc-address-ranges-relationship-file
Explore at:
Dataset updated
Dec 15, 2023
Dataset provided by
United States Census Bureau (http://census.gov/)
United States Department of Commerce (http://www.commerce.gov/)
Area covered
Cabarrus County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Relationship File (ADDR.dbf) contains the attributes of each address range. Each address range applies to a single edge and has a unique address range identifier (ARID) value. The edge to which an address range applies can be determined by linking the address range to the All Lines Shapefile (EDGES.shp) using the permanent topological edge identifier (TLID) attribute. Multiple address ranges can apply to the same edge since an edge can have multiple address ranges. Note that the most inclusive address range associated with each side of a street edge already appears in the All Lines Shapefile (EDGES.shp). The TIGER/Line Files contain potential address ranges, not individual addresses. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
TIGER/Line Shapefile, 2022, County, Pitt County, NC, Address Range-Feature...
s.cnmilf.com
catalog.data.gov
Updated Jan 27, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Point of Contact) (2024). TIGER/Line Shapefile, 2022, County, Pitt County, NC, Address Range-Feature Name Relationship File [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/tiger-line-shapefile-2022-county-pitt-county-nc-address-range-feature-name-relationship-file
Explore at:
Dataset updated
Jan 27, 2024
Dataset provided by
United States Census Bureau (http://census.gov/)
United States Department of Commerce (http://www.commerce.gov/)
Area covered
Pitt County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Range / Feature Name Relationship File (ADDRFN.dbf) contains a record for each address range / linear feature name relationship. The purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute that can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature name is identified by the linear feature identifier (LINEARID) attribute that can be used to link to the Feature Names Relationship File (FEATNAMES.dbf).
TIGER/Line Shapefile, 2023, County, Alamance County, NC, Address...
datasets.ai
catalog.data.gov
55, 57
Cite
U.S. Census Bureau, Department of Commerce, TIGER/Line Shapefile, 2023, County, Alamance County, NC, Address Range-Feature Name Relationship File [Dataset]. https://datasets.ai/datasets/tiger-line-shapefile-2023-county-alamance-county-nc-address-range-feature-name-relationship-fil
Explore at:
55, 57Available download formats
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
U.S. Census Bureau, Department of Commerce
Area covered
Alamance County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Range / Feature Name Relationship File (ADDRFN.dbf) contains a record for each address range / linear feature name relationship. The purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute that can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature name is identified by the linear feature identifier (LINEARID) attribute that can be used to link to the Feature Names Relationship File (FEATNAMES.dbf).
Public Libraries
nconemap.gov
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated Mar 25, 2003
Cite
NC OneMap / State of North Carolina (2003). Public Libraries [Dataset]. https://www.nconemap.gov/datasets/public-libraries/api
Explore at:
Dataset updated
Mar 25, 2003
Dataset authored and provided by
NC OneMap / State of North Carolina
License
https://www.nconemap.gov/pages/terms
Description
NC Center for Geographic Information and Analysis developed the digital Public Libraries data from addresses provided by the State Library of North Carolina on 3/20/03. This file enables users to identify public library locations. This data covers the entire extent of North Carolina.
TIGER/Line Shapefile, 2023, County, Forsyth County, NC, Address Ranges...
datasets.ai
catalog.data.gov
55, 57
Updated Dec 15, 2023
Cite
U.S. Census Bureau, Department of Commerce (2023). TIGER/Line Shapefile, 2023, County, Forsyth County, NC, Address Ranges Relationship File [Dataset]. https://datasets.ai/datasets/tiger-line-shapefile-2023-county-forsyth-county-nc-address-ranges-relationship-file
Explore at:
57, 55Available download formats
Dataset updated
Dec 15, 2023
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
U.S. Census Bureau, Department of Commerce
Area covered
Forsyth County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Relationship File (ADDR.dbf) contains the attributes of each address range. Each address range applies to a single edge and has a unique address range identifier (ARID) value. The edge to which an address range applies can be determined by linking the address range to the All Lines Shapefile (EDGES.shp) using the permanent topological edge identifier (TLID) attribute. Multiple address ranges can apply to the same edge since an edge can have multiple address ranges. Note that the most inclusive address range associated with each side of a street edge already appears in the All Lines Shapefile (EDGES.shp). The TIGER/Line Files contain potential address ranges, not individual addresses. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
TIGER/Line Shapefile, 2022, County, Henderson County, NC, Feature Names...
s.cnmilf.com
datasets.ai
Updated Jan 28, 2024
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (Point of Contact) (2024). TIGER/Line Shapefile, 2022, County, Henderson County, NC, Feature Names Relationship File [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/tiger-line-shapefile-2022-county-henderson-county-nc-feature-names-relationship-file
Explore at:
Dataset updated
Jan 28, 2024
Dataset provided by
United States Department of Commerce (http://www.commerce.gov/)
United States Census Bureau (http://census.gov/)
Area covered
Henderson County, North Carolina
Description
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Feature Names Relationship File (FEATNAMES.dbf) contains a record for each feature name and any attributes associated with it. Each feature name can be linked to the corresponding edges that make up that feature in the All Lines Shapefile (EDGES.shp), where applicable to the corresponding address range or ranges in the Address Ranges Relationship File (ADDR.dbf), or to both files. Although this file includes feature names for all linear features, not just road features, the primary purpose of this relationship file is to identify all street names associated with each address range. An edge can have several feature names; an address range located on an edge can be associated with one or any combination of the available feature names (an address range can be linked to multiple feature names). The address range is identified by the address range identifier (ARID) attribute, which can be used to link to the Address Ranges Relationship File (ADDR.dbf). The linear feature is identified by the linear feature identifier (LINEARID) attribute, which can be used to relate the address range back to the name attributes of the feature in the Feature Names Relationship File or to the feature record in the Primary Roads, Primary and Secondary Roads, or All Roads Shapefiles. The edge to which a feature name applies can be determined by linking the feature name record to the All Lines Shapefile (EDGES.shp) using the permanent edge identifier (TLID) attribute. 
The address range identifier(s) (ARID) for a specific linear feature can be found by using the linear feature identifier (LINEARID) from the Feature Names Relationship File (FEATNAMES.dbf) through the Address Range / Feature Name Relationship File (ADDRFN.dbf).

Cite

Wolfgang Preimesberger; Wolfgang Preimesberger; Pietro Stradiotti; Pietro Stradiotti; Wouter Arnoud Dorigo; Wouter Arnoud Dorigo (2025). ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture from merged multi-satellite observations [Dataset]. http://doi.org/10.48436/3fcxr-cde10

ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture from merged multi-satellite observations

Explore at:

Available download formats: zip

Unique identifier

https://doi.org/10.48436/3fcxr-cde10

Dataset updated

Jun 6, 2025

Dataset provided by

TU Wien

Authors

Wolfgang Preimesberger; Wolfgang Preimesberger; Pietro Stradiotti; Pietro Stradiotti; Wouter Arnoud Dorigo; Wouter Arnoud Dorigo

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/

This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.

Dataset paper (public preprint)

A description of this dataset, including the methodology and validation results, is available at:

Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.

Abstract

ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
Since the requirement of a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such a gap-filling method is that it relies only on the original observational record, without the need for ancillary variables or model-based information. Owing to this intrinsic challenge, no global, long-term univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments to fill satellite-like gaps introduced to GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gapfilling performance.
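To illustrate only the linear-interpolation step mentioned above, here is a generic sketch for a single time series. It is not the authors' implementation, it does not cover the DCT-PLS step, and the function name and sample values are hypothetical.

```python
import math


def fill_linear(values):
    """Fill interior NaN gaps by linear interpolation between valid neighbours.

    Leading and trailing gaps are left as NaN because they have a valid
    neighbour on one side only.
    """
    filled = list(values)
    last_valid = None  # index of the most recent non-NaN value
    for i, v in enumerate(values):
        if math.isnan(v):
            continue
        if last_valid is not None and i - last_valid > 1:
            start, span = values[last_valid], i - last_valid
            for k in range(1, span):
                # Move a fraction k/span of the way from start to v.
                filled[last_valid + k] = start + (v - start) * k / span
        last_valid = i
    return filled


# Hypothetical daily soil moisture values (m3/m3) with a two-day gap.
series = [0.20, float("nan"), float("nan"), 0.26, float("nan")]
print(fill_linear(series))
```

In this sample the two interior gaps are filled with values close to 0.22 and 0.24, while the trailing gap stays missing.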

Summary

Gap-filled global estimates of volumetric surface soil moisture from 1991-2023 at 0.25° sampling
Fields of application (partial): climate variability and change, land-atmosphere interactions, global biogeochemical cycles and ecology, hydrological and land surface modelling, drought applications, and meteorology
Method: Modified version of DCT-PLS (Garcia, 2010) interpolation/smoothing algorithm, linear interpolation over periods of frozen soils. Uncertainty estimates are provided for all data points.
More information: See Preimesberger et al. (2025) and the ESA CCI SM Algorithm Theoretical Baseline Document [Chapter 7.2.9] (Dorigo et al., 2023), https://doi.org/10.5281/zenodo.8320869
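
To illustrate the linear interpolation step named under Method, here is a minimal Python sketch. It is not the authors' implementation: it assumes the time series of a single grid cell is held in a plain list, with a matching list of flags (comparable to a frozen-soil mask). The function name `interpolate_flagged` is hypothetical, and leaving a flagged run unchanged when it has no valid neighbour on both sides is an assumption made for this example.

```python
from typing import List, Sequence


def interpolate_flagged(values: Sequence[float], flagged: Sequence[int]) -> List[float]:
    """Linearly interpolate over runs of flagged time steps.

    Each run of flagged entries is replaced by a straight line between the
    nearest unflagged value before and after the run. Runs touching the
    start or end of the series are left unchanged (assumption).
    """
    result = list(values)
    n = len(result)
    i = 0
    while i < n:
        if not flagged[i]:
            i += 1
            continue
        start = i                      # first flagged index of this run
        while i < n and flagged[i]:
            i += 1
        end = i                        # first unflagged index after the run
        left = start - 1               # last unflagged index before the run
        if left < 0 or end >= n:
            continue                   # no anchor on one side
        span = end - left
        for k in range(start, end):
            weight = (k - left) / span
            result[k] = result[left] + weight * (result[end] - result[left])
    return result


# Made-up example series: three flagged days between two valid values.
series = [0.20, 0.0, 0.0, 0.0, 0.40]
flags = [0, 1, 1, 1, 0]
filled = interpolate_flagged(series, flags)
```

In this made-up example the three flagged days are replaced by evenly spaced values between 0.20 and 0.40.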

Programmatic Download

You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete data set to the local directory ~/Downloads on Linux or macOS systems.

#!/bin/bash

# Set download directory
DOWNLOAD_DIR=~/Downloads

base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
  echo "Downloading $year.zip..."
  wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
  unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
  rm "$DOWNLOAD_DIR/$year.zip"
done
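
On systems without wget and unzip (for example Windows), the same steps could be sketched in Python using only the standard library. This is an illustrative equivalent of the shell script above, not an official download tool. The function names `year_urls` and `download_and_extract` are made up for this example; the URL pattern is copied from the script.

```python
import urllib.request
import zipfile
from pathlib import Path
from typing import List

# Base URL as used in the shell script above.
BASE_URL = "https://researchdata.tuwien.at/records/3fcxr-cde10/files"


def year_urls(first: int = 1991, last: int = 2023) -> List[str]:
    """Return one archive URL per year, following the <year>.zip pattern."""
    return [f"{BASE_URL}/{year}.zip" for year in range(first, last + 1)]


def download_and_extract(url: str, download_dir: Path) -> None:
    """Download one yearly archive, extract it, then delete the archive."""
    download_dir = Path(download_dir)
    download_dir.mkdir(parents=True, exist_ok=True)
    archive = download_dir / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(download_dir)
    archive.unlink()


# Usage (commented out so that nothing is downloaded on import):
# for url in year_urls():
#     print(f"Downloading {url} ...")
#     download_and_extract(url, Path.home() / "Downloads")
```

Passing a narrower range, e.g. `year_urls(2020, 2023)`, limits the download to the years of interest.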

Data details

The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD), month (MM) in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). The file name has the following convention:

ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
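
The naming convention above can be turned into a small helper that builds the expected file path for a given day. This is a sketch under the assumption that each yearly archive extracts into a subdirectory named after the year (YYYY), as described above; the function name `daily_file_path` and the `root` argument are hypothetical.

```python
from datetime import date
from pathlib import Path

# File name pattern from the convention above; the date part is YYYYMMDD.
FILE_TEMPLATE = (
    "ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-{day:%Y%m%d}000000-fv09.1r1.nc"
)


def daily_file_path(root: str, day: date) -> Path:
    """Return <root>/<YYYY>/<file name> for the given day.

    Assumes the extracted data are grouped in one subdirectory per year.
    """
    return Path(root) / f"{day:%Y}" / FILE_TEMPLATE.format(day=day)


# Example: expected location of the image for 15 January 2020.
example_path = daily_file_path("/data/esa_cci_sm", date(2020, 1, 15))
```

Checking `example_path.exists()` before opening a file is a simple way to verify that the assumed directory layout matches the extracted archives.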

Data Variables

Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:

sm: (float) The Soil Moisture variable reflects estimates of daily average volumetric soil moisture content (m3/m3) in the soil surface layer (~0-5 cm) over a whole grid cell (0.25 degree).
sm_uncertainty: (float) The Soil Moisture Uncertainty variable reflects the uncertainty (random error) of the original satellite observations and of the predictions used to fill observation data gaps.
sm_anomaly: Soil moisture anomalies (reference period 1991-2020) derived from the gap-filled values (`sm`)
sm_smoothed: Contains DCT-PLS predictions used to fill data gaps in the original soil moisture field. These values are also provided for cases where an observation was initially available (compare `gapmask`). In this case, they provide a smoothed version of the original data.
gapmask: (0 | 1) Indicates grid cells where a satellite observation is available (1), and where the interpolated (smoothed) values are used instead (0) in the 'sm' field.
frozenmask: (0 | 1) Indicates grid cells where ERA5 soil temperature is <0 °C. In this case, a linear interpolation over time is applied.

Additional information for each variable is given in the netCDF attributes.
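
As a hedged illustration of how `gapmask` relates to `sm`, the sketch below separates original observations from gap-filled values for one grid cell. It uses plain Python lists with made-up numbers; in practice the values would be read from the netCDF files with one of the tools listed under "Software to open netCDF files". The function names `observed_only` and `filled_fraction` are hypothetical.

```python
import math
from typing import List, Sequence


def observed_only(sm: Sequence[float], gapmask: Sequence[int]) -> List[float]:
    """Keep values where a satellite observation exists (gapmask == 1).

    Gap-filled values (gapmask == 0) are replaced by NaN.
    """
    return [value if flag == 1 else math.nan for value, flag in zip(sm, gapmask)]


def filled_fraction(gapmask: Sequence[int]) -> float:
    """Return the share of time steps where interpolated values are used."""
    flags = list(gapmask)
    if not flags:
        raise ValueError("gapmask is empty")
    return sum(1 for flag in flags if flag == 0) / len(flags)


# Made-up time series for a single grid cell (m3/m3) and its gap mask.
sm_series = [0.21, 0.25, 0.30, 0.28]
mask_series = [1, 0, 1, 0]

observations = observed_only(sm_series, mask_series)
share_filled = filled_fraction(mask_series)
```

The same pattern applies to `frozenmask` if values from periods of frozen soil should be treated separately.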

Version Changelog

Changes in v09.1r1 (previous version was v09.1):

This version uses a novel uncertainty estimation scheme as described in Preimesberger et al. (2025).

Software to open netCDF files

These data can be read by any software that supports the Climate and Forecast (CF) metadata conventions for netCDF files, such as:

Xarray (Python): https://github.com/pydata/xarray
netCDF4 (Python): https://unidata.github.io/netcdf4-python/
esa_cci_sm (Python): https://github.com/TUW-GEO/esa_cci_sm
Similar tools exist for other programming languages (MATLAB, R, etc.)
Other software packages and GIS tools can also open netCDF files, e.g. CDO, NCO, QGIS, ArcGIS
You can also use the GUI software Panoply to view the contents of each file

References

Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.
Dorigo, W., Preimesberger, W., Stradiotti, P., Kidd, R., van der Schalie, R., van der Vliet, M., Rodriguez-Fernandez, N., Madelon, R., & Baghdadi, N. (2023). ESA Climate Change Initiative Plus - Soil Moisture Algorithm Theoretical Baseline Document (ATBD) Supporting Product Version 08.1 (version 1.1). Zenodo. https://doi.org/10.5281/zenodo.8320869
Garcia, D., 2010. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 54(4), pp.1167-1178. Available at: https://doi.org/10.1016/j.csda.2009.09.020
Rodell, M., Houser, P. R., Jambor, U., Gottschalck, J., Mitchell, K., Meng, C.-J., Arsenault, K., Cosgrove, B., Radakovich, J., Bosilovich, M., Entin, J. K., Walker, J. P., Lohmann, D., and Toll, D.: The Global Land Data Assimilation System, Bulletin of the American Meteorological Society, 85, 381-394, https://doi.org/10.1175/BAMS-85-3-381, 2004.

Related Records

The following records are all part of the Soil Moisture Climate Data Records from satellites community

ESA CCI SM MODELFREE Surface Soil Moisture Record

https://doi.org/10.48436/svr1r-27j77
