51 datasets found

Tutorial for NetCDF climate data retrieval and model integration
hydroshare.org
beta.hydroshare.org
zip
Updated Apr 4, 2019
Cite
Christina Bandaragoda; Jimmy Phuong (2019). Tutorial for NetCDF climate data retrieval and model integration [Dataset]. https://www.hydroshare.org/resource/8438dcb7795941d3ad2fe1a6fc055ef5
Explore at:
zip (125.5 KB)
Dataset updated
Apr 4, 2019
Dataset provided by
HydroShare
Authors
Christina Bandaragoda; Jimmy Phuong
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Hydrological and meteorological information can help characterize the conditions and risk factors affecting an environment and its inhabitants. Because observation sampling is limited, gridded data sets use collected observations and known process relations to provide modeled information for areas where data collection is infeasible. Although these data sets are available, users face barriers to use, such as how to access, acquire, and then analyze data for small watershed areas when the data sets were produced for large, continental-scale processes. In this tutorial, we introduce the Observatory for Gridded Hydrometeorology (OGH) to resolve such hurdles in a use case that incorporates NetCDF gridded data sets, processes developed to interpret the findings, and the application of secondary modeling frameworks (landlab).

LEARNING OBJECTIVES
- Become familiar with data management, metadata management, and analyses with gridded data
- Inspect and solve problems with Python libraries
- Explore data architecture and processes
- Learn about the OGH Python library
- Discuss conceptual data engineering and science operations

Use-case operations:
1. Prepare computing environment
2. Get list of grid cells
3. NetCDF retrieval and clipping to a spatial extent
4. Extract NetCDF metadata and convert NetCDFs to 1D ASCII time-series files
5. Visualize the average monthly total precipitation
6. Apply summary values as modeling inputs
7. Visualize modeling outputs
8. Save results in a new HydroShare resource
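The summary behind step 5 (average monthly total precipitation) can be sketched in plain Python. This is not the OGH library's implementation: the function name `average_monthly_totals` and the synthetic daily record are assumptions made only for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def average_monthly_totals(daily):
    """Return {month: mean of the monthly precipitation totals across years}.

    `daily` maps datetime.date -> daily precipitation (any consistent unit).
    """
    # Sum daily values into one total per (year, month).
    totals = defaultdict(float)
    for day, value in daily.items():
        totals[(day.year, day.month)] += value

    # Average the totals of each calendar month over all years present.
    by_month = defaultdict(list)
    for (_, month), total in totals.items():
        by_month[month].append(total)
    return {m: sum(v) / len(v) for m, v in sorted(by_month.items())}


# Synthetic two-year record (hypothetical values):
# 1.0 unit/day throughout 2001 and 3.0 units/day throughout 2002.
daily = {}
day = date(2001, 1, 1)
while day <= date(2002, 12, 31):
    daily[day] = 1.0 if day.year == 2001 else 3.0
    day += timedelta(days=1)

monthly_mean = average_monthly_totals(daily)
```

The two-stage grouping (first per year and month, then per calendar month) keeps a month with missing years from being diluted by a simple division over the full record length.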

For inquiries, issues, or to contribute to development, please refer to https://github.com/freshwater-initiative/Observatory
Data from: Multi-task Deep Learning for Water Temperature and Streamflow...
catalog.data.gov
Updated Nov 11, 2025
Cite
U.S. Geological Survey (2025). Multi-task Deep Learning for Water Temperature and Streamflow Prediction (ver. 1.1, June 2022) [Dataset]. https://catalog.data.gov/dataset/multi-task-deep-learning-for-water-temperature-and-streamflow-prediction-ver-1-1-june-2022
Explore at:
Dataset updated
Nov 11, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
This item contains data and code used in experiments that produced the results for Sadler et al. (2022) (see below for full reference). We ran five experiments for the analysis: Experiment A, Experiment B, Experiment C, Experiment D, and Experiment AuxIn. Experiment A tested multi-task learning for predicting streamflow with 25 years of training data and using a different model for each of 101 sites. Experiment B tested multi-task learning for predicting streamflow with 25 years of training data and using a single model for all 101 sites. Experiment C tested multi-task learning for predicting streamflow with just 2 years of training data. Experiment D tested multi-task learning for predicting water temperature with over 25 years of training data. Experiment AuxIn used water temperature as an input variable for predicting streamflow. These experiments and their results are described in detail in the WRR paper. Data from a total of 101 sites across the US were used for the experiments. The model input data and streamflow data were from the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset (Newman et al., 2014; Addor et al., 2017). The water temperature data were gathered from the National Water Information System (NWIS) (U.S. Geological Survey, 2016). The contents of this item are broken into 13 files or groups of files aggregated into zip files:

input_data_processing.zip: A zip file containing the scripts used to collate the observations, input weather drivers, and catchment attributes for the multi-task modeling experiments

flow_observations.zip: A zip file containing collated daily streamflow data for the sites used in multi-task modeling experiments. The streamflow data were originally accessed from the CAMELS dataset. The data are stored in csv and Zarr formats.

temperature_observations.zip: A zip file containing collated daily water temperature data for the sites used in multi-task modeling experiments. The data were originally accessed via NWIS. The data are stored in csv and Zarr formats.

temperature_sites.geojson: GeoJSON file of the locations of the water temperature and streamflow sites used in the analysis.

model_drivers.zip: A zip file containing the daily input weather driver data for the multi-task deep learning models. These data are from the Daymet drivers and were collated from the CAMELS dataset. The data are stored in csv and Zarr formats.

catchment_attrs.csv: Catchment attributes collated from the CAMELS dataset. These data are used for the Random Forest modeling. For full metadata regarding these data, see the CAMELS dataset.

experiment_workflow_files.zip: A zip file containing workflow definitions used to run multi-task deep learning experiments. These are Snakemake workflows. To run a given experiment, one would run (for experiment A) 'snakemake -s expA_Snakefile --configfile expA_config.yml'

river-dl-paper_v0.zip: A zip file containing Python code used to run multi-task deep learning experiments. This code was called by the Snakemake workflows contained in 'experiment_workflow_files.zip'.

random_forest_scripts.zip: A zip file containing Python code and a Python Jupyter Notebook used to prepare data for, train, and visualize feature importance of a Random Forest model.

plotting_code.zip: A zip file containing Python code and a Snakemake workflow used to produce figures showing the results of multi-task deep learning experiments.

results.zip: A zip file containing results of multi-task deep learning experiments. The results are stored in csv and netcdf formats. The netcdf files were used by the plotting libraries in 'plotting_code.zip'. These files are for five experiments, 'A', 'B', 'C', 'D', and 'AuxIn'. These experiment names are shown in the file name.

sample_scripts.zip: A zip file containing scripts for creating sample output to demonstrate how the modeling workflow was executed.

sample_output.zip: A zip file containing sample output data. Similar files are created by running the sample scripts provided.

A. Newman; K. Sampson; M. P. Clark; A. Bock; R. J. Viger; D. Blodgett, 2014. A large-sample watershed-scale hydrometeorological dataset for the contiguous USA. Boulder, CO: UCAR/NCAR. https://dx.doi.org/10.5065/D6MW2F4D

N. Addor, A. Newman, M. Mizukami, and M. P. Clark, 2017. Catchment attributes for large-sample studies. Boulder, CO: UCAR/NCAR. https://doi.org/10.5065/D6G73C3Q

Sadler, J. M., Appling, A. P., Read, J. S., Oliver, S. K., Jia, X., Zwart, J. A., & Kumar, V. (2022). Multi-Task Deep Learning of Daily Streamflow and Water Temperature. Water Resources Research, 58(4), e2021WR030138. https://doi.org/10.1029/2021WR030138

U.S. Geological Survey, 2016, National Water Information System data available on the World Wide Web (USGS Water Data for the Nation), accessed Dec. 2020.
ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture...
researchdata.tuwien.ac.at
researchdata.tuwien.at
zip
Updated Sep 5, 2025
Cite
Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo (2025). ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture from merged multi-satellite observations [Dataset]. http://doi.org/10.48436/3fcxr-cde10
Explore at:
zip
Unique identifier
https://doi.org/10.48436/3fcxr-cde10
Dataset updated
Sep 5, 2025
Dataset provided by
TU Wien
Authors
Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/

This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.

Dataset Paper (Open Access)

A description of this dataset, including the methodology and validation results, is available at:

Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: an independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data, 17, 4305–4329, https://doi.org/10.5194/essd-17-4305-2025, 2025.

Abstract

ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
Since the requirement for a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such gap-filling methods is that they rely only on the original observational record, without the need for ancillary variables or model-based information. Due to the intrinsic challenge, no global, long-term univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments to fill satellite-like gaps introduced to GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gap-filling performance.

Summary

Gap-filled global estimates of volumetric surface soil moisture from 1991-2023 at 0.25° sampling

Fields of application (partial): climate variability and change, land-atmosphere interactions, global biogeochemical cycles and ecology, hydrological and land surface modelling, drought applications, and meteorology

Method: Modified version of DCT-PLS (Garcia, 2010) interpolation/smoothing algorithm, linear interpolation over periods of frozen soils. Uncertainty estimates are provided for all data points.

More information: See Preimesberger et al. (2025) and the ESA CCI SM Algorithm Theoretical Baseline Document [Chapter 7.2.9] (Dorigo et al., 2023), https://doi.org/10.5281/zenodo.8320869

Programmatic Download

You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following script will download and extract the complete data set to the local directory ~/Downloads on Linux or macOS systems.

#!/bin/bash

# Set download directory (created if it does not exist yet)
DOWNLOAD_DIR=~/Downloads
mkdir -p "$DOWNLOAD_DIR"

base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    # Quote the target directory so paths containing spaces still work
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
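For users who prefer Python over a shell, the same per-year loop might be sketched as follows, using only the standard library. The URL pattern is taken from the script above; the function names `yearly_archive_urls` and `download_and_extract` are illustrative assumptions, and the download helper is defined but deliberately not called here.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://researchdata.tuwien.at/records/3fcxr-cde10/files"


def yearly_archive_urls(first_year=1991, last_year=2023, base_url=BASE_URL):
    """Return the URL of each yearly zip archive, following the script's pattern."""
    return [f"{base_url}/{year}.zip" for year in range(first_year, last_year + 1)]


def download_and_extract(url, target_dir):
    """Download one archive into target_dir, extract it, then delete the zip."""
    target = Path(target_dir).expanduser()
    target.mkdir(parents=True, exist_ok=True)
    archive = target / url.rsplit("/", 1)[-1]
    urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    archive.unlink()


urls = yearly_archive_urls()
# To fetch everything (large download), one could run:
# for url in urls:
#     download_and_extract(url, "~/Downloads")
```

Separating URL construction from the download step makes it easy to restrict the loop to a few years before committing to the full record.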

Data details

The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), with each subdirectory containing one netCDF image file per day (DD) and month (MM) on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). File names follow this convention:

ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
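As a small illustration of this naming convention, the file name for a given day can be assembled from a date object. The helper names below are hypothetical, and the `YYYY/` prefix in `relative_path` assumes that the extracted archives keep one subdirectory per year as described above.

```python
from datetime import date


def gapfilled_filename(day, version="fv09.1r1"):
    """Build the daily file name following the convention shown above."""
    return (
        "ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-"
        f"{day:%Y%m%d}000000-{version}.nc"
    )


def relative_path(day):
    """Assumed layout: one subdirectory per year containing the daily files."""
    return f"{day:%Y}/{gapfilled_filename(day)}"


example = gapfilled_filename(date(2005, 7, 9))
```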

Data Variables

Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:

sm: (float) The Soil Moisture variable reflects estimates of daily average volumetric soil moisture content (m3/m3) in the soil surface layer (~0-5 cm) over a whole grid cell (0.25 degree).

sm_uncertainty: (float) The Soil Moisture Uncertainty variable reflects the uncertainty (random error) of the original satellite observations and of the predictions used to fill observation data gaps.

sm_anomaly: Soil moisture anomalies (reference period 1991-2020) derived from the gap-filled values (`sm`)

sm_smoothed: Contains DCT-PLS predictions used to fill data gaps in the original soil moisture field. These values are also provided for cases where an observation was initially available (compare `gapmask`). In this case, they provide a smoothed version of the original data.

gapmask: (0 | 1) Indicates grid cells where a satellite observation is available (1), and where the interpolated (smoothed) values are used instead (0) in the 'sm' field.

frozenmask: (0 | 1) Indicates grid cells where ERA5 soil temperature is <0 °C. In this case, a linear interpolation over time is applied.

Additional information for each variable is given in the netCDF attributes.
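To make the role of `gapmask` concrete, the sketch below separates observed from gap-filled values using the flag semantics given above (1 = satellite observation available, 0 = interpolated value used). Plain Python lists stand in for the gridded arrays, and the sample numbers are invented for illustration only.

```python
def split_by_gapmask(sm, gapmask):
    """Split soil moisture values into (observed, gap-filled) lists."""
    observed = [v for v, m in zip(sm, gapmask) if m == 1]
    filled = [v for v, m in zip(sm, gapmask) if m == 0]
    return observed, filled


def filled_fraction(gapmask):
    """Share of cells whose 'sm' value comes from the gap-filling step."""
    return gapmask.count(0) / len(gapmask)


# Hypothetical values for four grid cells (m3/m3) and their gapmask flags.
sm = [0.21, 0.18, 0.30, 0.25]
gapmask = [1, 0, 1, 0]

observed, filled = split_by_gapmask(sm, gapmask)
fraction = filled_fraction(gapmask)
```

With xarray, the equivalent selection on an opened file would typically be written as `ds["sm"].where(ds["gapmask"] == 1)`, which keeps the grid shape and masks the gap-filled cells.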

Version Changelog

Changes in v09.1r1 (previous version was v09.1):

This version uses a novel uncertainty estimation scheme as described in Preimesberger et al. (2025).

Software to open netCDF files

These data can be read by any software that supports netCDF files conforming to the Climate and Forecast (CF) metadata conventions, such as:

Xarray (Python): https://github.com/pydata/xarray

netCDF4 (Python): https://unidata.github.io/netcdf4-python/

esa_cci_sm (Python): https://github.com/TUW-GEO/esa_cci_sm

Similar tools exist for other programming languages (Matlab, R, etc.)

Software packages and GIS tools can open netCDF files, e.g. CDO, NCO, QGIS, ArcGIS

You can also use the GUI software Panoply to view the contents of each file

References

Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: an independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data, 17, 4305–4329, https://doi.org/10.5194/essd-17-4305-2025, 2025.

Dorigo, W., Preimesberger, W., Stradiotti, P., Kidd, R., van der Schalie, R., van der Vliet, M., Rodriguez-Fernandez, N., Madelon, R., & Baghdadi, N. (2023). ESA Climate Change Initiative Plus - Soil Moisture Algorithm Theoretical Baseline Document (ATBD) Supporting Product Version 08.1 (version 1.1). Zenodo. https://doi.org/10.5281/zenodo.8320869

Garcia, D., 2010. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 54(4), pp.1167-1178. Available at: https://doi.org/10.1016/j.csda.2009.09.020

Rodell, M., Houser, P. R., Jambor, U., Gottschalck, J., Mitchell, K., Meng, C.-J., Arsenault, K., Cosgrove, B., Radakovich, J., Bosilovich, M., Entin, J. K., Walker, J. P., Lohmann, D., and Toll, D.: The Global Land Data Assimilation System, Bulletin of the American Meteorological Society, 85, 381 – 394, https://doi.org/10.1175/BAMS-85-3-381, 2004.

Related Records

The following records are all part of the ESA CCI Soil Moisture science data records community

1. ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77
Characteristic parameters extracted from the Jarkus dataset using the Jarkus...
data.4tu.nl
figshare.com
zip
Updated May 4, 2021
Cite
Christa van IJzendoorn (2021). Characteristic parameters extracted from the Jarkus dataset using the Jarkus Analysis Toolbox [Dataset]. http://doi.org/10.4121/14514213.v1
Explore at:
zip
Unique identifier
https://doi.org/10.4121/14514213.v1
Dataset updated
May 4, 2021
Dataset provided by
4TU.ResearchData
Authors
Christa van IJzendoorn
License
https://www.gnu.org/licenses/gpl-3.0.html
Description
This dataset presents the output of the application of the Jarkus Analysis Toolbox (JAT) to the Jarkus dataset. The Jarkus dataset is one of the most elaborate coastal datasets in the world and consists of coastal profiles of the entire Dutch coast, spaced about 250-500 m apart, which have been measured yearly since 1965. Different available definitions for extracting characteristic parameters from coastal profiles were collected and implemented in the JAT. The characteristic parameters allow stakeholders (e.g. scientists, engineers and coastal managers) to study the spatial and temporal variations in parameters like dune height, dune volume, dune foot, beach width and closure depth. This dataset includes a netCDF file (on the OPeNDAP server, see data link) that contains all characteristic parameters through space and time, and a distribution plot that gives an overview of each characteristic parameter. The Jarkus Analysis Toolbox and all scripts that were used to extract the characteristic parameters and create the distribution plots are available through GitHub (https://github.com/christavanijzendoorn/JAT). Example 5, which is included in the JAT, provides a Python script that shows how to load and work with the netCDF file. Documentation: https://jarkus-analysis-toolbox.readthedocs.io/.
(HS 2) Automate Workflows using Jupyter notebook to create Large Extent...
hydroshare.org
search.dataone.org
zip
Updated Oct 15, 2024
Cite
Young-Don Choi (2024). (HS 2) Automate Workflows using Jupyter notebook to create Large Extent Spatial Datasets [Dataset]. http://doi.org/10.4211/hs.a52df87347ef47c388d9633925cde9ad
Explore at:
zip (2.4 MB)
Unique identifier
https://doi.org/10.4211/hs.a52df87347ef47c388d9633925cde9ad
Dataset updated
Oct 15, 2024
Dataset provided by
HydroShare
Authors
Young-Don Choi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we used the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio extension for xarray. Rasterio is a Python library for reading and writing GeoTIFF and other raster formats. Xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save GeoTIFF as NetCDF. These procedures resulted in the creation of three HydroShare resources (HS 3, HS 4 and HS 5) for sharing state-scale LES datasets. Notably, due to licensing constraints with ArcGIS Pro, a commercial GIS software package, the Jupyter notebook development was undertaken on a Windows OS.
Observational Large Ensemble
dataverse.harvard.edu
search.dataone.org
Updated Jul 19, 2017
Cite
Karen McKinnon (2017). Observational Large Ensemble [Dataset]. http://doi.org/10.7910/DVN/7CPJPQ
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/7CPJPQ
Dataset updated
Jul 19, 2017
Dataset provided by
Harvard Dataverse
Authors
Karen McKinnon
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
These Python datasets contain the results presented in the above paper with regard to the variability in trends over North America during DJF due to sampling of internal variability. Two types of files are available.

The netCDF file contains samples from the synthetic ensemble of DJF temperatures over North America from 1966-2015. The synthetic ensemble is centered on the observed trend. Recentering the ensemble on the ensemble mean trend from the NCAR CESM1 LENS will create the Observational Large Ensemble, in which each sample can be viewed as a temperature history that could have occurred given various samplings of internal variability. The synthetic ensemble can also be recentered on any other estimate of the forced response to climate change. While the dataset covers both land and ocean, it has only been validated over land.

The second type of file, presented as Python datasets (.npz), contains the results presented in the McKinnon et al. (2017) reference. In particular, it contains the 50-year trends for both the observations and the NCAR CESM1 Large Ensemble that actually occurred, and that could have occurred given a different sampling of internal variability. The bootstrap results can be compared to the true spread across the NCAR CESM1 Large Ensemble for validation, as was done in the manuscript. Each of these files is named based on the observational dataset, variable, time span, and spatial domain. They contain:
- BETA: the empirical OLS trend
- BOOTSAMPLES: the OLS trends estimated after bootstrapping
- INTERANNUALVAR: the interannual variance in the data after modeling and removing the forced trend
- empiricalAR1: the empirical AR(1) coefficient estimated from the residuals around the forced trend

The first dimension of all variables is 42, which is a stack of the ensemble mean behavior (index 0), the forty members of the NCAR Large Ensemble (indices 1:40), and the observations (last index, -1). The second dimension is spatial. See latlon.npz for the latitude and longitude vectors. The third dimension, when present, is the bootstrap samples. We have saved 1000 bootstrap samples.
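The layout of the first dimension can be illustrated with a short indexing sketch. Labelled placeholders stand in for the real arrays, and the code assumes that "indices 1:40" denotes the inclusive range 1 to 40, which corresponds to the Python slice `1:41`.

```python
N_MEMBERS = 40  # members of the NCAR Large Ensemble, as stated above

# Placeholder stack with the documented ordering along the first dimension.
stack = (
    ["ensemble_mean"]
    + [f"member_{i}" for i in range(1, N_MEMBERS + 1)]
    + ["observations"]
)

ensemble_mean = stack[0]            # index 0
members = stack[1:N_MEMBERS + 1]    # indices 1..40 inclusive
observations = stack[-1]            # last index
```

With NumPy, the same slices would apply to an array loaded from one of the .npz files, for example `np.load(path)["BETA"][1:41]`, where `path` is whichever file is being inspected.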
Model output and data used for analysis
catalog.data.gov
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Model output and data used for analysis [Dataset]. https://catalog.data.gov/dataset/model-output-and-data-used-for-analysis
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
The modeled data in these archives are in the NetCDF format (https://www.unidata.ucar.edu/software/netcdf/). NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It is also a community standard for sharing scientific data. The Unidata Program Center supports and maintains netCDF programming interfaces for C, C++, Java, and Fortran. Programming interfaces are also available for Python, IDL, MATLAB, R, Ruby, and Perl. Data in netCDF format is:
• Self-Describing. A netCDF file includes information about the data it contains.
• Portable. A netCDF file can be accessed by computers with different ways of storing integers, characters, and floating-point numbers.
• Scalable. Small subsets of large datasets in various formats may be accessed efficiently through netCDF interfaces, even from remote servers.
• Appendable. Data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure.
• Sharable. One writer and multiple readers may simultaneously access the same netCDF file.
• Archivable. Access to all earlier forms of netCDF data will be supported by current and future versions of the software.

Pub_figures.tar.zip contains the NCL scripts for figures 1-5 and the Chesapeake Bay Airshed shapefile. The directory structure of the archive is ./Pub_figures/Fig#_data, where # is the figure number from 1-5.

EMISS.data.tar.zip contains two NetCDF files with the emission totals for the 2011ec and 2040ei emission inventories. The file names contain the year of the inventory, and the file header contains a description of each variable and the variable units.

EPIC.data.tar.zip contains the monthly mean EPIC data in NetCDF format for ammonium fertilizer application (files with ANH3 in the name) and soil ammonium concentration (files with NH3 in the name) for historical (Hist directory) and future (RCP-4.5 directory) simulations.

WRF.data.tar.zip contains mean monthly and seasonal data from the 36km downscaled WRF simulations in the NetCDF format for the historical (Hist directory) and future (RCP-4.5 directory) simulations.

CMAQ.data.tar.zip contains the mean monthly and seasonal data in NetCDF format from the 36km CMAQ simulations for the historical (Hist directory), future (RCP-4.5 directory) and future with historical emissions (RCP-4.5-hist-emiss directory).

This dataset is associated with the following publication: Campbell, P., J. Bash, C. Nolte, T. Spero, E. Cooter, K. Hinson, and L. Linker. Projections of Atmospheric Nitrogen Deposition to the Chesapeake Bay Watershed. Journal of Geophysical Research - Biogeosciences. American Geophysical Union, Washington, DC, USA, 12(11): 3307-3326, (2019).
Forcing files for the ECMWF Integrated Forecasting System (IFS) Single...
catalogue.ceda.ac.uk
data-search.nerc.ac.uk
Updated Mar 2, 2020
Cite
Hannah M. Christensen; Andrew Dawson; Christopher Holloway (2020). Forcing files for the ECMWF Integrated Forecasting System (IFS) Single Column Model (SCM) over Indian Ocean/Tropical Pacific derived from a 10-day high resolution simulation [Dataset]. https://catalogue.ceda.ac.uk/uuid/bf4fb57ac7f9461db27dab77c8c97cf2
Explore at:
Dataset updated
Mar 2, 2020
Dataset provided by
Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
Authors
Hannah M. Christensen; Andrew Dawson; Christopher Holloway
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Time period covered
Apr 6, 2009 - Apr 16, 2009
Variables measured
time, eastward_wind, northward_wind, surface_altitude, surface_temperature, surface_downward_latent_heat_flux, surface_downward_sensible_heat_flux, atmosphere hybrid sigma pressure coordinate
Description
This data set consists of initial conditions, boundary conditions and forcing profiles for the Single Column Model (SCM) version of the European Centre for Medium-range Weather Forecasts (ECMWF) model, the Integrated Forecasting System (IFS). The IFS SCM is freely available through the OpenIFS project, on application to ECMWF for a licence. The data were produced and tested for IFS CY40R1, but will be suitable for earlier model cycles, and also for future versions assuming no new boundary fields are required by a later model. The data are archived as single time-stamp maps in netCDF files. If the data are extracted at any lat-lon location and the desired timestamps concatenated (e.g. using netCDF operators), the resultant file is in the correct format for input into the IFS SCM.

The data cover the Tropical Indian Ocean/Warm Pool domain spanning 20S-20N, 42-181E. The data are available every 15 minutes from 6 April 2009 0100 UTC for a period of ten days. The total number of grid points over which an SCM can be run is 480 in the longitudinal direction, and 142 latitudinally. With over 68,000 independent grid points available for evaluation of SCM simulations, robust statistics of bias can be estimated over a wide range of boundary and climatic conditions.
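The figures quoted above can be checked with a few lines of arithmetic. The sketch also lists the 15-minute time stamps starting at 6 April 2009 0100 UTC; whether the archive includes a final time stamp exactly ten days after the start is not stated, so the count below covers the ten-day span without that end point (an assumption).

```python
from datetime import datetime, timedelta

# Grid dimensions as stated in the description.
N_LON, N_LAT = 480, 142
grid_points = N_LON * N_LAT  # consistent with "over 68,000" points

# 15-minute sampling over ten days, starting 6 April 2009 0100 UTC.
STEP = timedelta(minutes=15)
start = datetime(2009, 4, 6, 1, 0)
steps_per_day = timedelta(days=1) // STEP
n_intervals = 10 * steps_per_day

# Time stamps covering the span, excluding the end point (assumption).
timestamps = [start + i * STEP for i in range(n_intervals)]
```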

The initial conditions and forcing profiles were derived by coarse-graining high resolution (4 km) simulations produced as part of the NERC Cascade project, dataset ID xfhfc (also available on CEDA). The Cascade dataset is archived once an hour. The dataset was linearly interpolated in time to produce the 15-minute resolution required by the SCM. The resolution of the coarse-grained data corresponds to the IFS T639 reduced gaussian grid (approx 32 km). The boundary conditions are as used in the operational IFS at resolution T639. The coarse graining procedure by which the data were produced is detailed in Christensen, H. M., Dawson, A. and Holloway, C. E., 'Forcing Single Column Models using High-resolution Model Simulations', in review, Journal of Advances in Modeling Earth Systems (JAMES).
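The step from hourly archived values to the 15-minute resolution can be illustrated with a minimal linear interpolation in time. This is only a sketch of the technique for a single series; it is not the authors' processing code, and the function name and sample values are hypothetical.

```python
def interpolate_hourly_to_15min(hourly):
    """Linearly interpolate an hourly series to 15-minute resolution.

    Each pair of consecutive hourly values yields four values (at 0, 15, 30
    and 45 minutes past the first hour); the last hourly value is appended.
    """
    if not hourly:
        return []
    out = []
    for left, right in zip(hourly, hourly[1:]):
        for k in range(4):
            w = k / 4  # fractional position between the two hourly values
            out.append((1 - w) * left + w * right)
    out.append(hourly[-1])
    return out


# Hypothetical hourly values at 01:00, 02:00 and 03:00.
series_15min = interpolate_hourly_to_15min([0.0, 4.0, 2.0])
```

For n hourly values the result has 4(n - 1) + 1 entries, so the original hourly values are preserved at every fourth position.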

For full details of the parent Cascade simulation, see Holloway et al (2012). In brief, the simulations were produced using the limited-area setup of the MetUM version 7.1 (Davies et al, 2005). The model is semi-Lagrangian and non-hydrostatic. Initial conditions were specified from the ECMWF operational analysis. A 12 km parametrised convection run was first produced over a domain 1 degree larger in each direction, with lateral boundary conditions relaxed to the ECMWF operational analysis. The 4 km run was forced using lateral boundary conditions computed from the 12 km parametrised run, via a nudged rim of 8 model grid points. The model has 70 terrain-following hybrid levels in the vertical, with vertical resolution ranging from tens of metres in the boundary layer, to 250 m in the free troposphere, and with model top at 40 km. The time step was 30 s.

The Cascade dataset did not include archived soil variables, though surface sensible and latent heat fluxes were archived. When using the dataset, it is therefore recommended that the IFS land surface scheme be deactivated and the SCM forced using the surface fluxes instead. The first day of Cascade data exhibited evidence of spin-up. It is therefore recommended that the first day be discarded, and the data used from April 7 - April 16.

The software used to produce this dataset is freely available to interested users:
1. "cg-cascade": NCL software to produce OpenIFS forcing fields from a high-resolution MetUM simulation and the necessary ECMWF boundary files. https://github.com/aopp-pred/cg-cascade

Software to facilitate the use of this dataset is also available:
2. "scmtiles": Python software to deploy many independent SCMs over a domain. https://github.com/aopp-pred/scmtiles
3. "openifs-scmtiles": Python software to deploy the OpenIFS SCM using scmtiles. https://github.com/aopp-pred/openifs-scmtiles
Sub-Rayleigh to supershear fracture transition in long Propagation Saw Tests...
data.europa.eu
envidat.ch
octet stream, pdf +1
Updated Nov 24, 2025
Cite
EnviDat (2025). Sub-Rayleigh to supershear fracture transition in long Propagation Saw Tests [Dataset]. https://data.europa.eu/data/datasets/938ce90f-e917-4838-af40-e2e161a22122-envidat?locale=bg
Explore at:
Available download formats: octet stream(280950), octet stream(138010), octet stream(217503), pdf(117573), octet stream, octet stream(95474900), octet stream(20239), tiff(256122), octet stream(63991645), octet stream(3092), octet stream(3309), octet stream(5763)
Dataset updated
Nov 24, 2025
Dataset authored and provided by
EnviDat
License
http://dcat-ap.ch/vocabulary/licenses/terms_by
Description
This dataset contains the experimental results described in Bergfeld et al. (2025). It includes three Propagation Saw Test (PST) experiments, each approximately 9 m long, performed side-by-side on a 37° slope. For each PST, we provide the full field of view along the crack-propagation direction. For the second and third PSTs, we additionally provide close-up recordings focused on the weak layer where cracking occurred. All data are supplied as netCDF files containing displacement and strain measurements derived from Digital Image Correlation (DIC) analysis. Metadata describing dimensions and units are stored directly within the netCDF files. We recommend using the xarray package in Python to read and work with these datasets. All figures presented in Bergfeld et al. (2025) can be reproduced using the included Python scripts. Information about the snowpack is provided in PDF, pickle, and CAAML file formats.
Northern elephant seal tracking and diving – raw and curated data
data.niaid.nih.gov
datasetcatalog.nlm.nih.gov
zip
Updated May 14, 2025
+ more versions
Cite
Daniel Costa; Rachel Holser; Theresa Keates; Taiki Adachi; Roxanne Beltran; Cory Champagne; Crocker Daniel; Arina Favilla; Melinda Fowler; Juan Pablo Gallo-Reynoso; Chandra Goetsch; Jason Hassrick; Luis Hückstädt; Jessica Kendall-Bar; Sarah Kienle; Carey Kuhn; Jennifer Maresh; Sara Maxwell; Birgitte McDonald; Elizabeth McHuron; Patricia Morris; Yasuhiko Naito; Logan Pallin; Sarah Peterson; Patrick Robinson; Samantha Simmons; Akinori Takahashi; Nicole Teuschel; Michael Tift; Yann Tremblay; Stella Villegas-Amtman; Ken Yoda (2025). Northern elephant seal tracking and diving – raw and curated data [Dataset]. http://doi.org/10.7291/D10D61
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.7291/D10D61
Dataset updated
May 14, 2025
Dataset provided by
Centro de Investigación en Alimentación y Desarrollo
University of California, Santa Cruz
United States Geological Survey
Sonoma State University
NOAA National Marine Fisheries Service
University of North Carolina Wilmington
Springfield College
West Chester University
University of St Andrews
Baylor University
Nagoya University
Scripps Institution of Oceanography
Moss Landing Marine Laboratories
ICF International (United States)
University of Washington
University of Exeter
Marine Biodiversity Exploitation and Conservation
National Institute of Polar Research
Consolidated Safety Services-Dynamac (United States)
Authors
Daniel Costa; Rachel Holser; Theresa Keates; Taiki Adachi; Roxanne Beltran; Cory Champagne; Crocker Daniel; Arina Favilla; Melinda Fowler; Juan Pablo Gallo-Reynoso; Chandra Goetsch; Jason Hassrick; Luis Hückstädt; Jessica Kendall-Bar; Sarah Kienle; Carey Kuhn; Jennifer Maresh; Sara Maxwell; Birgitte McDonald; Elizabeth McHuron; Patricia Morris; Yasuhiko Naito; Logan Pallin; Sarah Peterson; Patrick Robinson; Samantha Simmons; Akinori Takahashi; Nicole Teuschel; Michael Tift; Yann Tremblay; Stella Villegas-Amtman; Ken Yoda
License
https://spdx.org/licenses/CC0-1.0.html
Description
Northern elephant seals (Mirounga angustirostris) have been integral to the development and progress of biologging technology and movement data analysis. Adult female elephant seals at Año Nuevo State Park and other colonies along the west coast of North America were tracked annually from 2004 to 2020 for a total of 653 instrument deployments and 561 recoveries. These high-resolution diving and location data have been compiled, curated, and processed. This repository has netCDF files containing the raw tracking and diving data. The processed data are available in a second repository (https://doi.org/10.7291/D18D7W).

Methods

These data were collected from biotelemetry devices attached to adult female northern elephant seals (Mirounga angustirostris) from 2004 to 2020. The instruments collected locations (Argos and/or GPS) and continuously recorded depth throughout the animals' trips. Data were processed in MATLAB and R using custom code, the IKNOS package for dive data processing, and the aniMotum package for track processing. The details of data collection and processing are documented in the data descriptor paper associated with this dataset. In addition, all code used to process the data is available on GitHub and Zenodo.

The data presented here are freely available for use under CC0 (Creative Commons Zero). Users are encouraged to cite the data descriptor (DOI: 10.1038/s41597-024-04084-4) and this Dryad repository. We encourage users to reach out to the data owner for richer insight into the dataset. Subsets of this dataset have been made available through other projects and data portals, and we caution users that these are not independent northern elephant seal datasets. These include the AniBOS/MEOP data portal (https://www.meop.net/database/meop-databases/), the Animal Tracking Network (ATN) (https://portal.atn.ioos.us/), Movebank (https://www.movebank.org/cms/movebank-main), and MegaMove (https://megamove.org/data-portal/).

Additional data about the instrumented animals, such as morphometrics, demographics, and other biologging data (e.g., acceleration, jaw motion, temperature), are available for many of these animals but are beyond the scope of this dataset. For more information, contact the author at rholser@ucsc.edu.

Sampling Biases

Generally, we have been careful to select healthy animals for sedation and instrumentation. For animals deployed at Año Nuevo (most of the tracks), typically individuals with known site fidelity to the colony were selected and if age was known it was usually restricted to 4- to 12-year-olds. Furthermore, the data reported here span two decades of work. During this time, different studies prompted additional non-random population sampling. Examples include focusing on one age for a year, repeat tracking the same individuals two trips in a row, and intentionally selecting previously tracked females who had used a coastal foraging strategy. Many individuals in the dataset have been tracked multiple times. We strongly encourage researchers to evaluate the metadata provided carefully and contact the author with inquiries at rholser@ucsc.edu.

Code Availability

All code written for data processing, along with NetCDF import code for MATLAB, R, and Python, is available on GitHub (https://github.com/rholser/NES_TrackDive_DataProcessing) and Zenodo (https://doi.org/10.5281/zenodo.12511548). Extensive documentation of functions and scripts is also provided there. In addition, the authors have provided code in Python, R, and MATLAB for basic access to the netCDF files (https://github.com/rholser/NES-Read-netCDF). These scripts should serve as a model to enable users unfamiliar with the format to access the data.
ERA-NUTS: time-series based on C3S ERA5 for European regions
zenodo.org
nc, zip
Updated Aug 4, 2022
+ more versions
Cite
M. De Felice; K. Kavvadias (2022). ERA-NUTS: time-series based on C3S ERA5 for European regions [Dataset]. http://doi.org/10.5281/zenodo.2650191
Explore at:
Available download formats: zip, nc
Unique identifier
https://doi.org/10.5281/zenodo.2650191
Dataset updated
Aug 4, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
M. De Felice; K. Kavvadias
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
# ERA-NUTS (1980-2018)

This dataset contains a set of time-series of meteorological variables based on Copernicus Climate Change Service (C3S) ERA5 reanalysis. The data files can be downloaded from here while notebooks and other files can be found on the associated Github repository.

This data has been generated with the aim of providing hourly time-series of the meteorological variables commonly used for power system modelling and, more generally, for studies on energy systems.

An example of the analysis that can be performed with ERA-NUTS is shown in this video.

Important: this dataset is still a work in progress; we will add more analysis and variables in the near future. If you spot an error or something strange in the data, please tell us by sending an email or opening an Issue in the associated Github repository.

## Data
The time-series have hourly/daily/monthly frequency and are aggregated following the NUTS 2016 classification. NUTS (Nomenclature of Territorial Units for Statistics) is a European Union standard for referencing the subdivisions of countries (member states, candidate countries and EFTA countries).

This dataset contains NUTS0/1/2 time-series for the following variables obtained from the ERA5 reanalysis data (in brackets the name of the variable on the Copernicus Data Store and its unit measure):

- t2m: 2-meter temperature (`2m_temperature`, Celsius degrees)
- ssrd: Surface solar radiation (`surface_solar_radiation_downwards`, Watt per square meter)
- ssrdc: Surface solar radiation clear-sky (`surface_solar_radiation_downward_clear_sky`, Watt per square meter)
- ro: Runoff (`runoff`, millimeters)

There are also a set of derived variables:
- ws10: Wind speed at 10 meters (derived from `10m_u_component_of_wind` and `10m_v_component_of_wind`, meters per second)
- ws100: Wind speed at 100 meters (derived from `100m_u_component_of_wind` and `100m_v_component_of_wind`, meters per second)
- CS: Clear-Sky index (the ratio between the surface solar radiation and the clear-sky surface solar radiation)
- HDD/CDD: Heating/Cooling Degree Days (derived from 2-meter temperature following the EUROSTAT definition)
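
As an illustration of two of the derived variables above, the following minimal sketch computes wind speed from the u and v components and the clear-sky index as a ratio. The function names and the handling of a zero clear-sky value (returning `None`, e.g. at night) are assumptions made for this example; they are not taken from the dataset's own processing code.

```python
import math


def wind_speed(u, v):
    """Wind speed (m/s) from the eastward (u) and northward (v) components."""
    return math.hypot(u, v)


def clear_sky_index(ssrd, ssrdc):
    """Ratio of surface solar radiation to its clear-sky counterpart.

    Returning None when the clear-sky value is not positive (e.g. at night)
    is an assumption of this sketch, not the dataset's documented behaviour.
    """
    if ssrdc <= 0:
        return None
    return ssrd / ssrdc


# Hypothetical sample values, for illustration only.
ws10 = wind_speed(3.0, 4.0)
cs = clear_sky_index(300.0, 600.0)
print(ws10, cs)
```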

For each variable we have 350 599 hourly samples (from 01-01-1980 00:00:00 to 31-12-2019 23:00:00) for 34/115/309 regions (NUTS 0/1/2).

The data is provided in two formats:

- NetCDF version 4 (all the variables hourly and CDD/HDD daily). NOTE: the variables are stored as `int16` type using a `scale_factor` of 0.01 to minimise the size of the files.
- Comma Separated Value ("single index" format for all the variables and the time frequencies and "stacked" only for daily and monthly)

All the CSV files are stored in a zipped file for each variable.
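
To make the `int16` packing mentioned above concrete, here is a minimal sketch of how a `scale_factor` of 0.01 maps physical values to 16-bit integers and back. The helper names `pack` and `unpack` are hypothetical, and the sketch ignores any `add_offset` or fill value, which the description does not mention. Libraries such as `xarray` and `netCDF4` normally apply the scale factor automatically when reading, so this is only meant to show what happens underneath.

```python
SCALE_FACTOR = 0.01                   # value stated in the dataset description
INT16_MIN, INT16_MAX = -32768, 32767  # range of a signed 16-bit integer


def pack(value):
    """Convert a physical value to the integer that would be stored on disk."""
    packed = round(value / SCALE_FACTOR)
    if not INT16_MIN <= packed <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in int16 with this scale factor")
    return packed


def unpack(packed):
    """Recover the physical value from the stored integer."""
    return packed * SCALE_FACTOR


stored = pack(21.37)        # hypothetical temperature in degrees Celsius
restored = unpack(stored)
print(stored, restored)
```

With this scale factor alone, the representable range is roughly -327.68 to 327.67 at a resolution of 0.01, which is consistent with the note that hourly values are rounded to two decimal digits.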

## Methodology

The time-series have been generated using the following workflow:

1. The NetCDF files are downloaded from the Copernicus Data Store from the ERA5 hourly data on single levels from 1979 to present dataset
2. The data is read in R with the climate4r packages and aggregated using the function `/get_ts_from_shp` from panas. All the variables are aggregated at the NUTS boundaries using the average except for the runoff, which consists of the sum of all the grid points within the regional/national borders.
3. The derived variables (wind speed, CDD/HDD, clear-sky) are computed and all the CSV files are generated using R
4. The NetCDF files are created using `xarray` in Python 3.7.

NOTE: air temperature, solar radiation, runoff and wind speed hourly data have been rounded to two decimal digits.

## Example notebooks

In the folder `notebooks` on the associated Github repository there are two Jupyter notebooks which show how to work effectively with the NetCDF data in `xarray` and how to visualise it in several ways using matplotlib or the enlopy package.

There are currently two notebooks:

- exploring-ERA-NUTS: it shows how to open the NetCDF files (with Dask), how to manipulate and visualise them.
- ERA-NUTS-explore-with-widget: explore the datasets interactively with Jupyter and ipywidgets.

The notebook `exploring-ERA-NUTS` is also available rendered as HTML.

## Additional files

In the folder `additional files` on the associated Github repository there is a map showing the spatial resolution of the ERA5 reanalysis and a CSV file specifying the number of grid points with respect to each NUTS0/1/2 region.

## License

This dataset is released under CC-BY-4.0 license.
Data from: Dataset for "The impact of lake shape and size on lake breezes...
research.science.eus
ekoizpen-zientifikoa.ehu.eus
+1more
Updated 2023
Cite
Chatain, Audrey; Rafkin, Scot C.R.; Soto, Alejandro; Moisan, Enora; Lora, Juan M.; Le Gall, Alice; Hueso, Ricardo; Spiga, Aymeric (2023). Dataset for "The impact of lake shape and size on lake breezes and air-lake exchanges on Titan" [Dataset]. https://research.science.eus/documentos/67321dfcaea56d4af048502a
Explore at:
Dataset updated
2023
Authors
Chatain, Audrey; Rafkin, Scot C.R.; Soto, Alejandro; Moisan, Enora; Lora, Juan M.; Le Gall, Alice; Hueso, Ricardo; Spiga, Aymeric
Description
Code and data presented in the paper "The impact of lake shape and size on lake breezes and air-lake exchanges on Titan", published in Icarus in 2024 (https://doi.org/10.1016/j.icarus.2023.115925).

The following are made available:

- the Fortran source code of the model initialization module modified for this simulation work, "module_initialize_Titan_lakebreeze3d_xy_shoreline.F"

- the input files used to run the simulations, in "inputs/"

- a list of the simulations done, "simus_done_for_paper3D.pdf", and the netCDF outputs:
  "run-##_y0_tsol4.nc.gz" --> slices at a given y (at the center), 4th tsol [for 2D and 3D simulations]
  "run-##_y0_tsol3.nc.gz" --> slices at a given y (at the center), 3rd tsol [for 2D and 3D simulations]
  "run-##_z0_tsol4.nc.gz" --> slices at a given z (at the surface), 4th tsol [only for 3D simulations]
  "run-##_z200_tsol4.nc.gz" --> slices at a given z (at ~200 m), 4th tsol [only for 3D simulations]
  "run-##_tsol4_2am.nc.gz" --> total simulation output at a given time (2am on 4th tsol) [only for 3D simulations]
  "run-##_tsol4_2pm.nc.gz" --> total simulation output at a given time (2pm on 4th tsol) [only for 3D simulations]

- the Python codes to plot figures from the netCDF output files, in "postprocessing_python/":
  "mtwrf_analysis_1D_t.py" --> plot variables with time at given (x,y,z) [for 2D and 3D simulations]
  "mtwrf_analysis_2D_xt.py" --> plot variables with (x,t) at given (y,z) [for 2D and 3D simulations]
  "mtwrf_analysis_2D_xz.py" --> plot variables with (x,z) at given (y,t) [for 2D and 3D simulations]
  "mtwrf_analysis_2D_xy.py" --> plot variables with (x,y) at given (z,t) [only for 3D simulations]
  The same as the previous ones, but plotting along a different x-axis (rotated from the one of the netCDF files) [only for 3D simulations]:
  "mtwrf_analysis_1D_t_diagonal.py"
  "mtwrf_analysis_2D_xt_diagonal.py"
  "mtwrf_analysis_2D_xz_diagonal.py"

- the Matlab variables and figures from the analysis of the simulated lake breeze dimensions, in "postprocessing_matlab/"
Data from: Calculated Leached Nitrogen from Septic Systems in Wisconsin,...
catalog.data.gov
data.usgs.gov
+1more
Updated Nov 20, 2025
+ more versions
Cite
U.S. Geological Survey (2025). Calculated Leached Nitrogen from Septic Systems in Wisconsin, 1850-2010 [Dataset]. https://catalog.data.gov/dataset/calculated-leached-nitrogen-from-septic-systems-in-wisconsin-1850-2010
Explore at:
Dataset updated
Nov 20, 2025
Dataset provided by
U.S. Geological Survey
Area covered
Wisconsin
Description
This data release contains a netCDF file containing decadal estimates of nitrate leached from septic systems (kilograms per hectare per year, or kg/ha/yr) in the state of Wisconsin from 1850 to 2010, as well as the Python code and supporting files used to create the netCDF file. The netCDF file is used as an input to a Nitrate Decision Support Tool for the State of Wisconsin (GW-NDST; Juckem and others, 2024). The dataset was constructed starting with 1990 census records, which included responses about households using septic systems for waste disposal. The fraction of population using septic systems in 1990 was aggregated at the county scale and applied backward in time for each decade from 1850 to 1980. For decades from 1990 to 2010, the fraction of population using septic systems was computed on the finer resolution census block-group scale. Each decadal estimate of the fraction of population using septic systems was then multiplied by 4.13 kilograms per person per year of leached nitrate to estimate the per-area load of nitrate below the root zone. The data release includes a Python notebook used to process the input datasets included in the data release, shapefiles created (or modified) using the Python notebook, and the final netCDF file.
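
The load calculation described above can be sketched roughly as follows. Only the 4.13 kg per person per year figure comes from the description; the use of a population density in persons per hectare to reach a per-area load, and the function and argument names, are assumptions made for this illustration and may differ from the notebook in the data release.

```python
# Per-capita leached nitrate stated in the dataset description (kg/person/yr).
NITRATE_KG_PER_PERSON_YEAR = 4.13


def septic_nitrate_load(septic_fraction, persons_per_hectare):
    """Estimate leached nitrate (kg/ha/yr) for one area and decade.

    septic_fraction: fraction of the population using septic systems (0 to 1).
    persons_per_hectare: population density; an assumed input for this sketch.
    """
    if not 0.0 <= septic_fraction <= 1.0:
        raise ValueError("septic_fraction must be between 0 and 1")
    return septic_fraction * persons_per_hectare * NITRATE_KG_PER_PERSON_YEAR


# Hypothetical area: 25 % of residents on septic systems, 2 persons per hectare.
example_load = septic_nitrate_load(0.25, 2.0)
print(example_load)
```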
UIUC Mobile Sounding Data
data.ucar.edu
netcdf
Updated Oct 7, 2025
Cite
Andrew Janiszeski (2025). UIUC Mobile Sounding Data [Dataset]. http://doi.org/10.5065/D6X63KCG
Explore at:
Available download formats: netcdf
Unique identifier
https://doi.org/10.5065/D6X63KCG
Dataset updated
Oct 7, 2025
Authors
Andrew Janiszeski
Time period covered
Jan 8, 2017 - Mar 9, 2017
Area covered

Description
This dataset contains data from 34 successful University of Illinois Urbana-Champaign (UIUC) Mobile Radiosonde launches conducted during the SNOWIE field campaign. Each successful launch has its own netCDF file, named year-month-day-time-location.nc. The data in each file include: temperature (Celsius), relative humidity, time of sample (in seconds past launch time), height AGL (m), wind speed (m/s), wind direction (degrees), and pressure (mb) as measured by the radiosonde. The coordinates and altitude above MSL (m) corresponding to where each sounding was launched are written in the attributes of each file. Surface wind speed and direction are taken from the previous hourly observation at KBOI for the Boise sites and at KEUL for the Caldwell site. The one exception was the launch on 16 February 2017 at Caldwell, for which KBOI observations were used instead. Included with each dataset order is a Python script (netcdfreadout.py) for easily viewing the netCDF data files.
Forcing data, evaluation data, model output and analysis scripts used in...
zenodo.org
application/gzip
Updated Jul 18, 2022
Cite
David Martin-Belda (2022). Forcing data, evaluation data, model output and analysis scripts used in LPJ-GUESS/LSM description paper [Dataset]. http://doi.org/10.5281/zenodo.5813886
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.5813886
Dataset updated
Jul 18, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
David Martin-Belda
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This archive contains:

- model_output.tar.gz: Model output
- extracted_fluxes.tar.gz: Sensible heat, latent heat and CO2 fluxes extracted from the FLUXNET2015 dataset, used to evaluate the model output
- extracted_climate.tar.gz: Climate data extracted from the FLUXNET2015 dataset, used to force the simulations
- scripts.tar.gz: Python scripts used to analyze the data and produce the results reported in the model description paper

The climate forcing data has been extracted from the FLUXNET2015 dataset [1]. Input and output data is stored in netCDF files. Evaluation data is stored in python serialized files (pickle).

[1] Pastorello, G., Trotta, C., Canfora, E. et al. The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Sci Data 7, 225 (2020). https://doi.org/10.1038/s41597-020-0534-3
CMAQ Grid Mask Files for 12km CONUS - US States and NOAA Climate Regions
dataverse-staging.rdmc.unc.edu
datasearch.gesis.org
Updated Dec 12, 2019
Cite
UNC Dataverse (2019). CMAQ Grid Mask Files for 12km CONUS - US States and NOAA Climate Regions [Dataset]. http://doi.org/10.15139/S3/XDYYB9
Explore at:
Unique identifier
https://doi.org/10.15139/S3/XDYYB9
Dataset updated
Dec 12, 2019
Dataset provided by
UNC Dataverse
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
United States
Description
Data Summary: US states grid mask file and NOAA climate regions grid mask file, both compatible with the 12US1 modeling grid domain. Note: The datasets are on a Google Drive. The metadata associated with this DOI contain the link to the Google Drive folder and instructions for downloading the data. These files can be used with CMAQ-ISAMv5.3 to track state- or region-specific emissions. See Chapter 11 and Appendix B.4 in the CMAQ User's Guide for further information on how to use the ISAM control file with GRIDMASK files. The files can also be used for state- or region-specific scaling of emissions using the CMAQv5.3 DESID module. See the DESID Tutorial and Appendix B.4 in the CMAQ User's Guide for further information on how to use the Emission Control File to scale emissions in predetermined geographical areas.

File Location and Download Instructions: Link to GRIDMASK files; link to README text file with information on how these files were created.

File Format: The grid masks are stored as netCDF-formatted files using I/O API data structures (https://www.cmascenter.org/ioapi/). Information on the model projection and grid structure is contained in the header information of the netCDF file. The output files can be opened and manipulated using I/O API utilities (e.g. M3XTRACT, M3WNDW) or other software programs that can read and write netCDF-formatted files (e.g. Fortran, R, Python).

File descriptions: These GRIDMASK files can be used with the 12US1 modeling grid domain (grid origin x = -2556000 m, y = -1728000 m; N columns = 459, N rows = 299).

GRIDMASK_STATES_12US1.nc - This file contains 49 variables for the 48 states in the conterminous U.S. plus DC. Each state variable (e.g., AL, AZ, AR, etc.) is a 2D array (299 x 459) providing the fractional area of each grid cell that falls within that state.

GRIDMASK_CLIMATE_REGIONS_12US1.nc - This file contains 9 variables for 9 NOAA climate regions based on the Karl and Koss (1984) definition of climate regions. Each climate region variable (e.g., CLIMATE_REGION_1, CLIMATE_REGION_2, etc.) is a 2D array (299 x 459) providing the fractional area of each grid cell that falls within that climate region.

NOAA climate regions:
CLIMATE_REGION_1: Northwest (OR, WA, ID)
CLIMATE_REGION_2: West (CA, NV)
CLIMATE_REGION_3: West North Central (MT, WY, ND, SD, NE)
CLIMATE_REGION_4: Southwest (UT, AZ, NM, CO)
CLIMATE_REGION_5: South (KS, OK, TX, LA, AR, MS)
CLIMATE_REGION_6: Central (MO, IL, IN, KY, TN, OH, WV)
CLIMATE_REGION_7: East North Central (MN, IA, WI, MI)
CLIMATE_REGION_8: Northeast (MD, DE, NJ, PA, NY, CT, RI, MA, VT, NH, ME) + Washington, D.C.*
CLIMATE_REGION_9: Southeast (VA, NC, SC, GA, AL, FL)

*Note that Washington, D.C. is not included in any of the climate regions on the website but was included with the “Northeast” region for the generation of this GRIDMASK file.
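
Because each mask holds the fractional area of every grid cell inside a state or region, a regional total can be obtained by weighting a gridded field with the mask and summing. The sketch below illustrates this on a toy 2 x 2 grid rather than the real 299 x 459 arrays; the function name, the emissions values and the mask values are hypothetical, and this is not how CMAQ-ISAM or DESID apply the masks internally.

```python
def masked_total(field, mask):
    """Sum a 2D field weighted by a fractional-area mask of the same shape."""
    if len(field) != len(mask) or any(len(f) != len(m) for f, m in zip(field, mask)):
        raise ValueError("field and mask must have the same shape")
    return sum(
        value * fraction
        for field_row, mask_row in zip(field, mask)
        for value, fraction in zip(field_row, mask_row)
    )


# Toy gridded emissions (arbitrary units per cell) and a toy state mask:
# the first cell lies fully inside the state, the second one half inside.
emissions = [[10.0, 20.0],
             [30.0, 40.0]]
state_mask = [[1.0, 0.5],
              [0.0, 0.0]]

state_total = masked_total(emissions, state_mask)
print(state_total)
```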
Processed ADCP Current Depth Profiles, Flow Classification, and Power Law...
catalog.data.gov
mhkdr.openei.org
+1more
Updated Aug 31, 2025
+ more versions
Cite
Sandia National Laboratories (2025). Processed ADCP Current Depth Profiles, Flow Classification, and Power Law Parameters at Tidal Energy Sites [Dataset]. https://catalog.data.gov/dataset/processed-adcp-current-depth-profiles-flow-classification-and-power-law-parameters-at-tida
Explore at:
Dataset updated
Aug 31, 2025
Dataset provided by
Sandia National Laboratories
Description
This dataset contains processed acoustic Doppler current profiler (ADCP) measurements from twenty energetic tidal energy sites in the United States, Scotland, and New Zealand, compiled for the 2025 publication Current Depth Profile Characterization for Tidal Energy Development (linked below). Measurements were sourced from peer-reviewed literature, the Marine and Hydrokinetic Data Repository, EMEC, and NOAA's C-MIST database, and were selected for sites with depth-averaged current speeds exceeding 1 m/s. Data span a range of tidal cycles, depths (5-70 m), and flow regimes, and have been quality-controlled, filtered, and transformed into principal flood and ebb flow directions. Each netCDF file corresponds to a single site, with file names based on the site codes defined in the publication. The dataset classifies current depth profiles by shape, reports their prevalence by flow regime, and provides fitted power law parameters for monotonic profiles, along with metrics for non-monotonic profiles. Detailed descriptions of variables, units, and file naming conventions are provided in the dataset README. The submission complies with FAIR data principles: it is findable through the open-access PRIMRE Marine and Hydrokinetic Data Repository with a DOI; accessible via self-describing netCDF files readable in open-source tools such as Python and R; interoperable for integration with other applications and databases; and reusable through comprehensive documentation.
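
As a rough illustration of what fitting power law parameters to a monotonic profile can look like, the sketch below fits u(z) = a * z**alpha by least squares in log-log space. The functional form, the function name and the synthetic profile are assumptions of this example; the exact parameterization and fitting procedure used in the publication are not given in this description.

```python
import math


def fit_power_law(heights, speeds):
    """Fit speeds = a * heights**alpha via linear regression on the logarithms.

    heights: heights above the seabed in metres (must be positive).
    speeds: current speeds in m/s (must be positive).
    Returns the tuple (a, alpha).
    """
    xs = [math.log(z) for z in heights]
    ys = [math.log(u) for u in speeds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    a = math.exp(mean_y - alpha * mean_x)
    return a, alpha


# Synthetic profile generated from a = 1.5 and alpha = 1/7, for illustration only.
heights = [1.0, 2.0, 4.0, 8.0, 16.0]
speeds = [1.5 * z ** (1.0 / 7.0) for z in heights]
a_fit, alpha_fit = fit_power_law(heights, speeds)
print(a_fit, alpha_fit)
```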
Data from: Tidal Energy Resource Characterization, Bottom Lander...
catalog.data.gov
Updated Jan 20, 2025
+ more versions
Cite
National Renewable Energy Laboratory (2025). Tidal Energy Resource Characterization, Bottom Lander Measurements, Cook Inlet, AK, 2021 [Dataset]. https://catalog.data.gov/dataset/tidal-energy-resource-characterization-bottom-lander-measurements-cook-inlet-ak-2021-7c225
Explore at:
Dataset updated
Jan 20, 2025
Dataset provided by
National Renewable Energy Laboratory
Area covered
Cook Inlet
Description
These datasets are from tidal resource characterization measurements collected on the Terrasond High Energy Oceanographic Mooring (THEOM) from 1 July 2021 to 30 August 2021 (60 days) in Cook Inlet, Alaska. The lander was deployed at 60.7207031 N, 151.4294998 W in ~50 m of water. The dataset contains raw and processed data from the following two instruments: A Nortek Signature 500 kHz acoustic Doppler current profiler (ADCP). Data were recorded at 4 Hz in the beam coordinate system from all 5 beams. Processed data have been averaged into 5-minute bins and converted to the East-North-Up (ENU) coordinate system. A Nortek Vector acoustic Doppler velocimeter (ADV). Data were recorded at 8 Hz in the beam coordinate system. Processed data have been averaged into 5-minute bins and converted to the Streamwise - Cross-stream - Vertical (Principal) coordinate system. Turbulence statistics were calculated from 5-minute bins, with an FFT length equal to the bin length, and saved in the processed dataset. Data were read and analyzed using the DOLfYN (version 1.0.2) Python package and saved in MATLAB (.mat) and netCDF (.nc) file formats. Files containing analyzed data (".b1") were standardized using the TSDAT (version 0.4.2) Python package. NetCDF files can be opened using DOLfYN (e.g., `dat = dolfyn.load("*.nc")`) or the xarray Python package (e.g., `dat = xarray.open_dataset("*.nc")`). All distances are in meters (e.g., depth, range, etc.), and all velocities are in m/s. See the DOLfYN documentation linked in the submission, and/or the Nortek documentation for additional details.
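
The 5-minute averaging mentioned above can be pictured with the minimal sketch below, which averages a regularly sampled series into fixed-length bins. It is plain Python and not the DOLfYN implementation; the function name is hypothetical, and dropping a trailing partial bin is an assumption of this example.

```python
def bin_average(samples, sample_rate_hz, bin_seconds=300):
    """Average a regularly sampled series into consecutive fixed-length bins.

    samples: sequence of values recorded at a constant rate.
    sample_rate_hz: sampling frequency, e.g. 4 for the ADCP or 8 for the ADV.
    bin_seconds: bin length in seconds (300 s = 5 minutes).
    A trailing partial bin is discarded (an assumption of this sketch).
    """
    per_bin = int(sample_rate_hz * bin_seconds)
    return [
        sum(samples[start:start + per_bin]) / per_bin
        for start in range(0, len(samples) - per_bin + 1, per_bin)
    ]


# Ten minutes of synthetic 4 Hz data: 1.0 m/s for five minutes, then 3.0 m/s.
velocity = [1.0] * 1200 + [3.0] * 1200
binned = bin_average(velocity, sample_rate_hz=4)
print(binned)
```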
xesmf netcdf files for testing
figshare.com
application/x-gzip
Updated Feb 9, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Raphael Dussin (2025). xesmf netcdf files for testing [Dataset]. http://doi.org/10.6084/m9.figshare.28378283.v1
Explore at:
Available download formats: application/x-gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.28378283.v1
Dataset updated
Feb 9, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Raphael Dussin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Testing files for the xesmf remapping package.
QLKNN11D training set
data.niaid.nih.gov
zenodo.org
+1more
Updated Jun 8, 2023
Cite
Karel Lucas van de Plassche; Jonathan Citrin (2023). QLKNN11D training set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8011147
Explore at:
Dataset updated
Jun 8, 2023
Dataset provided by
DIFFER
Authors
Karel Lucas van de Plassche; Jonathan Citrin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
QLKNN11D training set

This dataset contains a large-scale run of ~1 billion flux calculations of the quasilinear gyrokinetic transport model QuaLiKiz. QuaLiKiz is applied in numerous tokamak integrated modelling suites, and is openly available at https://gitlab.com/qualikiz-group/QuaLiKiz/. This dataset was generated with the 'QLKNN11D-hyper' tag of QuaLiKiz, equivalent to 2.8.1 apart from the negative magnetic shear filter being disabled. See https://gitlab.com/qualikiz-group/QuaLiKiz/-/tags/QLKNN11D-hyper for the in-repository tag.

The dataset is appropriate for the training of learned surrogates of QuaLiKiz, e.g. with neural networks. See https://doi.org/10.1063/1.5134126 for a Physics of Plasmas publication illustrating the development of a learned surrogate (QLKNN10D-hyper) of an older version of QuaLiKiz (2.4.0) with a 300 million point 10D dataset. The paper is also available on arXiv https://arxiv.org/abs/1911.05617 and the older dataset on Zenodo https://doi.org/10.5281/zenodo.3497066. For an application example, see Van Mulders et al 2021 https://doi.org/10.1088/1741-4326/ac0d12, where QLKNN10D-hyper was applied for ITER hybrid scenario optimization. For any learned surrogates developed for QLKNN11D, the effective addition of the alphaMHD input dimension through rescaling the input magnetic shear (s) by s = s - alpha_MHD/2, as carried out in Van Mulders et al., is recommended.
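
The recommended rescaling of the magnetic shear can be written as a one-line helper. The sketch below is only a worked instance of the formula s = s - alpha_MHD/2 quoted above; the function and argument names are hypothetical and do not come from the QuaLiKiz or QLKNN code bases.

```python
def rescale_magnetic_shear(shear, alpha_mhd):
    """Return the effective magnetic shear s - alpha_MHD / 2."""
    return shear - alpha_mhd / 2.0


# Hypothetical input values, for illustration only.
effective_shear = rescale_magnetic_shear(1.0, 0.5)
print(effective_shear)
```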

Related repositories:

General QuaLiKiz documentation https://qualikiz.com

QuaLiKiz/QLKNN input/output variables naming scheme https://qualikiz.com/QuaLiKiz/Input-and-output-variables

Training, plotting, filtering, and auxiliary tools https://gitlab.com/Karel-van-de-Plassche/QLKNN-develop

QuaLiKiz related tools https://gitlab.com/qualikiz-group/QuaLiKiz-pythontools

FORTRAN QLKNN implementation with wrapper for Python and MATLAB https://gitlab.com/qualikiz-group/QLKNN-fortran

Weights and biases of 'hyperrectangle style' QLKNN https://gitlab.com/qualikiz-group/qlknn-hype

Data exploration

The data is provided in 43 netCDF files. We advise opening single datasets using xarray or multiple datasets out-of-core using dask. For reference, we give below the load times and sizes of a single variable that depends only on the scan size dimx. This was tested single-core on an Intel Xeon 8160 CPU at 2.1 GHz and 192 GB of DDR4 RAM. Note that during loading, more memory is needed than the final number.

Timing of dataset loading

Number of datasets    Final in-RAM memory (GiB)    Loading time, single var (M:SS)
1                     10.3                         0:09
5                     43.9                         1:00
10                    63.2                         2:01
16                    98.0                         3:25
17                    Out of memory                x:xx
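The advice on xarray and dask can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' loading code: the helper names, the `*.nc` file pattern, the chunk size, and the assumption that the files can be concatenated along `dimx` are all hypothetical. Only `xarray.open_dataset` and `xarray.open_mfdataset` are real library calls, and the latter needs dask installed.

```python
from pathlib import Path


def list_netcdf_files(directory, pattern="*.nc"):
    """Return the netCDF files in `directory`, sorted by name."""
    return sorted(Path(directory).glob(pattern))


def open_single(path):
    """Open one file lazily with xarray; values are read on access."""
    import xarray as xr  # imported here so list_netcdf_files needs no xarray

    return xr.open_dataset(path)


def open_many(paths, concat_dim="dimx", chunk_size=1_000_000):
    """Open several files as one dask-backed dataset (out-of-core).

    Assumption: the files can be concatenated along `concat_dim`.
    Requires both xarray and dask to be installed.
    """
    import xarray as xr

    return xr.open_mfdataset(
        [str(p) for p in paths],
        combine="nested",
        concat_dim=concat_dim,
        chunks={concat_dim: chunk_size},
    )
```

With the dask-backed variant, variables stay on disk until a computation is requested, so reducing or subsetting before calling `.compute()` or `.load()` keeps memory use well below the figures in the table.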

Full dataset

The full dataset of QuaLiKiz input and output data is available on request. Note that this is 2.2 TiB of netCDF files!

For inquiries or issues concerning the Observatory for Gridded Hydrometeorology (OGH), or to contribute to its development, please refer to https://github.com/freshwater-initiative/Observatory

Tutorial for NetCDF climate data retrieval and model integration

Data from: Multi-task Deep Learning for Water Temperature and Streamflow...

ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture...

Characteristic parameters extracted from the Jarkus dataset using the Jarkus...

(HS 2) Automate Workflows using Jupyter notebook to create Large Extent...

Observational Large Ensemble

Model output and data used for analysis

Forcing files for the ECMWF Integrated Forecasting System (IFS) Single...

Sub-Rayleigh to supershear fracture transition in long Propagation Saw Tests...

Northern elephant seal tracking and diving – raw and curated data

ERA-NUTS: time-series based on C3S ERA5 for European regions

Data from: Dataset for "The impact of lake shape and size on lake breezes...

Data from: Calculated Leached Nitrogen from Septic Systems in Wisconsin,...

UIUC Mobile Sounding Data

Forcing data, evaluation data, model output and analysis scripts used in...

CMAQ Grid Mask Files for 12km CONUS - US States and NOAA Climate Regions

Processed ADCP Current Depth Profiles, Flow Classification, and Power Law...

Data from: Tidal Energy Resource Characterization, Bottom Lander...

xesmf netcdf files for testing

QLKNN11D training set