Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hydrological and meteorological information can help characterize the conditions and risk factors affecting an environment and its inhabitants. Because observational sampling is limited, gridded data sets provide modeled information for areas where direct data collection is infeasible, built from the observations that do exist and from known process relations. Although these data are available, users face barriers to access, acquisition, and analysis when working on small watershed areas, since the datasets were produced for large, continental-scale processes. In this tutorial, we introduce the Observatory for Gridded Hydrometeorology (OGH) to resolve such hurdles in a use case that processes NetCDF gridded data sets, interprets the findings, and applies them to a secondary modeling framework (landlab).
LEARNING OBJECTIVES
- Become familiar with data management, metadata management, and analyses of gridded data
- Inspect and problem-solve with Python libraries
- Explore data architecture and processes
- Learn about the OGH Python library
- Discuss conceptual data engineering and science operations
Use-case operations:
1. Prepare computing environment
2. Get list of grid cells
3. NetCDF retrieval and clipping to a spatial extent
4. Extract NetCDF metadata and convert NetCDFs to 1D ASCII time-series files
5. Visualize the average monthly total precipitation
6. Apply summary values as modeling inputs
7. Visualize modeling outputs
8. Save results in a new HydroShare resource
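Operations 3 and 4 can also be reproduced outside OGH with plain xarray. The sketch below is a minimal, generic illustration (not the OGH API); the file name, bounding box, coordinate names, and variable name are placeholders.

```python
# Minimal sketch of operations 3-4 using plain xarray (not the OGH API).
# File name, bounding box, coordinate names, and variable name are placeholders.
import xarray as xr

ds = xr.open_dataset("gridded_forcing.nc")                        # hypothetical gridded NetCDF
clip = ds.sel(lat=slice(47.0, 48.0), lon=slice(-122.5, -121.0))   # clip to a watershed extent
series = clip["precip"].mean(dim=["lat", "lon"]).to_series()      # collapse to a 1D time series
series.to_csv("precip_timeseries.txt", sep=" ", header=False)     # write an ASCII time-series file
```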
For inquiries, issues, or to contribute to development, please refer to https://github.com/freshwater-initiative/Observatory
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.
A description of this dataset, including the methodology and validation results, is available at:
Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.
ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods, when a limited number of satellites were in operation, but gaps can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they can obscure relevant events within a study area or be incompatible with (machine learning) software that often relies on gap-free inputs.
Since the need for a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such gap-filling methods is that they rely only on the original observational record, without ancillary variables or model-based information. Because of this intrinsic challenge, no global, long-term, univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments that filled satellite-like gaps introduced into GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and they account for gap size and local vegetation conditions as parameters affecting gap-filling performance.
You can use command-line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete data set to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash
# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"
# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), with each subdirectory containing one netCDF image file per day (DD) and month (MM), on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). The file names follow this convention:
ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:
Additional information for each variable is given in the netCDF attributes.
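As a minimal illustration, a single daily file can be inspected with xarray (a sketch assuming the file-name convention above; the date and the soil-moisture variable name "sm" are placeholders, so check the netCDF attributes for the actual variable names):

```python
# Open one daily ESA CCI SM GAPFILLED file and inspect it with xarray.
# The date and the variable name "sm" are placeholders; the netCDF
# attributes list the variables actually provided.
import xarray as xr

ds = xr.open_dataset("ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-20200101000000-fv09.1r1.nc")
print(ds)                  # coordinates, data variables, and attributes
sm = ds["sm"]              # surface soil moisture field on the lon/lat grid
print(float(sm.mean()))    # daily global mean (NaNs skipped by default)
```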
Changes in v9.1r1 (previous version was v09.1):
These data can be read by any software that supports the Climate and Forecast (CF) metadata conventions for netCDF files, such as:
The following records are all part of the Soil Moisture Climate Data Records from satellites community:
- ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77
GNU General Public License 3.0 (GPL-3.0): https://www.gnu.org/licenses/gpl-3.0.html
This dataset presents the output of the application of the Jarkus Analysis Toolbox (JAT) to the Jarkus dataset. The Jarkus dataset is one of the most elaborate coastal datasets in the world and consists of coastal profiles of the entire Dutch coast, spaced about 250-500 m apart, which have been measured yearly since 1965. Different available definitions for extracting characteristic parameters from coastal profiles were collected and implemented in the JAT. The characteristic parameters allow stakeholders (e.g. scientists, engineers, and coastal managers) to study the spatial and temporal variations in parameters like dune height, dune volume, dune foot, beach width, and closure depth. This dataset includes a netCDF file (on the OPeNDAP server, see data link) that contains all characteristic parameters through space and time, and a distribution plot that gives an overview of each characteristic parameter. The Jarkus Analysis Toolbox and all scripts that were used to extract the characteristic parameters and create the distribution plots are available through GitHub (https://github.com/christavanijzendoorn/JAT). Example 5 included in the JAT provides a Python script that shows how to load and work with the netCDF file. Documentation: https://jarkus-analysis-toolbox.readthedocs.io/.
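A minimal sketch of working with the characteristic-parameters netCDF in Python follows; the file path and the variable and dimension names ("dune_height", "time") are hypothetical placeholders, and JAT Example 5 and the documentation give the actual names.

```python
# Sketch: load the JAT characteristic-parameters file and extract one parameter.
# Path, variable name, and dimension name are hypothetical placeholders;
# see JAT Example 5 and the documentation for the actual ones.
import xarray as xr

ds = xr.open_dataset("jarkus_characteristic_parameters.nc")   # local copy or OPeNDAP URL
dune_height = ds["dune_height"]                               # e.g. time x alongshore transect
print(dune_height.mean(dim="time"))                           # temporal mean per transect
```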
We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we utilized the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio extension for xarray; rasterio is a Python library for reading and writing GeoTIFF and other raster formats. Xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save the GeoTIFF as NetCDF. These procedures resulted in the creation of three HydroShare resources (HS 3, HS 4, and HS 5) for sharing state-scale LES datasets. Notably, due to licensing constraints with ArcGIS Pro, a commercial GIS software, the Jupyter notebook development was undertaken on a Windows OS.
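A minimal sketch of the GeoTIFF-to-NetCDF conversion step with rioxarray and xarray is shown below; the file names, variable name, and metadata are placeholders rather than the project's actual workflow files.

```python
# Convert a state-scale GeoTIFF to NetCDF with rioxarray/xarray.
# File names, variable name, and attribute values are placeholders.
import rioxarray

da = rioxarray.open_rasterio("LES_state.tif")              # read GeoTIFF as an xarray DataArray
da = da.squeeze("band", drop=True)                         # drop the single-band dimension
da.name = "les"                                            # name the data variable
da.attrs["long_name"] = "large extent spatial dataset"     # add metadata
da.to_netcdf("LES_state.nc")                               # write the NetCDF file
```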
This item contains data and code used in experiments that produced the results for Sadler et al. (2022) (see below for full reference). We ran five experiments for the analysis: Experiment A, Experiment B, Experiment C, Experiment D, and Experiment AuxIn. Experiment A tested multi-task learning for predicting streamflow with 25 years of training data and a different model for each of 101 sites. Experiment B tested multi-task learning for predicting streamflow with 25 years of training data and a single model for all 101 sites. Experiment C tested multi-task learning for predicting streamflow with just 2 years of training data. Experiment D tested multi-task learning for predicting water temperature with over 25 years of training data. Experiment AuxIn used water temperature as an input variable for predicting streamflow. These experiments and their results are described in detail in the WRR paper. Data from a total of 101 sites across the US were used for the experiments. The model input data and streamflow data were from the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset (Newman et al. 2014; Addor et al. 2017). The water temperature data were gathered from the National Water Information System (NWIS) (U.S. Geological Survey, 2016). The contents of this item are broken into 13 files or groups of files aggregated into zip files:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Testing files for the xesmf remapping package.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains processed data in NetCDF (.nc) files that were used in our study. We used the Standardized Precipitation Index (SPI) to determine meteorological drought conditions in the study area, calculated using the open-source module Climate and Drought Indices in Python.
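For illustration, the basic SPI calculation can be sketched directly with scipy: accumulate precipitation over a chosen timescale, fit a gamma distribution, and transform the cumulative probabilities to a standard normal variate. This is a simplified stand-in (no per-calendar-month fitting or zero-precipitation correction), not the exact module used in the study.

```python
# Simplified SPI sketch: accumulate precipitation, fit a gamma distribution,
# and transform the cumulative probabilities to standard-normal z-scores.
# Omits per-calendar-month fitting and zero-precipitation handling.
import numpy as np
from scipy import stats

def spi(precip_monthly, scale=3):
    acc = np.convolve(precip_monthly, np.ones(scale), mode="valid")   # rolling accumulation
    shape, loc, scale_param = stats.gamma.fit(acc, floc=0)            # gamma fit, location fixed at 0
    cdf = stats.gamma.cdf(acc, shape, loc=loc, scale=scale_param)     # cumulative probabilities
    return stats.norm.ppf(cdf)                                        # SPI values

rng = np.random.default_rng(0)
print(spi(rng.gamma(2.0, 30.0, size=240))[:5])                        # demo with synthetic monthly totals
```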
The modeled data in these archives are in the NetCDF format (https://www.unidata.ucar.edu/software/netcdf/). NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It is also a community standard for sharing scientific data. The Unidata Program Center supports and maintains netCDF programming interfaces for C, C++, Java, and Fortran. Programming interfaces are also available for Python, IDL, MATLAB, R, Ruby, and Perl. Data in netCDF format is:
- Self-describing. A netCDF file includes information about the data it contains.
- Portable. A netCDF file can be accessed by computers with different ways of storing integers, characters, and floating-point numbers.
- Scalable. Small subsets of large datasets in various formats may be accessed efficiently through netCDF interfaces, even from remote servers.
- Appendable. Data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure.
- Sharable. One writer and multiple readers may simultaneously access the same netCDF file.
- Archivable. Access to all earlier forms of netCDF data will be supported by current and future versions of the software.
Pub_figures.tar.zip contains the NCL scripts for figures 1-5 and the Chesapeake Bay Airshed shapefile. The directory structure of the archive is ./Pub_figures/Fig#_data, where # is the figure number from 1-5.
EMISS.data.tar.zip contains two NetCDF files with the emission totals for the 2011ec and 2040ei emission inventories. The file names contain the year of the inventory, and the file header contains a description of each variable and the variable units.
EPIC.data.tar.zip contains the monthly mean EPIC data in NetCDF format for ammonium fertilizer application (files with ANH3 in the name) and soil ammonium concentration (files with NH3 in the name) for the historical (Hist directory) and future (RCP-4.5 directory) simulations.
WRF.data.tar.zip contains mean monthly and seasonal data from the 36 km downscaled WRF simulations in NetCDF format for the historical (Hist directory) and future (RCP-4.5 directory) simulations.
CMAQ.data.tar.zip contains the mean monthly and seasonal data in NetCDF format from the 36 km CMAQ simulations for the historical (Hist directory), future (RCP-4.5 directory), and future with historical emissions (RCP-4.5-hist-emiss directory) simulations.
This dataset is associated with the following publication: Campbell, P., J. Bash, C. Nolte, T. Spero, E. Cooter, K. Hinson, and L. Linker. Projections of Atmospheric Nitrogen Deposition to the Chesapeake Bay Watershed. Journal of Geophysical Research - Biogeosciences. American Geophysical Union, Washington, DC, USA, 12(11): 3307-3326, (2019).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data Summary: US states grid mask file and NOAA climate regions grid mask file, both compatible with the 12US1 modeling grid domain.
Note: The datasets are on a Google Drive. The metadata associated with this DOI contain the link to the Google Drive folder and instructions for downloading the data. These files can be used with CMAQ-ISAMv5.3 to track state- or region-specific emissions. See Chapter 11 and Appendix B.4 in the CMAQ User's Guide for further information on how to use the ISAM control file with GRIDMASK files. The files can also be used for state- or region-specific scaling of emissions using the CMAQv5.3 DESID module. See the DESID Tutorial and Appendix B.4 in the CMAQ User's Guide for further information on how to use the Emission Control File to scale emissions in predetermined geographical areas.
File Location and Download Instructions: Link to GRIDMASK files. Link to README text file with information on how these files were created.
File Format: The grid masks are stored as netCDF-formatted files using I/O API data structures (https://www.cmascenter.org/ioapi/). Information on the model projection and grid structure is contained in the header of each netCDF file. The files can be opened and manipulated using I/O API utilities (e.g. M3XTRACT, M3WNDW) or other software programs that can read and write netCDF-formatted files (e.g. Fortran, R, Python).
File Descriptions: These GRIDMASK files can be used with the 12US1 modeling grid domain (grid origin x = -2556000 m, y = -1728000 m; N columns = 459, N rows = 299).
GRIDMASK_STATES_12US1.nc - This file contains 49 variables for the 48 states in the conterminous U.S. plus DC. Each state variable (e.g., AL, AZ, AR, etc.) is a 2D array (299 x 459) providing the fractional area of each grid cell that falls within that state.
GRIDMASK_CLIMATE_REGIONS_12US1.nc - This file contains 9 variables for the 9 NOAA climate regions based on the Karl and Koss (1984) definition of climate regions. Each climate region variable (e.g., CLIMATE_REGION_1, CLIMATE_REGION_2, etc.) is a 2D array (299 x 459) providing the fractional area of each grid cell that falls within that climate region.
NOAA climate regions:
- CLIMATE_REGION_1: Northwest (OR, WA, ID)
- CLIMATE_REGION_2: West (CA, NV)
- CLIMATE_REGION_3: West North Central (MT, WY, ND, SD, NE)
- CLIMATE_REGION_4: Southwest (UT, AZ, NM, CO)
- CLIMATE_REGION_5: South (KS, OK, TX, LA, AR, MS)
- CLIMATE_REGION_6: Central (MO, IL, IN, KY, TN, OH, WV)
- CLIMATE_REGION_7: East North Central (MN, IA, WI, MI)
- CLIMATE_REGION_8: Northeast (MD, DE, NJ, PA, NY, CT, RI, MA, VT, NH, ME) + Washington, D.C.*
- CLIMATE_REGION_9: Southeast (VA, NC, SC, GA, AL, FL)
*Note that Washington, D.C. is not included in any of the climate regions on the website but was included with the "Northeast" region for the generation of this GRIDMASK file.
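A minimal sketch of reading one state's fractional-area mask in Python is shown below; it assumes the two-letter state variable names described above and uses np.squeeze to drop any singleton I/O API time/layer dimensions.

```python
# Read the fractional-area mask for one state from the GRIDMASK file.
# Variable names follow the description above (two-letter state codes).
import numpy as np
from netCDF4 import Dataset

nc = Dataset("GRIDMASK_STATES_12US1.nc")
wi = np.squeeze(nc.variables["WI"][:])   # drop any singleton I/O API dimensions
print(wi.shape)                          # expected (299, 459): rows x columns
print(wi.sum())                          # total fractional area covered by Wisconsin
nc.close()
```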
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Title:
Flume Experiment Dataset – Granular Flow Tests (2023)
Authors:
I. Koa, A. Recking, F. Gimbert, H. Bellot, G. Chambon, T. Faug
Contact:
islamkoaa111@gmail.com
Description:
This dataset contains NetCDF (.nc) files from controlled flume experiments conducted in 2023 to study the transition from bedload to complex granular flow dynamics on steep slopes. Each file name encodes the experiment date and test number (e.g., CanalMU-20-04-2023-test5.nc = Test 5 on April 20, 2023).
Each test corresponds to a specific discharge (Q) value, detailed in the table below.
Example filename:
CanalMU-20-04-2023-test5.nc → Test 5 conducted on April 20, 2023.
Discharge Table:
Discharge (l/s) | Date | Test Number
----------------|-------------|-------------
0.14 | 06-04-2023 | Test 3
0.14 | 04-05-2023 | Test 5
0.15 | 13-04-2023 | Test 3
0.15 | 14-04-2023 | Test 1
0.15 | 14-04-2023 | Test 2
0.16 | 17-04-2023 | Test 2
0.16 | 18-04-2023 | Test 3
0.16 | 04-05-2023 | Test 3
0.16 | 04-05-2023 | Test 4
0.17 | 18-04-2023 | Test 4
0.17 | 18-04-2023 | Test 5
0.17 | 20-04-2023 | Test 2
0.17 | 20-04-2023 | Test 4
0.17 | 20-04-2023 | Test 5
0.18 | 20-04-2023 | Test 8
0.18 | 20-04-2023 | Test 9
0.19 | 20-04-2023 | Test 10
0.19 | 20-04-2023 | Test 11
0.20 | 20-04-2023 | Test 12
0.20 | 04-05-2023 | Test 1
0.20 | 04-05-2023 | Test 2
0.21 | 20-04-2023 | Test 13
0.21 | 21-04-2023 | Test 1
0.21 | 21-04-2023 | Test 2
0.22 | 21-04-2023 | Test 3
0.22 | 21-04-2023 | Test 4
0.23 | 21-04-2023 | Test 5
0.23 | 27-04-2023 | Test 2
0.23 | 27-04-2023 | Test 3
0.23 | 28-04-2023 | Test 7
0.24 | 28-04-2023 | Test 1
0.24 | 28-04-2023 | Test 2
0.24 | 28-04-2023 | Test 3
0.25 | 28-04-2023 | Test 4
0.25 | 21-06-2023 | Test 1
0.26 | 28-04-2023 | Test 6
0.26 | 21-06-2023 | Test 3
0.26 | 21-06-2023 | Test 4
0.27 | 22-06-2023 | Test 2
0.27 | 22-06-2023 | Test 3
0.27 | 22-06-2023 | Test 1
Data Acquisition and Processing:
The original data were acquired using LabVIEW and saved in TDMS (.tdms) format. These files were processed using custom Python scripts to extract synchronized time-series data, assign physical units, and store the results in structured NetCDF-4 files.
NetCDF File Structure:
Each file includes the following structured groups and variables:
1. Group: Data_Hydro (Hydraulic Measurements)
- Time_Hydro: Time [s]
- Date_et_heure_mesure: Measurement timestamps [string]
- Etat_de_l'interrupteur: Switch state [V]
- Debit_liquide_instant: Instantaneous water discharge [L/s]
- Debit_liquide_consigne: Target water discharge [L/s]
- Vitesse_tapis_instant: Instantaneous conveyor speed [m/s]
- Vitesse_tapis_consigne: Set conveyor speed [V]
- Debit_solide_instant: Instantaneous solid discharge [g/s]
- Hauteur1–4: Water heights from four sensors [cm]
2. Group: Data_Force (Impact Force Measurements)
- Time_Force: Time [s]
- Force_Normale: Vertical impact force [N]
- Force_Tangentielle: Tangential force [N]
3. Group: Data_Annexe (Experimental Metadata)
- channel_width, Channel_slope: Flume geometry
- Position_capteur_hauteur1–4: Water sensor locations [m]
- Position_capteur_force: Force sensor position [m]
- Plaque dimensions and mass: Plate size and weight [m, kg]
- Sensor frequencies and sensitivities [Hz, pC/N]
Format:
NetCDF-4 (.nc)
Suggested software for reading:
- Python (xarray, netCDF4)
- NASA Panoply
- MATLAB
Note:
The data were processed using custom Python scripts. These are available from the corresponding author upon request.
Example: Accessing NetCDF Data in Python
The dataset can be read using the `netCDF4` or `xarray` libraries in Python. Below is a simple example using netCDF4:
```python
from netCDF4 import Dataset
import numpy as np
# Open netCDF file
data = Dataset('CanalMU-20-04-2023-test5.nc')
# Load hydraulic data
thydro = data.groups['Data_Hydro'].variables['Time_Hydro'][:]
Qcons = data.groups['Data_Hydro'].variables['Debit_liquide_consigne'][:]
Qins = data.groups['Data_Hydro'].variables['Debit_liquide_instant'][:]
Tapis = data.groups['Data_Hydro'].variables['Vitesse_tapis_consigne'][:]
h1 = data.groups['Data_Hydro'].variables['Hauteur1'][:]
h2 = data.groups['Data_Hydro'].variables['Hauteur2'][:]
h3 = data.groups['Data_Hydro'].variables['Hauteur3'][:]
h4 = data.groups['Data_Hydro'].variables['Hauteur4'][:]
# Load force data
tforce = data.groups['Data_Force'].variables['Time_Force'][:]
FN = data.groups['Data_Force'].variables['Force_Normale'][:]
FT = data.groups['Data_Force'].variables['Force_Tangentielle'][:]
# Forces are stored in newtons (see the Data_Force group metadata);
# no additional calibration factors are applied in this example.
# Fetch metadata
slope = data.groups['Data_Annexe'].variables['Channel_slope']
alpha = np.arctan(slope[:] / 100)  # slope given in percent -> angle in radians
L = data.groups['Data_Annexe'].variables['Longueur_plaque_impact'][:]  # impact plate length [m]
W = data.groups['Data_Annexe'].variables['Largeur_plaque_impact'][:]   # impact plate width [m]
```
For more advanced processing, consider using `xarray` which provides easier multi-dimensional data access.
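For example, each group can be opened directly as an xarray Dataset via the group keyword of xarray.open_dataset (a short sketch using the file from the example above):

```python
# Open the hydraulic and force groups of one test file as xarray Datasets.
import xarray as xr

fname = "CanalMU-20-04-2023-test5.nc"
hydro = xr.open_dataset(fname, group="Data_Hydro")   # hydraulic measurements
force = xr.open_dataset(fname, group="Data_Force")   # impact force measurements
print(hydro)                                         # variables, dimensions, attributes
```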
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This data set consists of initial conditions, boundary conditions, and forcing profiles for the Single Column Model (SCM) version of the European Centre for Medium-Range Weather Forecasts (ECMWF) model, the Integrated Forecasting System (IFS). The IFS SCM is freely available through the OpenIFS project, on application to ECMWF for a licence. The data were produced and tested for IFS CY40R1, but will be suitable for earlier model cycles, and also for future versions provided no new boundary fields are required by a later model. The data are archived as single-time-stamp maps in netCDF files. If the data are extracted at any lat-lon location and the desired timestamps are concatenated (e.g. using netCDF operators), the resultant file is in the correct format for input into the IFS SCM.
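That extraction step can be sketched in Python with xarray as follows; the file names, the concatenation dimension, and the coordinate names (latitude/longitude) are placeholders, and the netCDF operators mentioned above remain the documented route.

```python
# Sketch: extract one lat-lon column from the time-stamped maps and concatenate
# the time steps into a single file for the SCM. File names, the "time"
# dimension, and coordinate names are placeholders; netCDF operators (e.g. NCO)
# are the documented alternative.
import glob
import xarray as xr

files = sorted(glob.glob("scm_forcing_*.nc"))                     # one file per time stamp
ds = xr.concat([xr.open_dataset(f) for f in files], dim="time")   # concatenate along time
column = ds.sel(latitude=-5.0, longitude=80.0, method="nearest")  # pick one grid point
column.to_netcdf("scm_input_single_column.nc")                    # single-column time series
```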
The data covers the Tropical Indian Ocean/Warm Pool domain spanning 20S-20N, 42-181E. The data are available every 15 minutes from 6 April 2009 0100 UTC for a period of ten days. The total number of grid points over which an SCM can be run is 480 in the longitudinal direction, and 142 latitudinally. With over 68,000 independent grid points available for evaluation of SCM simulations, robust statistics of bias can be estimated over a wide range of boundary and climatic conditions.
The initial conditions and forcing profiles were derived by coarse-graining high-resolution (4 km) simulations produced as part of the NERC Cascade project, dataset ID xfhfc (also available on CEDA). The Cascade dataset is archived once an hour; it was linearly interpolated in time to produce the 15-minute resolution required by the SCM. The resolution of the coarse-grained data corresponds to the IFS T639 reduced Gaussian grid (approx. 32 km). The boundary conditions are as used in the operational IFS at resolution T639. The coarse-graining procedure by which the data were produced is detailed in Christensen, H. M., Dawson, A. and Holloway, C. E., 'Forcing Single Column Models using High-resolution Model Simulations', in review, Journal of Advances in Modeling Earth Systems (JAMES).
For full details of the parent Cascade simulation, see Holloway et al (2012). In brief, the simulations were produced using the limited-area setup of the MetUM version 7.1 (Davies et al, 2005). The model is semi-Lagrangian and non-hydrostatic. Initial conditions were specified from the ECMWF operational analysis. A 12 km parametrised convection run was first produced over a domain 1 degree larger in each direction, with lateral boundary conditions relaxed to the ECMWF operational analysis. The 4 km run was forced using lateral boundary conditions computed from the 12 km parametrised run, via a nudged rim of 8 model grid points. The model has 70 terrain-following hybrid levels in the vertical, with vertical resolution ranging from tens of metres in the boundary layer, to 250 m in the free troposphere, and with model top at 40 km. The time step was 30 s.
The Cascade dataset did not include archived soil variables, though surface sensible and latent heat fluxes were archived. When using the dataset, it is therefore recommended that the IFS land surface scheme be deactivated and the SCM forced using the surface fluxes instead. The first day of Cascade data exhibited evidence of spin-up. It is therefore recommended that the first day be discarded, and the data used from April 7 - April 16.
The software used to produce this dataset is freely available to interested users:
1. "cg-cascade": NCL software to produce OpenIFS forcing fields from a high-resolution MetUM simulation and the necessary ECMWF boundary files. https://github.com/aopp-pred/cg-cascade
Furthermore, software to facilitate the use of this dataset is also available:
2. "scmtiles": Python software to deploy many independent SCMs over a domain. https://github.com/aopp-pred/scmtiles
3. "openifs-scmtiles": Python software to deploy the OpenIFS SCM using scmtiles. https://github.com/aopp-pred/openifs-scmtiles
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Part 2 of supplementary material for the Master's Thesis: "Evaluation of AROME Model Valley Wind Simulations in the Inn Valley, Austria" (Wibmer 2024, available here).
Due to memory constraints, the supplementary material consists of two parts:
To reproduce part of the figures, users must download the Python scripts and the preprocessed AROME model datasets (NetCDF files).
Important: the Python scripts must be placed in the same parent folder because some of them depend on each other.
Original AROME model output files (GRIB2 format) are not published due to their large size.
The naming convention for the AROME simulations uses OP* (where * represents the grid spacing in meters) to differentiate the model runs based on their horizontal grid spacing:
Due to memory constraints, the datasets needed for the analyses are split up into two parts. The second part of the supplementary material contains:
The datasets of the 0.5-km simulation (datasets_OP500.tar.xz) can be found in Part 1 of the supplementary material (available here).
The NetCDF datasets of the performed AROME-Aut simulations are packaged and compressed into .tar.xz files.
The Python scripts, available here, require these NetCDF datasets for the plotting and analysis routines.
The provided NetCDF datasets are preprocessed from the GRIB2 output of the AROME-Aut simulations.
For the scripts to function properly, you need to adjust the path to the datasets within path_handling.py.
Each datasets_OP*.tar.xz file contains NetCDF files for different types of levels: surface, hybridPressure (model levels), isobaricInhPa (pressure levels), meanSea (mean sea level), and heightAboveGround (constant height levels). The following naming convention for the datasets is used:
The original GRIB2 files are not provided due to their large size. For further information about the GRIB2 files or the NetCDF datasets, please feel free to contact me.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset presents fisheries acoustic data in both proprietary Simrad RAW format and international HAC format, recorded onboard R/V Thalassa on 28/04/2013 between 14:56 and 15:16 GMT near the continental shelf edge in the southern Bay of Biscay. The data include typical small pelagic fish schools composed of anchovy and sardine encountered in springtime in this area. The dataset has also been converted to the international SONAR-netCDF4 format described at https://github.com/ices-publications/sonar-netcdf4. HAC files can be displayed and processed using e.g. the MOVIES3D freeware provided by Ifremer at http://flotte.ifremer.fr/fleet/presentation-of-the-fleet/logiciels-embarques/movies. SONAR-netCDF4 files can be displayed using standard netCDF viewers and the Python notebooks available at https://gitlab.ifremer.fr/fleet/formats/pysonar-netcdf
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.
Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection (see the example after the file list below).
Each dataset uses the following coordinate conventions:
The following data files are provided:
- T (summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO

Additionally, two CSV files are provided for convenience:
- imm: Total immigration flows
- emi: Total emigration flows
- net: Net migration
- imm_pop: Total immigrant population (non-native-born)
- emi_pop: Total emigrant population (living abroad)
- mig_prev: Total origin-destination flows
- mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth

Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate).
An ISO3 conversion table is also provided.
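For example, one of the NetCDF files can be opened and subset by coordinates roughly as follows; the file name and the coordinate labels are taken from the description above and should be treated as placeholders.

```python
# Open one of the NetCDF flow files and select a single origin-destination pair.
# File name and coordinate labels are placeholders based on the description above.
import xarray as xr

ds = xr.open_dataset("mig_prev.nc")
flows = ds["mean"].sel({"Origin ISO": "MEX", "Destination ISO": "USA"})  # one corridor, all years
print(flows.to_series())
```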
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The netCDF file included here corresponds to datasets used in the Biogeosciences paper entitled "Evaluating the Arabian Sea as a regional source of atmospheric CO2: seasonal variability and drivers" by Alain de Verneil, Zouhair Lachkar, Shafer Smith, and Marina Levy
The data included here comprise model output used in the paper to generate the figures in the main manuscript. Many of the figures also contain data from publicly available sources, which are detailed in the "Data availability" section at the end of the paper.
The data are in standard netCDF file format, readily readable using netCDF tools (e.g. the netCDF4 package in Python, the ncread function in MATLAB, etc.).
Variable names, dimensions, and units are described in the metadata within the netCDF file.
Questions regarding this dataset and how it can be used to reproduce the results in the article can be forwarded to Alain de Verneil through email at ajd11@nyu.edu
This dataset contains data collected from 34 successful University of Illinois Urbana-Champaign (UIUC) Mobile Radiosonde launches during the SNOWIE field campaign. Each successful launch, named by year-month-day-time-location.nc, has its own netCDF file. The data in each file include: temperature (Celsius), relative humidity, time of sample (in seconds past launch time), height AGL (m), wind speed (m/s), wind direction (degrees), and pressure (mb) as measured by the radiosonde. The coordinates and altitude above MSL (m) corresponding to where each sounding was launched are written in the attributes of each file. All surface wind speed and direction data are taken from the previous hourly observation from KBOI for the Boise sites and KEUL for the Caldwell site. One exception was the launch on 16 February 2017 at Caldwell, for which KBOI observations were used instead. Included with each dataset order is a Python script (netcdfreadout.py) to easily view the netCDF data files.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the python scripts and NetCDF data for the analyses and to recreate the figures of the manuscript: "Historical shifts in seasonality and timing of extreme precipitation" by Gaby Gründemann, Enrico Zorzetto, Nick van de Giesen, and Ruud van der Ent
This data release contains a netCDF file with decadal estimates of nitrate leached from septic systems (kilograms per hectare per year, or kg/ha/yr) in the state of Wisconsin from 1850 to 2010, as well as the Python code and supporting files used to create the netCDF file. The netCDF file is used as an input to a Nitrate Decision Support Tool for the State of Wisconsin (GW-NDST; Juckem and others, 2024). The dataset was constructed starting with 1990 census records, which included responses about households using septic systems for waste disposal. The fraction of the population using septic systems in 1990 was aggregated at the county scale and applied backward in time for each decade from 1850 to 1980. For decades from 1990 to 2010, the fraction of the population using septic systems was computed on the finer-resolution census block-group scale. Each decadal estimate of the fraction of the population using septic systems was then multiplied by 4.13 kilograms per person per year of leached nitrate to estimate the per-area load of nitrate below the root zone. The data release includes a Python notebook used to process the input datasets included in the data release, shapefiles created (or modified) using the Python notebook, and the final netCDF file.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Results from the Python Coastal Impacts and Adaptation Model (pyCIAM), the inputs and source code necessary to replicate these outputs, and the results presented in Depsky et al. 2023.
All zipped Zarr stores can be downloaded and accessed locally or can be directly accessed via code similar to the following:
from fsspec.implementations.zip import ZipFileSystem
import xarray as xr

ds = xr.open_zarr(ZipFileSystem(url_of_file_in_record).get_mapper())
File Inventory
Products
pyCIAM_outputs.zarr.zip: Outputs of the pyCIAM model, using the SLIIDERS dataset to define socioeconomic and extreme sea level characteristics of coastal regions and the 17th, 50th, and 83rd quantiles of local sea level rise as projected by various modeling frameworks (LocalizeSL and FACTS) and for multiple emissions scenarios and ice sheet models.
pyCIAM_outputs_{case}.nc: A NetCDF version of pyCIAM_outputs, in which the netcdf files are divided up by adaptation "case" to reduce file size.
diaz2016_outputs.zarr.zip: A replication of the results from Diaz 2016 - the model upon which pyCIAM was built, using an identical configuration to that of the original model.
suboptimal_capital_by_movefactor.zarr.zip: An analysis of the observed present-day allocation of capital compared to a "rational" allocation, as a function of the magnitude of non-market costs of relocation assumed in the model. See Depsky et al. 2023 for further details.
Inputs
ar5-msl-rel-2005-quantiles.zarr.zip: Quantiles of projected local sea level rise as projected from the LocalizeSL model, using a variety of temperature scenarios and ice sheet models developed in Kopp 2014, Bamber 2019, DeConto 2021, IPCC SROCC. The results contained in pyCIAM_outputs.zarr.zip cover a broader (and newer) range of SLR projections from a more recent projection framework (FACTS); however, these data are more easily obtained from the appropriate Zenodo records and thus are not hosted in this one.
diaz2016_inputs_raw.zarr.zip: The coastal inputs used in Diaz 2016, obtained from GitHub and formatted for use in the Python-based pyCIAM. These are based on the Dynamic Integrated Vulnerability Assessment (DIVA) dataset.
surge-lookup-seg(_adm).zarr.zip: Pre-computed lookup tables estimating average annual losses from extreme sea levels due to mortality and capital stock damage. This is an intermediate output of pyCIAM and is not necessary to replicate the model results. However, it is more time consuming to produce than the rest of the model and is provided for users who may wish to start from the pre-computed dataset. Two versions are provided - the first contains estimates for each unique intersection of ~50km coastal segment and state/province-level administrative unit (admin-1). This is derived from the characteristics in SLIIDERS. The second is simply estimated on a version of SLIIDERS collapsed over administrative units to vary only over coastal segments. Both are used in the process of running pyCIAM.
ypk_2000_2100.zarr.zip: An intermediate output in creating SLIIDERS that contains country-level projections of GDP, capital stock, and population, based on the Shared Socioeconomic Pathways (SSPs). This is only used in normalizing costs estimated in pyCIAM by country and global GDP to report in Depsky et al. 2023. It is not used in the execution of pyCIAM but is provided to replicate results reported in the manuscript.
Source Code
pyCIAM.zip: Contains the python-CIAM package as well as a notebook-based workflow to replicate the results presented in Depsky et al. 2023. It also contains two master shell scripts (run_example.sh and run_full_replication.sh) to assist in executing a small sample of the pyCIAM model or in fully executing the workflow of Depsky et al. 2023, respectively. This code is consistent with release 1.2.0 in the pyCIAM GitHub repository and is available as version 1.2.0 of the python-CIAM package on PyPI.
Version history:
1.2
- Point data-acquisition.ipynb to the updated Zenodo deposit that fixes the dtype of the subsets variable in diaz2016_inputs_raw.zarr.zip to be bool rather than int8
- Variable name bugfix in data-acquisition.ipynb
- Add netCDF versions of SLIIDERS and the pyCIAM results to upload-zenodo.ipynb
- Update results in Zenodo record to use SLIIDERS v1.2
1.1.1
- Bugfix to inputs/diaz2016_inputs_raw.zarr.zip to make the subsets variable bool instead of int8.
1.1.0
- Version associated with publication of Depsky et al., 2023