We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we used the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio xarray extension; rasterio is a Python library for reading and writing GeoTIFF and other raster formats. Xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save the GeoTIFF data as NetCDF. These procedures resulted in three HydroShare resources (HS 3, HS 4, and HS 5) for sharing the state-scale LES datasets. Notably, due to licensing constraints with ArcGIS Pro, a commercial GIS software package, the Jupyter notebook development was undertaken on Windows.
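The xarray part of this workflow (attach metadata, then write NetCDF) can be sketched as follows. This is a minimal, self-contained sketch: the array, variable name, and attributes are fabricated stand-ins for the GeoTIFF data that would in practice come from rioxarray.

```python
import numpy as np
import xarray as xr

# Fabricated stand-in grid; in the actual workflow this DataArray would
# come from opening a state-scale GeoTIFF with rioxarray.
da = xr.DataArray(
    np.zeros((3, 4)),
    dims=("y", "x"),
    coords={"y": [40.2, 40.1, 40.0], "x": [-105.0, -104.9, -104.8, -104.7]},
    name="les",
)
ds = da.to_dataset()

# xarray makes it easy to attach metadata before writing NetCDF
ds.attrs["title"] = "State-scale LES dataset (illustrative)"
ds["les"].attrs["units"] = "1"

# Save as NetCDF
ds.to_netcdf("les_example.nc")
```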
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Testing files for the xesmf remapping package.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This is a personal project to introduce the xarray library to undergraduate earth-science students in Indonesia.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.
A description of this dataset, including the methodology and validation results, is available at:
Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.
ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
Since the requirement for a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods for achieving this goal. A fundamental feature of such gap-filling methods is that they rely only on the original observational record, without the need for ancillary variables or model-based information. Owing to this intrinsic challenge, no global, long-term, univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments that fill satellite-like gaps introduced into GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gap-filling performance.
You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following command will download and extract the complete data set to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash
# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped into one subdirectory per year (YYYY); each subdirectory contains one netCDF image file for each day (DD) and month (MM), on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). The file names follow this convention:
ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:
Additional information for each variable is given in the netCDF attributes.
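Daily image files like these are commonly stacked into a single time series with xarray. The snippet below is a self-contained sketch of that step: it fabricates two tiny stand-in "daily" files (the variable name, grid, and file names are illustrative, not the actual contents of the archive) and concatenates them along the time axis.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Fabricate two tiny "daily image" files standing in for the real
# ESACCI-SOILMOISTURE daily netCDF files (contents are illustrative).
for i, day in enumerate(["1991-01-01", "1991-01-02"]):
    ds = xr.Dataset(
        {"sm": (("time", "lat", "lon"), np.full((1, 2, 2), float(i)))},
        coords={"time": [pd.Timestamp(day)],
                "lat": [0.125, 0.375], "lon": [0.125, 0.375]},
    )
    ds.to_netcdf(f"sm_day{i}.nc")

# Stack the daily images into one continuous time series
stacked = xr.concat(
    [xr.open_dataset(f"sm_day{i}.nc") for i in range(2)], dim="time"
)
print(stacked["sm"].shape)  # (2, 2, 2)
```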
Changes in v09.1r1 (previous version was v09.1):
These data can be read by any software that supports the Climate and Forecast (CF) metadata conventions for netCDF files, such as:
The following records are all part of the Soil Moisture Climate Data Records from satellites community:

- ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77
This resource contains Jupyter Python notebooks intended for learning about the U.S. National Water Model (NWM). These notebooks explore NWM forecasts in various ways. NWM Notebooks 1, 2, and 3 access NWM forecasts directly from the NOAA NOMADS file-sharing system. Notebook 4 accesses NWM forecasts from Google Cloud Platform (GCP) storage in addition to NOMADS. A brief summary of what each notebook does is included below:
Notebook 1 (NWM1_Visualization) focuses on visualization. It includes functions for downloading and extracting time series forecasts for any of the 2.7 million stream reaches of the U.S. NWM. It also demonstrates ways to visualize forecasts using Python packages like matplotlib.
Notebook 2 (NWM2_Xarray) explores methods for slicing and dicing NWM NetCDF files using the Python library xarray.
Notebook 3 (NWM3_Subsetting) is focused on subsetting NWM forecasts and NetCDF files for specified reaches and exporting NWM forecast data to CSV files.
Notebook 4 (NWM4_Hydrotools) uses Hydrotools, a new suite of tools for evaluating NWM data, to retrieve NWM forecasts both from NOMADS and from Google Cloud Platform storage where older NWM forecasts are cached. This notebook also briefly covers visualizing, subsetting, and exporting forecasts retrieved with Hydrotools.
NOTE: Notebook 4 requires a newer version of NumPy than is available on the default CUAHSI JupyterHub instance. Please use the instance "HydroLearn - Intelligent Earth" and be sure to run !pip install hydrotools.nwm_client[gcp].
The notebooks are part of an NWM learning module on HydroLearn.org. When the associated learning module is complete, a link to it will be added here. It is recommended that these notebooks be opened through the CUAHSI JupyterHub App on HydroShare, via the 'Open With' button at the top of this resource page.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.
Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection.

Each dataset uses the following coordinate conventions:

The following data files are provided:

- T (summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO

Additionally, two CSV files are provided for convenience:

- imm: Total immigration flows
- emi: Total emigration flows
- net: Net migration
- imm_pop: Total immigrant population (non-native-born)
- emi_pop: Total emigrant population (living abroad)
- mig_prev: Total origin-destination flows
- mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth

Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate).

An ISO3 conversion table is also provided.
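The coordinate-based selection these files enable can be sketched with a fabricated miniature dataset that uses the same coordinate names (the values below are made up, not real migration estimates):

```python
import numpy as np
import xarray as xr

# Miniature stand-in for the migration flow files: mean/std variables
# indexed by Year, Origin ISO, and Destination ISO.
ds = xr.Dataset(
    {
        "mean": (("Year", "Origin ISO", "Destination ISO"),
                 np.arange(8.0).reshape(2, 2, 2)),
        "std": (("Year", "Origin ISO", "Destination ISO"),
                np.ones((2, 2, 2))),
    },
    coords={"Year": [1990, 1991],
            "Origin ISO": ["MEX", "USA"],
            "Destination ISO": ["MEX", "USA"]},
)

# Coordinate-based selection: the MEX -> USA flow estimate in 1991
flow = ds["mean"].sel({"Year": 1991, "Origin ISO": "MEX", "Destination ISO": "USA"})
print(float(flow))  # 5.0
```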
These data are ocean profile data measured by profiling Argo S2A floats at a specific latitude, longitude, and date nominally from the surface to 2000 meters depth. Pressure, in situ temperature (ITS-90), and practical salinity are provided at 1-m increments through the water column. Argo data from Gulf of Mexico (GOM) LC1 (9 floats) and LC2 (12 floats) were delayed mode quality controlled and submitted to Global Data Assembly Centers (GDACs) in May 2020. All available profiles are planned to be revisited and evaluated in early 2021. Float no. 4903233 started showing drift in salinity at profile no. 77, and the salinity data will be carefully examined with a new adjustment in early 2021. _NCProperties=version=2,netcdf=4.6.3,hdf5=1.10.4 cdm_altitude_proxy=PRES cdm_data_type=Profile cdm_profile_variables=profile comment=free text contributor_email=devops@rpsgroup.com contributor_name=RPS contributor_role=editor contributor_role_vocabulary=https://vocab.nerc.ac.uk/collection/G04/current/ contributor_url=https://www.rpsgroup.com/ Conventions=CF-1.7, ACDD-1.3, IOOS-1.2, Argo-3.2, COARDS date_metadata_modified=2020-12-22T15:54:25Z Easternmost_Easting=-80.7381 featureType=Profile geospatial_bounds=POINT (-80.7381 23.97186) geospatial_bounds_crs=EPSG:4326 geospatial_lat_max=23.97186 geospatial_lat_min=23.97186 geospatial_lat_units=degrees_north geospatial_lon_max=-80.7381 geospatial_lon_min=-80.7381 geospatial_lon_units=degrees_east history=2020-06-09T12:00:46Z creation id=R4903233_071 infoUrl=http://www.argodatamgt.org/Documentation institution=GCOOS instrument=Argo instrument_vocabulary=GCMD Earth Science Keywords. Version 5.3.3 keywords_vocabulary=GCMD Science Keywords naming_authority=edu.tamucc.gulfhub Northernmost_Northing=23.97186 note_CHAR_variables=RPS METADATA ENHANCEMENT NOTE Variables of data type 'CHAR' have been altered by the xarray and netCDF4-python libraries to contain an extra dimension (often denoted as 'string1'). 
This is due to an underlying issue in the libraries: https://github.com/pydata/xarray/issues/1977. Upon examination, one will find the data has not been altered but only changed shape. We realize this is sub-optimal and apologize for any inconveniences this may cause. note_FillValue=RPS METADATA ENHANCEMENT NOTE Many variables in this dataset are of type 'char' and have a '_FillValue' attribute which is interpreted through NumPy as 'b', an empty byte string. This causes serialization issues. As a result, all variables of type 'char' with '_FillValue = b' have had the _FillValue attribute removed to avoid serialization conflicts. However, no data has been changed, so the _FillValue is still "b' '". platform=subsurface_float platform_name=Argo Float platform_vocabulary=IOOS Platform Vocabulary processing_level=Argo data are received via satellite transmission, decoded and assembled at national DACs. These DACs apply a set of automatic quality tests (RTQC) to the data, and quality flags are assigned accordingly. In the delayed-mode process (DMQC), data are subjected to visual examination and are re-flagged where necessary. For the float data affected by sensor drift, statistical tools and climatological comparisons are used to adjust the data for sensor drift when needed. For each float that has been processed in delayed-mode, the OWC method (Owens and Wong, 2009; Cabanes et al., 2016) is run with four different sets of spatial and temporal decorrelation scales and the latest available reference dataset. If the salinity adjustments obtained from the four runs all differ significantly from the existing adjustment, then the salinity data from the float are re-examined and a new adjustment is suggested if necessary. The usual practice is to examine the profiles in delayed-mode initially about 12 months after they are collected, and then revisit several times as more data from the floats are obtained (see details in Wong et al., 2020). 
program=Understanding Gulf Ocean Systems (UGOS) project=National Academy of Science Understanding Gulf Ocean Systems 'LC-Floats - Near Real-time Hydrography and Deep Velocity in the Loop Current System using Autonomous Profilers' Program references=http://www.argodatamgt.org/Documentation sea_name=Gulf of Mexico source=Argo float sourceUrl=(local files) Southernmost_Northing=23.97186 standard_name_vocabulary=CF Standard Name Table v67 subsetVariables=CYCLE_NUMBER, DIRECTION, DATA_MODE, time, JULD_QC, JULD_LOCATION, latitude, longitude, POSITION_QC, CONFIG_MISSION_NUMBER, PROFILE_PRES_QC, PROFILE_TEMP_QC, PROFILE_PSAL_QC time_coverage_duration=P0000-00-00T00:00:00 time_coverage_end=2020-06-04T12:57:59Z time_coverage_resolution=P0000-00-00T00:00:00 time_coverage_start=2020-06-04T12:57:59Z user_manual_version=3.2 Westernmost_Easting=-80.7381
Phsyical Oceanographic Circulation Study _NCProperties=version=2,netcdf=4.6.3,hdf5=1.10.4 acknowledgement=N/A cdm_altitude_proxy=PRES cdm_data_type=Profile cdm_profile_variables=DIRECTION,JULD_QC,JULD_LOCATION,latitude,longitude,POSITION_QC,PROFILE_PRES_QC,PROFILE_PSAL_QC,PROFILE_TEMP_QC,PRES,PRES_QC,PRES_ADJUSTED,PRES_ADJUSTED_QC,PRES_ADJUSTED_ERROR,PSAL,PSAL_QC,PSAL_ADJUSTED,PSAL_ADJUSTED_QC,PSAL_ADJUSTED_ERROR,TEMP,TEMP_QC,TEMP_ADJUSTED,TEMP_ADJUSTED_QC,TEMP_ADJUSTED_ERROR,profile comment=N/A contributor_email=devops@rpsgroup.com contributor_name=RPS contributor_role=editor contributor_role_vocabulary=https://vocab.nerc.ac.uk/collection/G04/current/ contributor_url=rpsgroup.com Conventions=ACDD-1.3, CF-1.7, IOOS-1.2 date_metadata_modified=2021-03-15T14:51:42.035312 Easternmost_Easting=-91.8529 featureType=Profile geospatial_bounds=MULTIPOINT (-91.98566 25.96674, -91.97807 25.98228, -91.9726 26.00366, -91.97461 26.02827, -91.97528 26.05012, -91.94217 26.16566, -91.8529 26.33388, -91.96507 26.42177, -92.09139 26.54617, -92.27458 26.631 , -92.20681 26.70364, -92.31983 26.77333, -92.12682 26.80391, -91.98944 26.78424, -92.09974 26.68598, -92.09822 26.57188, -92.19828 26.50321, -92.35713 26.3762 , -92.82355 26.28237, -93.23445 26.23658, -93.63897 26.22036, -95.03909 26.08702, -95.30309 26.105 , -95.57178 25.82772, -95.91577 25.47436, -96.01818 25.40922, -96.13509 25.33579, -95.99375 25.20764, -96.00343 25.09988, -96.22206 24.69119, -96.11989 24.48749, -95.9642 24.53761, -96.03794 24.42939, -96.00763 24.08652, -96.04693 24.03853, -96.19311 23.80906, -96.25197 23.64117, -95.98351 23.76981, -95.62223 23.98807, -95.58291 24.07059, -95.93292 24.2792 , -96.07919 24.17341, -95.56488 23.87719, -95.32403 24.39069, -95.0897 24.74174, -95.02783 24.8449 , -95.28647 25.05364, -95.03694 25.18664, -94.83349 24.87683, -94.90457 25.18515, -94.7876 25.1155 , -94.57192 24.25851, -94.32748 24.61555, -94.58534 24.65175, -94.2076 25.22078, -93.15773 25.13793, -92.85737 25.12772, -92.97108 
24.91901, -93.15694 24.6478 , -93.36139 24.61949, -93.7884 24.61305) geospatial_bounds_crs=EPSG:4326 geospatial_bounds_vertical_crs=EPSG:5831 geospatial_lat_max=26.80391 geospatial_lat_min=23.64117 geospatial_lat_units=degrees_north geospatial_lon_max=-91.8529 geospatial_lon_min=-96.25197 geospatial_lon_units=degrees_east geospatial_vertical_positive=down history=_prof.nc and _meta.nc concatenated and enhanced metadata by RPS 2021-03-15T14:51:42.035301 id=4901599 infoUrl=https://gcoos.org institution=US Argo (US GDAC) instrument=US ARGO Profiler naming_authority=edu.tamucc.gulfhub Northernmost_Northing=26.80391 note_CHAR_variables=RPS METADATA ENHANCEMENT NOTE Variables of data type 'CHAR' have been altered by the xarray and netCDF4-python libraries to contain an extra dimension (often denoted as 'string1'). This is due to an underlying issue in the libraries: https://github.com/pydata/xarray/issues/1977. Upon examination, one will find the data has not been altered but only changed shape. We realize this is sub-optimal and apologize for any inconveniences this may cause. platform=subsurface_float platform_id=4901599 platform_name=US Argo APEX Float 4901599 platform_vocabulary=GCMD Keywords Version 8.7 processing_level=Data QA/QC performed by USARGO program=Deep Langrangian Observations in the Gulf of Mexico (funding: BOEM) project=Deep Circulation in the Gulf of Mexico: A Lagrangian Study references=https://espis.boem.gov/final%20reports/5471.pdf source=observation sourceUrl=(local files) Southernmost_Northing=23.64117 standard_name_vocabulary=CF Standard Name Table v72 time_coverage_duration=P0001-06-24T11:40:49 time_coverage_end=2015-11-16T13:03:00Z time_coverage_resolution=P0000-00-00T10:55:10 time_coverage_start=2014-04-23T01:22:11Z Westernmost_Easting=-96.25197
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset has been retrieved from the Copernicus Climate Data Store (https://cds.climate.copernicus.eu/#!/home) and is meant to be used for teaching purposes only. The data retrieved were split per year and concatenated to create two separate files. These two files were then converted from GRIB format to netCDF using xarray (http://xarray.pydata.org/en/stable/). This dataset is used in the Galaxy training on "Visualize Climate data with Panoply in Galaxy".
See https://training.galaxyproject.org/ (topic: climate) for more information.
The Python code below shows how the data was retrieved from the CDS:
import cdsapi

c = cdsapi.Client()
c.retrieve(
    'ecv-for-climate-change',
    {
        'variable': [
            'precipitation', 'sea_ice_cover', 'surface_air_temperature',
        ],
        'product_type': 'monthly_mean',
        'time_aggregation': '1_month',
        'year': [
            '1979', '2018',
        ],
        'month': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
        ],
        'origin': 'era5',
        'format': 'zip',
    },
    'download.zip')
The dataset contains the following variables:

- t2m: Temperature at 2 meters (2m_temperature, Celsius degrees)
- ssrd: Surface solar radiation (surface_solar_radiation_downwards, Watt per square meter)
- ssrdc: Surface solar radiation clear-sky (surface_solar_radiation_downward_clear_sky, Watt per square meter)
- ro: Runoff (runoff, millimeters)

There is also a set of derived variables:

- ws10: Wind speed at 10 meters (derived from 10m_u_component_of_wind and 10m_v_component_of_wind, meters per second)
- ws100: Wind speed at 100 meters (derived from 100m_u_component_of_wind and 100m_v_component_of_wind, meters per second)
- CS: Clear-sky index (the ratio between the solar radiation and the clear-sky solar radiation)
- HDD/CDD: Heating/Cooling Degree Days (derived from 2-meter temperature following the EUROSTAT definition)

For each variable there are 367 440 hourly samples (from 01-01-1980 00:00:00 to 31-12-2021 23:00:00) for 34/115/309 regions (NUTS 0/1/2). The data is provided in two formats:

- NetCDF version 4 (all the variables hourly, and CDD/HDD daily). NOTE: the variables are stored as int16 using a scale_factor to minimise file size.
- Comma Separated Value ("single index" format for all the variables and time frequencies, and "stacked" only for daily and monthly). All the CSV files for each variable are stored in a zipped file.

## Methodology

The time series have been generated using the following workflow:

1. The NetCDF files are downloaded from the Copernicus Data Store from the "ERA5 hourly data on single levels from 1979 to present" dataset.
2. The data is read in R with the climate4r packages and aggregated using the function /get_ts_from_shp from panas. All the variables are aggregated at the NUTS boundaries using the average, except for runoff, which is the sum of all grid points within the regional/national borders.
3. The derived variables (wind speed, CDD/HDD, clear-sky) are computed and all the CSV files are generated using R.
4. The NetCDF files are created using xarray in Python 3.8.

## Example notebooks

In the folder notebooks on the associated GitHub repository there are two Jupyter notebooks which show how to deal effectively with the NetCDF data in xarray and how to visualise it in several ways using matplotlib or the enlopy package. There are currently two notebooks:

- exploring-ERA-NUTS: shows how to open the NetCDF files (with Dask) and how to manipulate and visualise them.
- ERA-NUTS-explore-with-widget: explores the datasets interactively with Jupyter and ipywidgets.

The notebook exploring-ERA-NUTS is also available rendered as HTML.

## Additional files

In the folder additional files on the associated GitHub repository there is a map showing the spatial resolution of the ERA5 reanalysis and a CSV file specifying the number of grid points with respect to each NUTS0/1/2 region.

## License

This dataset is released under the CC-BY-4.0 license.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
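The int16-plus-scale_factor packing mentioned in the NetCDF note can be sketched as follows (the variable name and values here are illustrative, not taken from the actual files):

```python
import numpy as np
import xarray as xr

# Pack a float variable as int16 with a scale factor to shrink the file
ds = xr.Dataset({"ws10": ("time", np.array([3.21, 4.87, 5.02]))})
enc = {"ws10": {"dtype": "int16", "scale_factor": 0.01, "_FillValue": -32768}}
ds.to_netcdf("packed.nc", encoding=enc)

# On read, xarray applies scale_factor automatically and returns floats
back = xr.open_dataset("packed.nc")
print(back["ws10"].values)
```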
License information was derived automatically
This dataset provides global daily estimates of Root-Zone Soil Moisture (RZSM) content at 0.25° spatial grid resolution, derived from gap-filled merged satellite observations from 14 passive satellite sensors operating in the microwave domain of the electromagnetic spectrum. Data is provided from January 1991 to December 2023.
This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/. Operational implementation is supported by the Copernicus Climate Change Service implemented by ECMWF through C3S2 312a/313c.
This dataset is used by Hirschi et al. (2025) to assess recent summer drought trends in Switzerland.
ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations from various microwave satellite remote sensing sensors (Dorigo et al., 2017, 2024; Gruber et al., 2019). This version of the dataset uses the PASSIVE record as input, which contains only observations from passive (radiometer) measurements (scaling reference AMSR-E). The surface observations are gap-filled using a univariate interpolation algorithm (Preimesberger et al., 2025). The gap-filled passive observations serve as input for an exponential filter based method to assess soil moisture in different layers of the root-zone of soil (0-200 cm) following the approach by Pasik et al. (2023). The final gap-free root-zone soil moisture estimates based on passive surface input data are provided here at 4 separate depth layers (0-10, 10-40, 40-100, 100-200 cm) over the period 1991-2023.
You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following command will download and extract the complete data set to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash
# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.ac.at/records/8dda4-xne96/files"

# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped into one subdirectory per year (YYYY); each subdirectory contains one netCDF image file for each day (DD) and month (MM), on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). The file names follow this convention:
ESA_CCI_PASSIVERZSM-YYYYMMDD000000-fv09.1.nc
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:
Additional information for each variable is given in the netCDF attributes.
These data can be read by any software that supports the Climate and Forecast (CF) metadata conventions for netCDF files, such as:
Please see the ESA CCI Soil Moisture science data records community for more records based on ESA CCI SM.
These data are ocean profile data measured by profiling Argo S2A floats at a specific latitude, longitude, and date nominally from the surface to 2000 meters depth. Pressure, in situ temperature (ITS-90), and practical salinity are provided at 1-m increments through the water column. Argo data from Gulf of Mexico (GOM) LC1 (9 floats) and LC2 (12 floats) were delayed mode quality controlled and submitted to Global Data Assembly Centers (GDACs) in May 2020. All available profiles are planned to be revisited and evaluated in early 2021. Float no. 4903233 started showing drift in salinity at profile no. 77, and the salinity data will be carefully examined with a new adjustment in early 2021. _NCProperties=version=2,netcdf=4.6.3,hdf5=1.10.4 cdm_altitude_proxy=PRES cdm_data_type=Profile cdm_profile_variables=profile comment=free text contributor_email=devops@rpsgroup.com contributor_name=RPS contributor_role=editor contributor_role_vocabulary=https://vocab.nerc.ac.uk/collection/G04/current/ contributor_url=https://www.rpsgroup.com/ Conventions=CF-1.7, ACDD-1.3, IOOS-1.2, Argo-3.2, COARDS date_metadata_modified=2020-12-22T15:54:25Z Easternmost_Easting=-89.40393 featureType=Profile geospatial_bounds=POINT (-89.40393 24.72764) geospatial_bounds_crs=EPSG:4326 geospatial_lat_max=24.72764 geospatial_lat_min=24.72764 geospatial_lat_units=degrees_north geospatial_lon_max=-89.40393 geospatial_lon_min=-89.40393 geospatial_lon_units=degrees_east history=2020-06-20T17:01:00Z creation id=R4903259_055 infoUrl=http://www.argodatamgt.org/Documentation institution=GCOOS instrument=Argo instrument_vocabulary=GCMD Earth Science Keywords. Version 5.3.3 keywords_vocabulary=GCMD Science Keywords naming_authority=edu.tamucc.gulfhub Northernmost_Northing=24.72764 note_CHAR_variables=RPS METADATA ENHANCEMENT NOTE Variables of data type 'CHAR' have been altered by the xarray and netCDF4-python libraries to contain an extra dimension (often denoted as 'string1'). 
This is due to an underlying issue in the libraries: https://github.com/pydata/xarray/issues/1977. Upon examination, one will find the data has not been altered but only changed shape. We realize this is sub-optimal and apologize for any inconveniences this may cause. note_FillValue=RPS METADATA ENHANCEMENT NOTE Many variables in this dataset are of type 'char' and have a '_FillValue' attribute which is interpreted through NumPy as 'b', an empty byte string. This causes serialization issues. As a result, all variables of type 'char' with '_FillValue = b' have had the _FillValue attribute removed to avoid serialization conflicts. However, no data has been changed, so the _FillValue is still "b' '". platform=subsurface_float platform_name=Argo Float platform_vocabulary=IOOS Platform Vocabulary processing_level=Argo data are received via satellite transmission, decoded and assembled at national DACs. These DACs apply a set of automatic quality tests (RTQC) to the data, and quality flags are assigned accordingly. In the delayed-mode process (DMQC), data are subjected to visual examination and are re-flagged where necessary. For the float data affected by sensor drift, statistical tools and climatological comparisons are used to adjust the data for sensor drift when needed. For each float that has been processed in delayed-mode, the OWC method (Owens and Wong, 2009; Cabanes et al., 2016) is run with four different sets of spatial and temporal decorrelation scales and the latest available reference dataset. If the salinity adjustments obtained from the four runs all differ significantly from the existing adjustment, then the salinity data from the float are re-examined and a new adjustment is suggested if necessary. The usual practice is to examine the profiles in delayed-mode initially about 12 months after they are collected, and then revisit several times as more data from the floats are obtained (see details in Wong et al., 2020). 
program=Understanding Gulf Ocean Systems (UGOS) project=National Academy of Science Understanding Gulf Ocean Systems 'LC-Floats - Near Real-time Hydrography and Deep Velocity in the Loop Current System using Autonomous Profilers' Program references=http://www.argodatamgt.org/Documentation sea_name=Gulf of Mexico source=Argo float sourceUrl=(local files) Southernmost_Northing=24.72764 standard_name_vocabulary=CF Standard Name Table v67 subsetVariables=CYCLE_NUMBER, DIRECTION, DATA_MODE, time, JULD_QC, JULD_LOCATION, latitude, longitude, POSITION_QC, CONFIG_MISSION_NUMBER, PROFILE_PRES_QC, PROFILE_TEMP_QC, PROFILE_PSAL_QC time_coverage_duration=P0000-00-00T00:00:00 time_coverage_end=2020-06-15T16:02:33Z time_coverage_resolution=P0000-00-00T00:00:00 time_coverage_start=2020-06-15T16:02:33Z user_manual_version=3.2 Westernmost_Easting=-89.40393
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IAGOS-CARIBIC_MS_files_collection_20250711 contains merged IAGOS-CARIBIC data, on a 10s grid (CARIBIC-1 and CARIBIC-2; <https://www.caribic-atmospheric.com/>). There is one netCDF (version 4) file per IAGOS-CARIBIC flight. Files were generated from NASA Ames 1001 source files. For detailed content information, see global and variable attributes. Global attribute `na_file_header` contains the original NASA Ames file header as an array of strings.
The data set covers 22 years of CARIBIC data from 1997 to 2020, flight numbers 1 to 591. There is no data available after 2020. Also, there is no data available for the following flight numbers within the [1..591] range:
CARIBIC-1 data only contains a subset of the variables found in CARIBIC-2 data files. To distinguish those two campaigns, use the global attribute 'mission'.
netCDF v4, created with xarray, <https://docs.xarray.dev/en/stable/>. Compression: zlib, level 5. Metadata conventions: CF-1.10, ACDD-1.3 (see also 'comment' global attribute).
See `CARIBIC-MS_files_species-and-contributors.csv` in the zip archive.
This dataset is also available via the KIT-IMKASF THREDDS server, <https://thredds.atmohub.kit.edu/thredds/catalog/iagos-caribic/catalog.html>.
GNU General Public License v3.0: https://www.gnu.org/licenses/gpl-3.0.html
NetCDF file of a model simulation of the former British Irish Ice Sheet. Variables:

- thk: ice thickness
- uvel: ice velocity in the x direction
- vvel: ice velocity in the y direction
- mask: a mask of ice cover
The folder is provided as a .tar.gz file. This can be uncompressed on Linux using a command such as `tar -xf archive.tar.gz`, or on Windows using 7-Zip.
Within the folder is output from a numerical model simulation of the former British Irish Ice Sheet, modelled using PISM: https://www.pism.io/
The netcdf file contains several variables, output every 100 years:
This can be viewed using the ncview software and manipulated in python using packages such as netCDF4 and xarray.
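As a sketch of the xarray route, a snapshot can be pulled out of output like this (the grid, variables, and values below are fabricated stand-ins for the actual PISM output, which is written every 100 model years):

```python
import numpy as np
import xarray as xr

# Illustrative stand-in for the PISM output file: thk and mask on a
# small model grid, one snapshot every 100 model years.
time = np.arange(0, 300, 100)
ds = xr.Dataset(
    {
        "thk": (("time", "y", "x"), np.zeros((3, 2, 2))),
        "mask": (("time", "y", "x"), np.ones((3, 2, 2), dtype=np.int8)),
    },
    coords={"time": time, "y": [0.0, 1.0], "x": [0.0, 1.0]},
)

# Pull out the ice-thickness snapshot at model year 200
thk_200 = ds["thk"].sel(time=200)
print(thk_200.shape)  # (2, 2)
```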
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wave and sea surface temperature measurements collected by a Sofar Spotter buoy in 2023. The buoy was deployed on July 27, 2023 at 11:30 UTC northwest of Culebra Island, Puerto Rico (18.3878 N, 65.3899 W) and recovered on Nov 5, 2023 at 12:45 UTC. Data are saved here in netCDF format, organized by month, and include directional wave statistics, GPS, and SST measurements at 30-minute intervals. Figures produced from these data are provided here as well; they include timeseries of wave height/period/direction and SST, GPS location, wave roses, and directional spectra. Additionally, raw CSV files from the Spotter's memory card can be found below. NetCDF files can be read in Python using the netCDF4 or xarray packages, or in MATLAB using the "ncread()" command.
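A minimal sketch of working with the 30-minute record in xarray, using synthetic data (the variable name `sst` and the monthly file pattern are assumptions, not taken from the actual files):

```python
import numpy as np
import pandas as pd
import xarray as xr

# In practice, open the monthly files, e.g. (pattern is hypothetical):
#   ds = xr.open_mfdataset("spotter_2023-*.nc")
# Synthetic 30-minute SST series starting at the deployment time.
times = pd.date_range("2023-07-27T11:30", periods=96, freq="30min")
sst = xr.DataArray(
    28.0 + 0.5 * np.sin(np.arange(96) / 10.0),
    dims="time",
    coords={"time": times},
)
ds = xr.Dataset({"sst": sst})

# Daily mean SST from the 30-minute record.
daily_sst = ds["sst"].resample(time="1D").mean()
```

The same resampling works for the wave statistics variables in the monthly files.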
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset used in the Galaxy Pangeo tutorials on Xarray.
Data are in netCDF format from the Copernicus Atmosphere Monitoring Service, specifically the PM2.5 (particulate matter < 2.5 μm) 4-day forecast from December 22, 2021. This dataset is very small, so there is no need to parallelize the data analysis. Parallel data analysis with Pangeo is not covered in this tutorial and will make use of another dataset.
This dataset is not meant to be useful for scientific studies.
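The kind of xarray selection the tutorial dataset supports can be sketched as follows; the variable name `pm2p5` and coordinate names here are assumptions for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for the CAMS PM2.5 forecast; names are hypothetical. In practice:
#   ds = xr.open_dataset("CAMS-PM2_5-20211222.nc")
time = pd.date_range("2021-12-22", periods=97, freq="h")  # 4-day hourly forecast
ds = xr.Dataset(
    {"pm2p5": (("time", "lat", "lon"),
               np.random.default_rng(1).random((97, 3, 4)))},
    coords={"time": time, "lat": [40.0, 45.0, 50.0], "lon": [0.0, 5.0, 10.0, 15.0]},
)

# Select the field 48 hours into the forecast and average over the domain.
field_48h = ds["pm2p5"].sel(time=np.datetime64("2021-12-24T00:00"))
domain_mean = float(field_48h.mean())
```

Because the dataset is small, such selections run instantly in memory with no parallelization.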
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present a globally consistent, satellite-derived dataset of CO2 enhancement (ΔXCO2), quantifying the spatially resolved excess in atmospheric CO2 concentrations as a collective consequence of anthropogenic emissions and terrestrial carbon uptake. This dataset is generated from the deviations of NASA's OCO-3 satellite retrievals, comprising 54 million observations across more than 200 countries from 2019 to 2023.
Dear reviewers, please download the datasets here and access using the password enclosed in the review documents. Many thanks!
Data Descriptions -----------------------------------------
# install prerequisites
!pip install netcdf4
!pip install h5netcdf

# read CO2 enhancement data
import xarray as xr
fn = './CO2_Enhancements_Global.nc'
data = xr.open_dataset(fn)
type(data)
Please cite at least one of the following for any use of the CO2E dataset.
Zhou, Y.*, Fan, P., Liu, J., Xu, Y., Huang, B., Webster, C. (2025). GloCE v1.0: Global CO2 Enhancement Dataset 2019-2023 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15209825
Fan, P., Liu, J., Xu, Y., Huang, B., Webster, C., & Zhou, Y*. (Under Review) A global dataset of CO2 enhancements during 2019-2023.
For any data inquiries, please email Yulun Zhou at yulunzhou@hku.hk.
These data are ocean profile data measured by profiling Argo S2A floats at a specific latitude, longitude, and date nominally from the surface to 2000 meters depth. Pressure, in situ temperature (ITS-90), and practical salinity are provided at 1-m increments through the water column. Argo data from Gulf of Mexico (GOM) LC1 (9 floats) and LC2 (12 floats) were delayed mode quality controlled and submitted to Global Data Assembly Centers (GDACs) in May 2020. All available profiles are planned to be revisited and evaluated in early 2021. Float no. 4903233 started showing drift in salinity at profile no. 77, and the salinity data will be carefully examined with a new adjustment in early 2021. _NCProperties=version=2,netcdf=4.6.3,hdf5=1.10.4 cdm_altitude_proxy=PRES cdm_data_type=Profile cdm_profile_variables=profile comment=free text contributor_email=devops@rpsgroup.com contributor_name=RPS contributor_role=editor contributor_role_vocabulary=https://vocab.nerc.ac.uk/collection/G04/current/ contributor_url=https://www.rpsgroup.com/ Conventions=CF-1.7, ACDD-1.3, IOOS-1.2, Argo-3.2, COARDS date_metadata_modified=2020-12-22T15:54:25Z Easternmost_Easting=-86.47535 featureType=Profile geospatial_bounds=POINT (-86.47535 27.82717) geospatial_bounds_crs=EPSG:4326 geospatial_lat_max=27.82717 geospatial_lat_min=27.82717 geospatial_lat_units=degrees_north geospatial_lon_max=-86.47535 geospatial_lon_min=-86.47535 geospatial_lon_units=degrees_east history=2020-02-19T18:01:10Z creation id=R4903238_048 infoUrl=http://www.argodatamgt.org/Documentation institution=GCOOS instrument=Argo instrument_vocabulary=GCMD Earth Science Keywords. Version 5.3.3 keywords_vocabulary=GCMD Science Keywords naming_authority=edu.tamucc.gulfhub Northernmost_Northing=27.82717 note_CHAR_variables=RPS METADATA ENHANCEMENT NOTE Variables of data type 'CHAR' have been altered by the xarray and netCDF4-python libraries to contain an extra dimension (often denoted as 'string1'). 
This is due to an underlying issue in the libraries: https://github.com/pydata/xarray/issues/1977. Upon examination, one will find the data has not been altered but only changed shape. We realize this is sub-optimal and apologize for any inconveniences this may cause. note_FillValue=RPS METADATA ENHANCEMENT NOTE Many variables in this dataset are of type 'char' and have a '_FillValue' attribute which is interpreted through NumPy as 'b', an empty byte string. This causes serialization issues. As a result, all variables of type 'char' with '_FillValue = b' have had the _FillValue attribute removed to avoid serialization conflicts. However, no data has been changed, so the _FillValue is still "b' '". platform=subsurface_float platform_name=Argo Float platform_vocabulary=IOOS Platform Vocabulary processing_level=Argo data are received via satellite transmission, decoded and assembled at national DACs. These DACs apply a set of automatic quality tests (RTQC) to the data, and quality flags are assigned accordingly. In the delayed-mode process (DMQC), data are subjected to visual examination and are re-flagged where necessary. For the float data affected by sensor drift, statistical tools and climatological comparisons are used to adjust the data for sensor drift when needed. For each float that has been processed in delayed-mode, the OWC method (Owens and Wong, 2009; Cabanes et al., 2016) is run with four different sets of spatial and temporal decorrelation scales and the latest available reference dataset. If the salinity adjustments obtained from the four runs all differ significantly from the existing adjustment, then the salinity data from the float are re-examined and a new adjustment is suggested if necessary. The usual practice is to examine the profiles in delayed-mode initially about 12 months after they are collected, and then revisit several times as more data from the floats are obtained (see details in Wong et al., 2020). 
program=Understanding Gulf Ocean Systems (UGOS) project=National Academy of Science Understanding Gulf Ocean Systems 'LC-Floats - Near Real-time Hydrography and Deep Velocity in the Loop Current System using Autonomous Profilers' Program references=http://www.argodatamgt.org/Documentation sea_name=Gulf of Mexico source=Argo float sourceUrl=(local files) Southernmost_Northing=27.82717 standard_name_vocabulary=CF Standard Name Table v67 subsetVariables=CYCLE_NUMBER, DIRECTION, DATA_MODE, time, JULD_QC, JULD_LOCATION, latitude, longitude, POSITION_QC, CONFIG_MISSION_NUMBER, PROFILE_PRES_QC, PROFILE_TEMP_QC, PROFILE_PSAL_QC time_coverage_duration=P0000-00-00T00:00:00 time_coverage_end=2020-02-14T15:59:10Z time_coverage_resolution=P0000-00-00T00:00:00 time_coverage_start=2020-02-14T15:59:10Z user_manual_version=3.2 Westernmost_Easting=-86.47535
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset comprises ECMWF ERA5-Land data covering 2014 to October 2022. The data are on a 0.1-degree grid and include fewer variables than the standard ERA5 reanalysis, but at a higher spatial resolution. All data were downloaded as NetCDF files from the Copernicus Data Store, converted to Zarr using xarray, and uploaded here. Each file covers one day and holds 24 timesteps.
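The per-day layout (24 hourly timesteps per file) and the NetCDF-to-Zarr conversion can be sketched as below; the variable name `t2m`, the grid, and file names are assumptions for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Build a synthetic stand-in for one daily ERA5-Land file: 24 hourly steps.
def make_day(day):
    time = pd.date_range(day, periods=24, freq="h")
    t2m = (("time", "latitude", "longitude"),
           np.random.default_rng(0).random((24, 3, 3)))
    return xr.Dataset({"t2m": t2m}, coords={"time": time})

# In practice the daily NetCDF files would be opened with xr.open_mfdataset.
days = [make_day(d) for d in ("2014-01-01", "2014-01-02")]
combined = xr.concat(days, dim="time")

# Conversion to Zarr (requires the zarr package), as done for this dataset:
# combined.to_zarr("era5_land.zarr")
```

Concatenating along `time` before writing gives one continuous Zarr store rather than one store per day.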
These data are ocean profile data measured by profiling Argo S2A floats at a specific latitude, longitude, and date nominally from the surface to 2000 meters depth. Pressure, in situ temperature (ITS-90), and practical salinity are provided at 1-m increments through the water column. Argo data from Gulf of Mexico (GOM) LC1 (9 floats) and LC2 (12 floats) were delayed mode quality controlled and submitted to Global Data Assembly Centers (GDACs) in May 2020. All available profiles are planned to be revisited and evaluated in early 2021. Float no. 4903233 started showing drift in salinity at profile no. 77, and the salinity data will be carefully examined with a new adjustment in early 2021. _NCProperties=version=2,netcdf=4.6.3,hdf5=1.10.4 cdm_altitude_proxy=PRES cdm_data_type=Profile cdm_profile_variables=profile comment=free text contributor_email=devops@rpsgroup.com contributor_name=RPS contributor_role=editor contributor_role_vocabulary=https://vocab.nerc.ac.uk/collection/G04/current/ contributor_url=https://www.rpsgroup.com/ Conventions=CF-1.7, ACDD-1.3, IOOS-1.2, Argo-3.2, COARDS date_metadata_modified=2020-12-22T15:54:25Z Easternmost_Easting=-86.06862 featureType=Profile geospatial_bounds=POINT (-86.06862 27.54357) geospatial_bounds_crs=EPSG:4326 geospatial_lat_max=27.54357 geospatial_lat_min=27.54357 geospatial_lat_units=degrees_north geospatial_lon_max=-86.06862 geospatial_lon_min=-86.06862 geospatial_lon_units=degrees_east history=2020-02-10T18:10:48Z creation id=R4903249_029 infoUrl=http://www.argodatamgt.org/Documentation institution=GCOOS instrument=Argo instrument_vocabulary=GCMD Earth Science Keywords. Version 5.3.3 keywords_vocabulary=GCMD Science Keywords naming_authority=edu.tamucc.gulfhub Northernmost_Northing=27.54357 note_CHAR_variables=RPS METADATA ENHANCEMENT NOTE Variables of data type 'CHAR' have been altered by the xarray and netCDF4-python libraries to contain an extra dimension (often denoted as 'string1'). 
This is due to an underlying issue in the libraries: https://github.com/pydata/xarray/issues/1977. Upon examination, one will find the data has not been altered but only changed shape. We realize this is sub-optimal and apologize for any inconveniences this may cause. note_FillValue=RPS METADATA ENHANCEMENT NOTE Many variables in this dataset are of type 'char' and have a '_FillValue' attribute which is interpreted through NumPy as 'b', an empty byte string. This causes serialization issues. As a result, all variables of type 'char' with '_FillValue = b' have had the _FillValue attribute removed to avoid serialization conflicts. However, no data has been changed, so the _FillValue is still "b' '". platform=subsurface_float platform_name=Argo Float platform_vocabulary=IOOS Platform Vocabulary processing_level=Argo data are received via satellite transmission, decoded and assembled at national DACs. These DACs apply a set of automatic quality tests (RTQC) to the data, and quality flags are assigned accordingly. In the delayed-mode process (DMQC), data are subjected to visual examination and are re-flagged where necessary. For the float data affected by sensor drift, statistical tools and climatological comparisons are used to adjust the data for sensor drift when needed. For each float that has been processed in delayed-mode, the OWC method (Owens and Wong, 2009; Cabanes et al., 2016) is run with four different sets of spatial and temporal decorrelation scales and the latest available reference dataset. If the salinity adjustments obtained from the four runs all differ significantly from the existing adjustment, then the salinity data from the float are re-examined and a new adjustment is suggested if necessary. The usual practice is to examine the profiles in delayed-mode initially about 12 months after they are collected, and then revisit several times as more data from the floats are obtained (see details in Wong et al., 2020). 
program=Understanding Gulf Ocean Systems (UGOS) project=National Academy of Science Understanding Gulf Ocean Systems 'LC-Floats - Near Real-time Hydrography and Deep Velocity in the Loop Current System using Autonomous Profilers' Program references=http://www.argodatamgt.org/Documentation sea_name=Gulf of Mexico source=Argo float sourceUrl=(local files) Southernmost_Northing=27.54357 standard_name_vocabulary=CF Standard Name Table v67 subsetVariables=CYCLE_NUMBER, DIRECTION, DATA_MODE, time, JULD_QC, JULD_LOCATION, latitude, longitude, POSITION_QC, CONFIG_MISSION_NUMBER, PROFILE_PRES_QC, PROFILE_TEMP_QC, PROFILE_PSAL_QC time_coverage_duration=P0000-00-00T00:00:00 time_coverage_end=2020-02-05T17:05:17Z time_coverage_resolution=P0000-00-00T00:00:00 time_coverage_start=2020-02-05T17:05:17Z user_manual_version=3.2 Westernmost_Easting=-86.06862