We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, which merges, extracts, and projects the GeoTIFF data, was performed with ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we used the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio extension for xarray; rasterio is a Python library for reading and writing GeoTIFF and other raster formats. Xarray handled data manipulation and the addition of metadata to the NetCDF file, while rioxarray was used to save the GeoTIFF data as NetCDF. These procedures produced three HydroShare resources (HS 3, HS 4 and HS 5) for sharing the state-scale LES datasets. Because of licensing constraints with ArcGIS Pro, a commercial GIS software, the Jupyter notebooks were developed on Windows.
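The GeoTIFF-to-NetCDF step described above might look roughly like the following sketch. It is not the authors' notebook code: the variable name `les`, the `title` attribute, and the use of `masked=True` are assumptions made for illustration, and rioxarray is imported inside the function so that the small path helper can be used on its own.

```python
from pathlib import Path


def netcdf_path_for(tif_path: str) -> Path:
    """Return the output path: same folder and stem, with a .nc suffix."""
    return Path(tif_path).with_suffix(".nc")


def geotiff_to_netcdf(tif_path: str, title: str) -> Path:
    """Convert one GeoTIFF to NetCDF and attach a global 'title' attribute."""
    # Imported lazily so the helper above works without rioxarray installed.
    import rioxarray

    out_path = netcdf_path_for(tif_path)
    # masked=True turns the raster's nodata value into NaN (assumed choice).
    da = rioxarray.open_rasterio(tif_path, masked=True)
    ds = da.to_dataset(name="les")  # "les" is a hypothetical variable name
    ds.attrs["title"] = title       # metadata added through xarray
    ds.to_netcdf(out_path)
    return out_path
```

Calling `geotiff_to_netcdf("utah_les.tif", "State-scale LES")` (a hypothetical file name) would write `utah_les.nc` next to the input file.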
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Test datasets for use with xmitgcm. These data were generated by running MITgcm in different configurations. Each tar archive contains a folder full of mds *.data / *.meta files.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.
A description of this dataset, including the methodology and validation results, is available at:
Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: an independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data, 17, 4305–4329, https://doi.org/10.5194/essd-17-4305-2025, 2025.
ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
Since the requirement of a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such gap-filling methods is that they rely only on the original observational record, without the need for ancillary variables or model-based information. Because of this intrinsic challenge, until now there has been no global, long-term univariate gap-filled product available. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments to fill satellite-like gaps introduced to GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gap-filling performance.
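To make the idea of univariate gap filling concrete, here is a minimal sketch of linear interpolation across missing values in a single time series. It only illustrates the simpler linear step mentioned above; it is not the DCT-PLS algorithm and not the code used to produce the dataset. The function name and the convention of marking gaps with NaN are assumptions.

```python
import math


def fill_gaps_linear(values):
    """Linearly interpolate interior NaN gaps; leading/trailing NaN stay NaN."""
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if not math.isnan(v)]
    # Walk over each pair of neighbouring valid observations.
    for left, right in zip(valid, valid[1:]):
        span = right - left
        for i in range(left + 1, right):
            w = (i - left) / span  # fractional position inside the gap
            filled[i] = filled[left] * (1 - w) + filled[right] * w
    return filled


series = [0.10, float("nan"), float("nan"), 0.40]
print(fill_gaps_linear(series))
```

Only the observed values of the same series are used, which is what "univariate" means here.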
You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete data set to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash
# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"
# Make sure the download directory exists
mkdir -p "$DOWNLOAD_DIR"
# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY); each subdirectory contains one netCDF image file per day (DD) and month (MM) on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). File names follow this convention:
ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
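The naming convention above can be turned into a small helper for locating the file for a given day. This is only a sketch: it assumes the yearly archives were extracted into subdirectories named YYYY under a common root, and the function names are made up for illustration.

```python
from datetime import date, datetime
from pathlib import Path

PREFIX = "ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-"
SUFFIX = "000000-fv09.1r1.nc"


def image_path(root, day):
    """Build <root>/<YYYY>/<file name> for one daily image (assumed layout)."""
    name = f"{PREFIX}{day:%Y%m%d}{SUFFIX}"
    return Path(root) / f"{day:%Y}" / name


def image_date(filename):
    """Recover the observation date from a file name that follows the convention."""
    stamp = Path(filename).name[len(PREFIX):len(PREFIX) + 8]
    return datetime.strptime(stamp, "%Y%m%d").date()


path = image_path("Downloads", date(2020, 7, 15))
print(path)
print(image_date(path))
```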
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:
Additional information for each variable is given in the netCDF attributes.
Changes in v09.1r1 (previous version was v09.1):
These data can be read by any software that supports netCDF files following the Climate and Forecast (CF) metadata conventions, such as:
The following records are all part of the ESA CCI Soil Moisture science data records community:
| 1 | ESA CCI SM MODELFREE Surface Soil Moisture Record | https://doi.org/10.48436/svr1r-27j77 |
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
onemil1_1.nc is the train dataset.
onemil1_2.nc is the validation dataset.
onemil2.nc, p240.nc, and p390.nc are the test datasets.
These files are in .nc format; use xarray with Python to interface with them.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the datasets and code used in the Zastrow and Glotch manuscript entitled "Distinct Lithologies in Jezero Crater, Mars".
Abstract
Jezero crater is the landing site for the Mars 2020 Perseverance rover. The Noachian-aged crater has undergone several periods of fluvial and lacustrine activity and many phyllosilicate- and carbonate-bearing rocks formed as a result. It also contains a large portion of the regional Nili Fossae olivine-carbonate unit. In this work, we performed spectral mixture analysis of visible/near-infrared hyperspectral imagery over Jezero. We modeled carbonate abundances up to ~35% and identified three distinct units containing different carbonate phases. Our work also suggests that the olivine in the regional unit is largely restricted to aeolian deposits overlying the carbonate-bearing rocks. The diversity of carbonate phases in Jezero points to multiple periods of carbonate formation under varying conditions.
Description of Repository Datasets
code_unmixing.zip:
data_input.zip:
data_models.zip:
data_roi.zip:
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
For this experiment we measured all of the data using heterodyne detection. File "fig2" contains the data for the experiment performed by storing the light in the GEM and then reading it out with EIT, together with Gaussian functions fitted to the collected time traces. Files "fig3_a" and "fig3_c_sim" contain the parameters of the Gaussian functions fitted to the experimental and simulated time traces, respectively. File "fig4" contains Fourier transforms of the experimentally measured time traces for impulses stored in EIT and read out in GEM. File "fig5" contains time traces for an impulse with two frequencies stored in GEM and read out in EIT, with fitted Gaussian functions, and the Fourier transform of time traces for two impulses stored in EIT and read out in GEM, with fitted Gaussian functions. Files were generated with the Xarray (v2025.3.1) Python library and are in the HDF5 format. They can be loaded using the Xarray function "xarray.load_dataset". We analyzed the data using the Python programming language. Data for the theoretical model were created using Python.
The “Quantum Optical Technologies” (FENG.02.01-IP.05-0017/23) project is carried out within the Measure 2.1 International Research Agendas programme of the Foundation for Polish Science, co-financed by the European Union under the European Funds for Smart Economy 2021--2027 (FENG). This research was funded in whole or in part by the National Science Centre, Poland, grant no. 2024/53/B/ST2/04040. Publication co-financed from the state budget funds (Poland), awarded by the Minister of Science under the “Perły Nauki II” program, project No. PN/02/0027/2023, co-financing amount PLN 239,998.00, total project value PLN 239,998.00.
These datasets are from tidal resource characterization measurements collected on the Terrasond High Energy Oceanographic Mooring (THEOM) from 1 July 2021 to 30 August 2021 (60 days) in Cook Inlet, Alaska. The lander was deployed at 60.7207031 N, 151.4294998 W in ~50 m of water. The dataset contains raw and processed data from the following two instruments:
A Nortek Signature 500 kHz acoustic Doppler current profiler (ADCP). Data were recorded at 4 Hz in the beam coordinate system from all 5 beams. Processed data have been averaged into 5-minute bins and converted to the East-North-Up (ENU) coordinate system.
A Nortek Vector acoustic Doppler velocimeter (ADV). Data were recorded at 8 Hz in the beam coordinate system. Processed data have been averaged into 5-minute bins and converted to the Streamwise - Cross-stream - Vertical (Principal) coordinate system. Turbulence statistics were calculated from 5-minute bins, with an FFT length equal to the bin length, and saved in the processed dataset.
Data were read and analyzed using the DOLfYN (version 1.0.2) Python package and saved in MATLAB (.mat) and netCDF (.nc) file formats. Files containing analyzed data (".b1") were standardized using the TSDAT (version 0.4.2) Python package. NetCDF files can be opened using DOLfYN (e.g., `dat = dolfyn.load("*.nc")`) or the xarray Python package (e.g., `dat = xarray.open_dataset("*.nc")`). All distances are in meters (e.g., depth, range, etc.), and all velocities are in m/s. See the DOLfYN documentation linked in the submission, and/or the Nortek documentation, for additional details.
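As a rough illustration of what averaging into 5-minute bins means for a 4 Hz or 8 Hz record, the sketch below averages consecutive, non-overlapping blocks of samples. It is a plain-Python stand-in and not DOLfYN's binning code; the function name and the choice to drop a trailing partial bin are assumptions.

```python
def bin_average(samples, sample_rate_hz, bin_seconds=300):
    """Average consecutive non-overlapping bins; a trailing partial bin is dropped."""
    per_bin = int(sample_rate_hz * bin_seconds)  # e.g. 4 Hz * 300 s = 1200 samples
    n_bins = len(samples) // per_bin
    return [
        sum(samples[i * per_bin:(i + 1) * per_bin]) / per_bin
        for i in range(n_bins)
    ]


# Two full 5-minute bins of 4 Hz data plus ten leftover samples.
velocity = [1.0] * 1200 + [3.0] * 1200 + [5.0] * 10
print(bin_average(velocity, sample_rate_hz=4))
```

At 8 Hz the same call would use 2400 samples per bin.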
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Author: Andrew J. Felton
Date: 5/5/2024
This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:
"Global estimates of the storage and transit time of water through vegetation"
Please note that 'turnover' and 'transit' are used interchangeably in this project.
Data information:
The data folder contains key data sets used for analysis. In particular:
"data/turnover_from_python/updated/annual/multi_year_average/average_annual_turnover.nc" contains a global array summarizing five-year (2016-2020) averages of annual transit, storage, canopy transpiration, and number of months of data. This is the core dataset for the analysis; however, each folder has much more data, including a dataset for each year of the analysis. Data are also available in separate .csv files for each land cover type. Other data for the minimum, monthly, and seasonal transit time can be found in their respective folders. These data were produced using the Python code found in the "supporting_code" folder, given the ease of working with .nc files and the EASE grid in the xarray Python module. R was used primarily for data visualization purposes. The remaining files in the "data" and "data/supporting_data" folders primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here.
Python scripts can be found in the "supporting_code" folder.
Each R script in this project has a particular function:
01_start.R: This script loads the R packages used in the analysis, sets the directory, and imports custom functions for the project. You can also load in the main transit time (turnover) datasets here using the source() function.
02_functions.R: This script contains the custom functions for this analysis, primarily to work with importing the seasonal transit data. Load this using the source() function in the 01_start.R script.
03_generate_data.R: This script is not necessary to run and is primarily for documentation. The main role of this code was to import and wrangle the data needed to calculate ground-based estimates of aboveground water storage.
04_annual_turnover_storage_import.R: This script imports the annual turnover and storage data for each landcover type. You load in these data from the 01_start.R script using the source() function.
05_minimum_turnover_storage_import.R: This script imports the minimum turnover and storage data for each landcover type. Minimum is defined as the lowest monthly estimate. You load in these data from the 01_start.R script using the source() function.
06_figures_tables.R: This is the main workhorse for figure/table production and supporting analyses. This script generates the key figures and summary statistics used in the study that then get saved in the manuscript_figures folder. Note that all maps were produced using Python code found in the "supporting_code" folder.
Simulation Data

The waveplate.hdf5 file stores the results of the FDTD simulation that are visualized in Fig. 3 b)-d). The simulation was performed using the Tidy3D Python library, whose methods are also used for data visualization. The following snippet can be used to visualize the data:

    import tidy3d as td
    import matplotlib.pyplot as plt

    sim_data: td.SimulationData = td.SimulationData.from_file("waveplate.hdf5")
    fig, axs = plt.subplots(1, 2, tight_layout=True, figsize=(12, 5))
    for fn, ax in zip(("Ex", "Ey"), axs):
        sim_data.plot_field("field_xz", field_name=fn, val="abs^2", ax=ax).set_aspect(1 / 10)
        # Raw strings keep the LaTeX \mu from being read as an escape sequence
        ax.set_xlabel(r"x [$\mu$m]")
        ax.set_ylabel(r"z [$\mu$m]")
    fig.show()

Measurement Data

Signal data used for plotting Fig. 4-6. The data is stored in NetCDF, a self-describing data format that is easy to manipulate using the Xarray Python library, specifically by calling xarray.open_dataset(). Three datasets are provided and structured as follows:

The electric_fields.nc dataset contains data displayed in Fig. 4. It has 3 data variables, corresponding to the signals themselves, as well as estimated Rabi frequencies and electric fields. The freq dimension is the x-axis and contains coordinates for the Probe field detuning in MHz. The n dimension labels different configurations of applied electric field, with the 0th one having no EHF field.

The detune.nc dataset contains data displayed in Fig. 6. It has 2 data variables, corresponding to the signals themselves, as well as estimated peak separations, multiplied by the coupling factor. The freq dimension is the same, while the detune dimension labels different EHF field detunings, from -100 to 100 MHz with a step of 10.

The waveplates.nc dataset contains data displayed in Fig. 5. It contains estimated Rabi frequencies calculated for different waveplate positions. The angles are stored in radians. There is the quarter- and half-waveplate to choose from.

Usage examples

Opening the dataset

    import matplotlib.pyplot as plt
    import numpy as np  # needed for np.sqrt in the Fig. 6 inset plot
    import xarray as xr

    electric_fields_ds = xr.open_dataset("data/electric_fields.nc")
    detuned_ds = xr.open_dataset("data/detune.nc")
    waveplates_ds = xr.open_dataset("data/waveplates.nc")
    sigmas_da = xr.open_dataarray("data/sigmas.nc")
    peak_heights_da = xr.open_dataarray("data/peak_heights.nc")

Plotting the Fig. 4 signals and printing params

    fig, ax = plt.subplots()
    electric_fields_ds["signals"].plot.line(x="freq", hue="n", ax=ax)
    print(f"Rabi frequencies [Hz]: {electric_fields_ds['rabi_freqs'].values}")
    print(f"Electric fields [V/m]: {electric_fields_ds['electric_fields'].values}")
    fig.show()

Plotting the Fig. 5 data

    (waveplates_ds["rabi_freqs"] ** 2).plot.scatter(x="angle", col="waveplate")

Plotting the Fig. 6 signals for chosen detunes

    fig, ax = plt.subplots()
    detuned_ds["signals"].sel(
        detune=[-100, -70, -40, 40, 70, 100]
    ).plot.line(x="freq", hue="detune", ax=ax)
    fig.show()

Plotting the Fig. 6 inset plot

    fig, ax = plt.subplots()
    detuned_ds["separations"].plot.scatter(x="detune", ax=ax)
    ax.plot(
        detuned_ds.detune,
        np.sqrt(detuned_ds.detune**2 + detuned_ds["separations"].sel(detune=0) ** 2),
    )
    fig.show()

Plotting the Fig. 7 calculated peak widths

    sigmas_da.plot.scatter()

Plotting the Fig. 8 calculated detuned smaller peak heights

    peak_heights_da.plot.scatter()
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.
Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection.
Each dataset uses the following coordinate conventions:
The following data files are provided:
T summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO
Additionally, two CSV files are provided for convenience:
imm: Total immigration flows
emi: Total emigration flows
net: Net migration
imm_pop: Total immigrant population (non-native-born)
emi_pop: Total emigrant population (living abroad)
mig_prev: Total origin-destination flows
mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth
Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate).
An ISO3 conversion table is also provided.
http://dcat-ap.ch/vocabulary/licenses/terms_by
This dataset contains the experimental results described in Bergfeld et al. (2025). It includes three Propagation Saw Test (PST) experiments, each approximately 9 m long, performed side-by-side on a 37° slope. For each PST, we provide the full field of view along the crack-propagation direction. For the second and third PSTs, we additionally provide close-up recordings focused on the weak layer where cracking occurred. All data are supplied as netCDF files containing displacement and strain measurements derived from Digital Image Correlation (DIC) analysis. Metadata describing dimensions and units are stored directly within the netCDF files. We recommend using the xarray package in Python to read and work with these datasets. All figures presented in Bergfeld et al. (2025) can be reproduced using the included Python scripts. Information about the snowpack is provided in PDF, pickle, and CAAML file formats.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Approximately two million thermal infrared spectra from the Emirates Mars Infrared Spectrometer (EMIRS) were utilized for this dataset [Ls ≈ 49° in Mars Year (MY) 36 through Ls ≈ 347° in MY 37; approx. May 2021–October 2024]. Results are stored as a self-describing NetCDF4 file (https://www.unidata.ucar.edu/software/netcdf/).
The root group contains retrieval output for each spectrum. Variable descriptions are included in the file archive. Coordinate variables associated with the observation are latitude, longitude, solar longitude, local true solar time, emission angle of the observation, solar incidence angle of the observation, and the spacecraft clock timestamp when the observation was made. Outputs from the aerosol retrieval are stored as data variables and include the surface temperature, water-ice optical depth, water-ice optical depth uncertainty, surface anisothermality "slope" correction parameter, surface emissivity correction parameter, and estimated water condensation level. Spectra included in this dataset are those observations which passed quality checks for the water-ice optical depth retrieval.
Additional uncertainty-weighted mean values for water-ice optical depth, averaged across various coordinate dimensions, are included as NetCDF4 groups. These groups correspond to selected figures shown in the accompanying manuscript for this dataset, and can be used to reconstruct the figures or compare against other datasets or results.
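For readers unfamiliar with uncertainty-weighted means, the sketch below shows one common form, inverse-variance weighting. Whether the archive uses exactly this weighting is an assumption; the accompanying manuscript defines the actual scheme. The function name and sample numbers are illustrative only.

```python
def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty (assumed scheme)."""
    weights = [1.0 / s**2 for s in sigmas]  # smaller uncertainty, larger weight
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    sigma_mean = (1.0 / total) ** 0.5
    return mean, sigma_mean


# Two hypothetical optical depths with different uncertainties.
mean, sigma = weighted_mean([0.10, 0.20], [0.01, 0.02])
print(mean, sigma)
```

The more certain observation dominates, so the result lies closer to 0.10 than to 0.20.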
As an example, the full dataset can be opened using Python and xarray (v2024.10.0 or greater) as an xarray DataTree with the following:

    import xarray as xr
    emirs_data = xr.open_datatree('EMIRS_tauice_dataset.nc')
Description of the retrieval and data in this archive can be found in the accompanying manuscript for this dataset: "The Full Diurnal Cycle of Mars Water-Ice Cloud Optical Depth in EMIRS Observations" by Samuel A. Atwood, Michael D. Smith, Michael J. Wolff, and Christopher S. Edwards, submitted to JGR-Planets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cropland Data Layer (CDL) data from the US Department of Agriculture's National Agricultural Statistics Service (NASS), subset spatially to cover the Snake River Basin, USA, for the years 2010-2017, inclusive. These are the raw data used to support initialization of the Janus agent-based model of land use and land cover change. The dataset was developed by downloading CDL data from the USDA NASS site for an area of interest encompassing the Snake River Basin for individual years from 2010-2017. Data were converted to a georeferenced GeoTIFF format using the Geospatial Data Abstraction Library (GDAL) command line interface. They were then concatenated into a single dataset using the rioxarray Python library and saved as a CF-compliant NetCDF4 file using the xarray Python library. Note that this file is saved with zlib compression level 1 and, therefore, users may experience a slowdown upon initial reading of the file.
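In xarray, per-variable compression is requested through the `encoding` argument of `to_netcdf`. The helper below builds such an encoding for zlib level 1. It is a sketch, not the authors' script; the variable name `cdl` and the output file name in the usage note are hypothetical.

```python
def zlib_encoding(variable_names, level=1):
    """Build an xarray `encoding` dict that enables zlib compression per variable."""
    return {name: {"zlib": True, "complevel": level} for name in variable_names}


# "cdl" is a hypothetical variable name used only for illustration.
print(zlib_encoding(["cdl"]))
```

With a dataset `ds` in hand, the call would be along the lines of `ds.to_netcdf("cdl_2010_2017.nc", encoding=zlib_encoding(ds.data_vars))`.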
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This database consists of a high-resolution village-level drought dataset for major Indian states, with monthly values for the past 43 years (1981 – 2022). It was created by utilising the CHIRPS precipitation and GLEAM evapotranspiration datasets. The GLEAM dataset is based on the well-recognised Priestley-Taylor equation, which estimates potential evapotranspiration (PET) from observations of surface net radiation and near-surface air temperature. The SPEI was calculated on spatial grids of 5x5 km at the 3-month time scale, which is suitable for agricultural drought monitoring. This high-resolution SPEI dataset was integrated with Indian village boundaries and the associated census attribute dataset. This allows researchers to perform multi-disciplinary investigations, e.g., climate migration modelling, drought hazards, and exposure assessment. The dataset was developed with potential users in mind. It can therefore be integrated into a GIS system for visualization (using the .mid/.mif format) and into Python programs for modelling and analysis (using .csv). For advanced analysis, I have also provided it in netCDF format, which can be read in Python using xarray or the netCDF4 library. More details are in the README.pdf file. Date Submitted: 2023-11-07 Issued: 2023-11-07
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
# ERA-NUTS (1980-2018)
This dataset contains a set of time-series of meteorological variables based on Copernicus Climate Change Service (C3S) ERA5 reanalysis. The data files can be downloaded from here while notebooks and other files can be found on the associated Github repository.
This data has been generated with the aim of providing hourly time-series of the meteorological variables commonly used for power system modelling and, more generally, for studies on energy systems.
An example of the analysis that can be performed with ERA-NUTS is shown in this video.
Important: this dataset is still a work in progress; we will add more analysis and variables in the near future. If you spot an error or something strange in the data, please tell us by sending an email or opening an Issue in the associated Github repository.
## Data
The time-series have hourly/daily/monthly frequency and are aggregated following the NUTS 2016 classification. NUTS (Nomenclature of Territorial Units for Statistics) is a European Union standard for referencing the subdivisions of countries (member states, candidate countries and EFTA countries).
This dataset contains NUTS0/1/2 time-series for the following variables obtained from the ERA5 reanalysis data (in brackets the name of the variable on the Copernicus Data Store and its unit of measure):
- t2m: 2-meter temperature (`2m_temperature`, Celsius degrees)
- ssrd: Surface solar radiation (`surface_solar_radiation_downwards`, Watt per square meter)
- ssrdc: Surface solar radiation clear-sky (`surface_solar_radiation_downward_clear_sky`, Watt per square meter)
- ro: Runoff (`runoff`, millimeters)
There is also a set of derived variables:
- ws10: Wind speed at 10 meters (derived from `10m_u_component_of_wind` and `10m_v_component_of_wind`, meters per second)
- ws100: Wind speed at 100 meters (derived from `100m_u_component_of_wind` and `100m_v_component_of_wind`, meters per second)
- CS: Clear-Sky index (the ratio between the solar radiation and the solar radiation clear-sky)
- HDD/CDD: Heating/Cooling Degree days (derived from 2-meter temperature following the EUROSTAT definition)
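The derived variables listed above can be sketched in a few lines. The actual values were computed in R, so this Python version is only an illustration; the function names are made up, and the degree-day thresholds (18/15 °C for heating, 21/24 °C for cooling) reflect my reading of the EUROSTAT definition and should be treated as an assumption.

```python
import math


def wind_speed(u, v):
    """Wind speed (m/s) from the u and v wind components."""
    return math.hypot(u, v)


def clear_sky_index(ssrd, ssrdc):
    """Ratio of surface solar radiation to its clear-sky value (NaN at night)."""
    return ssrd / ssrdc if ssrdc > 0 else float("nan")


def heating_degree_days(tm, threshold=15.0, base=18.0):
    """Daily HDD from mean temperature tm in Celsius (assumed thresholds)."""
    return base - tm if tm <= threshold else 0.0


def cooling_degree_days(tm, threshold=24.0, base=21.0):
    """Daily CDD from mean temperature tm in Celsius (assumed thresholds)."""
    return tm - base if tm >= threshold else 0.0


print(wind_speed(3.0, 4.0), clear_sky_index(300.0, 400.0))
print(heating_degree_days(10.0), cooling_degree_days(25.0))
```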
For each variable we have 350 599 hourly samples (from 01-01-1980 00:00:00 to 31-12-2019 23:00:00) for 34/115/309 regions (NUTS 0/1/2).
The data is provided in two formats:
- NetCDF version 4 (all the variables hourly and CDD/HDD daily). NOTE: the variables are stored as `int16` type using a `scale_factor` of 0.01 to minimise the size of the files.
- Comma Separated Value ("single index" format for all the variables and the time frequencies and "stacked" only for daily and monthly)
All the CSV files are stored in a zipped file for each variable.
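The `int16` plus `scale_factor` storage mentioned for the NetCDF files works as sketched below: values are divided by 0.01 and rounded to an integer on write, then multiplied by 0.01 on read. xarray applies this decoding automatically when a file is opened with its default settings, so the sketch is only meant to show the arithmetic. The function names are illustrative.

```python
SCALE_FACTOR = 0.01
INT16_MIN, INT16_MAX = -32768, 32767


def pack(value):
    """Convert a physical value to the integer stored on disk."""
    stored = int(round(value / SCALE_FACTOR))
    if not INT16_MIN <= stored <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in int16 at this scale")
    return stored


def unpack(stored):
    """Convert a stored integer back to a physical value."""
    return stored * SCALE_FACTOR


stored = pack(21.37)
print(stored, unpack(stored))
```

This is also why the hourly values carry two decimal digits: anything finer than 0.01 is lost when the value is rounded to an integer.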
## Methodology
The time-series have been generated using the following workflow:
1. The NetCDF files are downloaded from the Copernicus Data Store from the ERA5 hourly data on single levels from 1979 to present dataset
2. The data is read in R with the climate4r packages and aggregated using the function `/get_ts_from_shp` from panas. All the variables are aggregated at the NUTS boundaries using the average except for the runoff, which consists of the sum of all the grid points within the regional/national borders.
3. The derived variables (wind speed, CDD/HDD, clear-sky) are computed and all the CSV files are generated using R
4. The NetCDF are created using `xarray` in Python 3.7.
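The aggregation rule in step 2 (average for every variable except runoff, which is summed) can be written compactly. The real aggregation was done in R; this Python sketch only illustrates the rule, and the function name and the flat list of grid-point values are assumptions.

```python
from statistics import fmean


def aggregate_region(grid_values, variable):
    """Aggregate the grid points inside one NUTS region for a single time step."""
    if variable == "ro":
        # Runoff is the sum over all grid points within the borders.
        return sum(grid_values)
    # All other variables use the plain average.
    return fmean(grid_values)


print(aggregate_region([1.0, 2.0, 3.0], "t2m"))
print(aggregate_region([1.0, 2.0, 3.0], "ro"))
```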
NOTE: air temperature, solar radiation, runoff and wind speed hourly data have been rounded to two decimal digits.
## Example notebooks
In the folder `notebooks` on the associated Github repository there are two Jupyter notebooks which show how to work effectively with the NetCDF data in `xarray` and how to visualise them in several ways using matplotlib or the enlopy package.
There are currently two notebooks:
- exploring-ERA-NUTS: it shows how to open the NetCDF files (with Dask) and how to manipulate and visualise them.
- ERA-NUTS-explore-with-widget: interactively explore the datasets with Jupyter and ipywidgets.
The notebook `exploring-ERA-NUTS` is also available rendered as HTML.
## Additional files
In the folder `additional files` on the associated Github repository there is a map showing the spatial resolution of the ERA5 reanalysis and a CSV file specifying the number of grid points with respect to each NUTS0/1/2 region.
## License
This dataset is released under CC-BY-4.0 license.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data was prepared as input for the Selkie GIS-TE tool. This GIS tool aids site selection, logistics optimization and financial analysis of wave or tidal farms in the Irish and Welsh maritime areas. Read more here: https://www.selkie-project.eu/selkie-tools-gis-technoeconomic-model/
This research was funded by the Science Foundation Ireland (SFI) through MaREI, the SFI Research Centre for Energy, Climate and the Marine and by the Sustainable Energy Authority of Ireland (SEAI). Support was also received from the European Union's European Regional Development Fund through the Ireland Wales Cooperation Programme as part of the Selkie project.
File Formats
Results are presented in three file formats:
tif: Can be imported into GIS software (such as ArcGIS)
csv: Human-readable text format, which can also be opened in Excel
png: Image files that can be viewed in standard desktop software and give a spatial view of results
Input Data
All calculations use open-source data from the Copernicus store and the open-source software Python. The Python xarray library is used to read the data.
Hourly Data from 2000 to 2019
Wind - Copernicus ERA5 dataset, 17 by 27.5 km grid, 10m wind speed
Wave - Copernicus Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis dataset, 3 by 5 km grid
Accessibility
The maximum limits for Hs and wind speed are applied when mapping the accessibility of a site.
The Accessibility layer shows the percentage of time the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5) are below these limits for the month.
Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if
the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month.
Environmental data is from the Copernicus data store (https://cds.climate.copernicus.eu/). Wave hourly data is from the 'Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis' dataset.
Wind hourly data is from the ERA5 dataset.
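The accessibility calculation described in this section reduces to counting the hours in which both Hs and wind speed are below their limits. The sketch below assumes two equally long hourly series for one site and one month. It is not the Selkie tool's code, and the function name and sample numbers are illustrative.

```python
def accessibility_percent(hs, wind, hs_limit, wind_limit):
    """Percentage of timesteps with Hs and wind speed both below their limits."""
    if len(hs) != len(wind):
        raise ValueError("hs and wind must have the same number of timesteps")
    within = [h < hs_limit and w < wind_limit for h, w in zip(hs, wind)]
    return 100.0 * sum(within) / len(within)


# Four hypothetical hourly values of Hs (m) and wind speed (m/s).
hs = [1.0, 2.5, 1.5, 3.0]
wind = [10.0, 12.0, 16.0, 8.0]
print(accessibility_percent(hs, wind, hs_limit=2.0, wind_limit=15.0))
```

Only the first hour satisfies both limits here, so the result is 25.0.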
Availability
A device's availability to produce electricity depends on the device's reliability and the time to repair any failures. The repair time depends on weather
windows and other logistical factors (for example, the availability of repair vessels and personnel). A 2013 study by O'Connor et al. determined the
relationship between the accessibility and availability of a wave energy device. The resulting graph (see Fig. 1 of their paper) shows the correlation between
accessibility at Hs of 2m and wind speed of 15.0m/s and availability. This graph is used to calculate the availability layer from the accessibility layer.
The input value, accessibility, measures how accessible a site is for installation or operation and maintenance activities. It is the percentage time the
environmental conditions, i.e. the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5), are below operational limits.
Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined
by checking if the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total
number of hours for the month. Once the accessibility was known, the percentage availability was calculated using the O'Connor et al. graph of the relationship
between the two. The reliability of a mature technology was assumed.
Weather Window
The weather window availability is the percentage of possible x-duration windows where weather conditions (Hs, wind speed) are below maximum limits for the
given duration for the month.
The resolution of the wave dataset (0.05° × 0.05°) is higher than that of the wind dataset (0.25° × 0.25°), so the nearest wind value is used for each wave data point.
The weather window layer is at the resolution of the wave layer.
The first step in calculating the weather window for a particular set of inputs (Hs, wind speed and duration) is to calculate the accessibility at each timestep.
The accessibility is based on a simple boolean evaluation: are the wave and wind conditions within the required limits at the given timestep?
Once the time series of accessibility is calculated, the next step is to look for periods of sustained favourable environmental conditions, i.e. the weather
windows. Here all possible operating periods with a duration matching the required weather-window value are assessed to see if the weather conditions remain
suitable for the entire period. The percentage availability of the weather window is calculated based on the percentage of x-duration windows with suitable
weather conditions for their entire duration. The weather window availability can be considered as the probability of having the required weather window available
at any given point in the month.
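The window search described above can be sketched as a sliding count over the boolean accessibility series. The function name and the sample series are hypothetical, and the sketch assumes the input holds the hourly timesteps of a single month, so that windows do not cross month boundaries; the source does not say how such boundaries were handled.

```python
def weather_window_availability(accessible, window_hours):
    """Percentage of all possible windows of `window_hours` consecutive
    timesteps in which every timestep is accessible.

    `accessible` is a list of booleans, one per hourly timestep.
    """
    n_windows = len(accessible) - window_hours + 1
    if n_windows <= 0:
        return 0.0
    # Running count of accessible hours inside the current window.
    count = sum(accessible[:window_hours])
    good = 1 if count == window_hours else 0
    for start in range(1, n_windows):
        # Slide the window by one hour: add the new hour, drop the old one.
        count += accessible[start + window_hours - 1] - accessible[start - 1]
        if count == window_hours:
            good += 1
    return 100.0 * good / n_windows


# Hypothetical six-hour series: one inaccessible hour in the middle.
accessible = [True, True, True, False, True, True]
pct = weather_window_availability(accessible, window_hours=2)
print(pct)
```

Of the five possible two-hour windows in the sample, three are fully accessible, giving 60 %.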
Extreme Wind and Wave
The Extreme wave layers show the highest significant wave height expected to occur during the given return period. The Extreme wind layers show the highest wind speed expected to occur during the given return period.
To predict extreme values, we use Extreme Value Analysis (EVA). EVA focuses on the extreme part of the data and seeks to determine a model to fit this reduced
portion accurately. EVA consists of three main stages. The first stage is the selection of extreme values from a time series. The next step is to fit a model
that best approximates the selected extremes by determining the shape parameters for a suitable probability distribution. The model then predicts extreme values
for the selected return period. All calculations use the Python pyextremes library. Two methods are used: Block Maxima and Peaks Over Threshold.
The Block Maxima method selects the annual maxima and fits a Generalised Extreme Value Distribution (GEVD).
The Peaks Over Threshold method has two variable calculation parameters. The first is the percentile that values must exceed to be selected as extreme (0.9 or 0.998). The
second is the minimum time difference between extreme values for them to be considered independent (3 days). A Generalised Pareto Distribution is fitted to the selected
extremes and used to calculate the extreme value for the selected return period.
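The first EVA stage, selecting the extremes, can be illustrated without any library. The sketch below is not the pyextremes implementation: the function names, the linear-interpolation quantile and the declustering rule (exceedances no more than `min_gap` apart form one cluster, and only the cluster maximum is kept) are assumptions made for illustration. Fitting the distribution and computing return values are left out.

```python
from datetime import datetime, timedelta


def annual_maxima(times, values):
    """Block Maxima selection: largest value in each calendar year."""
    maxima = {}
    for t, v in zip(times, values):
        if t.year not in maxima or v > maxima[t.year]:
            maxima[t.year] = v
    return maxima


def quantile(values, q):
    """Linear-interpolation quantile of `values` for 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def peaks_over_threshold(times, values, q, min_gap=timedelta(days=3)):
    """Peaks Over Threshold selection with simple declustering."""
    threshold = quantile(values, q)
    peaks = []        # (time, value) of each cluster maximum
    current = None    # maximum of the cluster being built
    last_time = None  # time of the most recent exceedance
    for t, v in zip(times, values):
        if v <= threshold:
            continue
        if current is not None and t - last_time > min_gap:
            peaks.append(current)  # gap is long enough: close the cluster
            current = None
        if current is None or v > current[1]:
            current = (t, v)
        last_time = t
    if current is not None:
        peaks.append(current)
    return threshold, peaks


# Hypothetical daily series; q=0.6 is chosen only to suit this tiny sample.
times = [datetime(2000, 1, 1) + timedelta(days=i) for i in range(10)]
values = [1, 5, 6, 1, 1, 1, 1, 7, 1, 1]
maxima = annual_maxima(times, values)
threshold, peaks = peaks_over_threshold(times, values, q=0.6)
print(maxima, threshold, [v for _, v in peaks])
```

In the sample, the exceedances on consecutive days (5 and 6) fall into one cluster, so only 6 is kept, while the later value 7 is more than three days away and counts as an independent peak.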
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides global daily estimates of Root-Zone Soil Moisture (RZSM) content at 0.25° spatial grid resolution, derived from gap-filled merged satellite observations of 14 passive satellite sensors operating in the microwave domain of the electromagnetic spectrum. Data is provided from January 1991 to December 2023.
This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/. Operational implementation is supported by the Copernicus Climate Change Service implemented by ECMWF through C3S2 312a/313c.
This dataset is used by Hirschi et al. (2025) to assess recent summer drought trends in Switzerland.
Hirschi, M., Michel, D., Schumacher, D. L., Preimesberger, W., and Seneviratne, S. I.: Recent summer soil moisture drying in Switzerland based on measurements from the SwissSMEX network, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2025-416, in review, 2025.
ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations from various microwave satellite remote sensing sensors (Dorigo et al., 2017, 2024; Gruber et al., 2019). This version of the dataset uses the PASSIVE record as input, which contains only observations from passive (radiometer) measurements (scaling reference AMSR-E). The surface observations are gap-filled using a univariate interpolation algorithm (Preimesberger et al., 2025). The gap-filled passive observations serve as input for an exponential filter based method to assess soil moisture in different layers of the root-zone of soil (0-200 cm) following the approach by Pasik et al. (2023). The final gap-free root-zone soil moisture estimates based on passive surface input data are provided here at 4 separate depth layers (0-10, 10-40, 40-100, 100-200 cm) over the period 1991-2023.
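To illustrate the idea behind an exponential filter, the sketch below uses the commonly used recursive form, in which each new surface observation updates the previous filtered value with a gain that depends on the time step and a characteristic time length T. It is an assumption that this form matches the implementation of Pasik et al. (2023); the function name, the value of T and the sample series are hypothetical. Larger T values give a smoother, slower response, which is how such a filter represents deeper layers.

```python
import math


def exponential_filter(days, surface_sm, t_char):
    """Recursive exponential filter.

    days       -- observation times in days (increasing)
    surface_sm -- surface soil moisture at those times
    t_char     -- characteristic time length T in days (hypothetical value)
    """
    filtered = [surface_sm[0]]  # initialise with the first observation
    gain = 1.0
    for n in range(1, len(days)):
        dt = days[n] - days[n - 1]
        gain = gain / (gain + math.exp(-dt / t_char))
        # Move the previous estimate towards the new surface observation.
        filtered.append(filtered[-1] + gain * (surface_sm[n] - filtered[-1]))
    return filtered


# Hypothetical daily surface values (m3/m3) with a step increase on day 1.
days = [0, 1, 2]
surface = [0.2, 0.4, 0.4]
filtered = exponential_filter(days, surface, t_char=10.0)
print(filtered)
```

The filtered series lags behind the step in the surface signal and approaches the new level gradually.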
You can use command-line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete dataset to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash
# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.ac.at/records/8dda4-xne96/files"
# Loop through years 1991 to 2023 and download & extract data
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD) and month (MM) of that year in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). The file name has the following convention:
ESA_CCI_PASSIVERZSM-YYYYMMDD000000-fv09.1.nc
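As a small illustration of this naming convention, the path of the file for a given day can be built as shown below. The helper name is hypothetical, and the sketch assumes that extracting the archives produces one subdirectory per year named YYYY directly below the download directory.

```python
from datetime import date
from pathlib import Path


def rzsm_file_path(root, day):
    """Path of the daily file for `day` below the extraction directory `root`."""
    name = f"ESA_CCI_PASSIVERZSM-{day:%Y%m%d}000000-fv09.1.nc"
    # Assumed layout: <root>/<YYYY>/<file name>
    return Path(root) / f"{day:%Y}" / name


path = rzsm_file_path("Downloads", date(2005, 7, 14))
print(path.as_posix())
```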
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:
Additional information for each variable is given in the netCDF attributes.
These data can be read by any software that supports netCDF files conforming to the Climate and Forecast (CF) metadata conventions, such as:
Please see the ESA CCI Soil Moisture science data records community for more records based on ESA CCI SM.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supporting data for the paper "Satellite derived SO2 emissions from the relatively low-intensity, effusive 2021 eruption of Fagradalsfjall, Iceland" by Esse et al. The data files are in netCDF4 format, created using the Python xarray library. Each is a separate xarray Dataset.
2021-05-02_18403_Fagradalsfjall_results.nc contains the analysis results for TROPOMI orbit 18403 shown in Figure 2.
Fagradalsfjall_2021_emission_intensity.nc contains the SO2 emission intensity data shown in Figures 3, 4 and 5.
cloud_effective_altitude_difference.nc contains the daily cloud effective altitude difference shown in Figure 6.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/
This dataset contains information on the Root Zone Soil Moisture (RZSM) content derived from satellite observations in the microwave domain.
The operational (ACTIVE, PASSIVE, COMBINED) ESA CCI SM products are available at https://catalogue.ceda.ac.uk/uuid/c256fcfeef24460ca6eb14bf0fe09572/ (Dorigo et al., 2017; Gruber et al., 2019; Preimesberger et al., 2021).
You can use command-line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete dataset to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash
# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.at/records/tqrwj-t7r58/files"
# Loop through years 1980 to 2024 and download & extract data
for year in {1980..2024}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1980-2024 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD) and month (MM) of that year in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). The file name follows the convention:
ESACCI-SOILMOISTURE-L3S-RZSMV-COMBINED-YYYYMMDD000000-fv09.2.nc
Each netCDF file contains 3 coordinate variables
and the following data variables
Additional information for each variable is given in the netCDF attributes.
Changes in v9.2:
These data can be read by any software that supports netCDF files conforming to the Climate and Forecast (CF) metadata conventions, such as:
This record and all related records are part of the ESA CCI Soil Moisture science data records community.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data repository contains the accompanying data for the study by Stradiotti et al. (2025). It was developed as part of the ESA Climate Change Initiative (CCI) Soil Moisture project. Project website: https://climate.esa.int/en/projects/soil-moisture/
This dataset was created as part of the following study, which contains a description of the algorithm and validation results.
Stradiotti, P., Gruber, A., Preimesberger, W., & Dorigo, W. (2025). Accounting for seasonal retrieval errors in the merging of multi-sensor satellite soil moisture products. Science of Remote Sensing, 12, 100242. https://doi.org/10.1016/j.srs.2025.100242
This repository contains the final, merged soil moisture and uncertainty values from Stradiotti et al. (2025), derived using a novel uncertainty quantification and merging scheme. In the accompanying study, we present a method to quantify the seasonal component of satellite soil moisture observations, based on Triple Collocation Analysis. Data from three independent satellite missions are used (from ASCAT, AMSR2, and SMAP). We observe consistent intra-annual variations in measurement uncertainties across all products (primarily caused by dynamics on the land surface such as seasonal vegetation changes), which affect the quality of the received signals. We then use these estimates to merge data from the three missions into a single consistent record, following the approach described by Dorigo et al. (2017). The new (seasonal) uncertainty estimates are propagated through the merging scheme, to enhance the uncertainty characterization of the final merged product provided here.
Evaluation against in situ data suggests that the estimated uncertainties of the new product are more representative of their true seasonal behaviour, compared to the previously used static approach. Based on these findings, we conclude that using a seasonal TCA approach can provide a more realistic characterization of dataset uncertainty, in particular its temporal variation. However, improvements in the merged soil moisture values are constrained, primarily due to correlated uncertainties among the sensors.
The dataset provides global daily gridded soil moisture estimates for the 2012-2023 period at 0.25° (~25 km) resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD) and month (MM) of that year in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). All file names follow the naming convention:
L3S-SSMS-MERGED-SOILMOISTURE-YYYYMMDD000000-fv0.1.nc
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:
After extracting the .nc files from the downloaded zip archive, they can be read by any software that supports netCDF files conforming to the Climate and Forecast (CF) metadata conventions, such as:
This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/
We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we used the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. Xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio extension for xarray. Rasterio is a Python library for reading and writing GeoTIFF and other raster formats. Xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save GeoTIFF as NetCDF. These procedures resulted in the creation of three HydroShare resources (HS 3, HS 4 and HS 5) for sharing state-scale LES datasets. Notably, due to licensing constraints with ArcGIS Pro, a commercial GIS software package, the Jupyter notebook development was undertaken on a Windows OS.