This dataset was created by TAG.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This repository contains data for three different experiments presented in the paper:
(1) moose_feet (40 files): The moose leg experiments are labeled ax_y.nc or fx_y.nc, where 'a' indicates attached digits and 'f' indicates free digits. The number 'x' is either 1 (front leg) or 2 (hind leg), and the number 'y' is an increment from 0 to 9 representing the 10 samples of each set.
(2) synthetic_feet (120 files): The synthetic feet experiments are labeled
as lw_a_y.nc, where 'lw' (Low Water content) can be replaced by 'mw'
(Medium Water content) or 'vw' (Vast Water content). The 'a' can be 'o'
(Original Go1 foot), 'r' (Rigid extended foot), 'f' (Free digits anisotropic
foot), or 'a' (Attached digits). Similar to (1), the last number is an increment from 0 to 9.
(3) Go1 (15 files): The locomotion experiments of the quadruped robot on the
track are labeled as condition_y.nc, where 'condition' is either 'hard_ground'
for experiments on hard ground, 'bioinspired_feet' for the locomotion of the
quadruped on mud using bio-inspired anisotropic feet, or 'original_feet' for
experiments where the robot used the original Go1 feet. The 'y' is an increment from 0 to 4.
The files for moose_feet and synthetic_feet contain timestamp (s), position (m), and force (N) data.
The files for Go1 contain timestamp (s), position (rad), velocity (rad/s), torque (Nm) data for all 12 motors, and the distance traveled by the robot (m).
All files can be read using xarray datasets (https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html).
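For example, a single sample can be inspected like this (a minimal sketch; the variable names inside each file are assumptions based on the description above):

import xarray as xr

# Front moose leg, attached digits, first sample
ds = xr.open_dataset("a1_0.nc")
print(ds)  # lists dimensions, coordinates, and data variables

# Assumed variable names: timestamp (s), position (m), force (N)
force = ds["force"]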
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.
Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection.
Each dataset uses a common set of coordinate conventions (Year, Origin ISO, Destination ISO and, for birth-disaggregated data, Birth ISO). The following data files are provided:

imm: Total immigration flows
emi: Total emigration flows
net: Net migration
imm_pop: Total immigrant population (non-native-born)
emi_pop: Total emigrant population (living abroad)
mig_prev: Total origin-destination flows (T summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO
mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth

Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate). Additionally, two CSV files are provided for convenience.
An ISO3 conversion table is also provided.
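For example, a minimal sketch of coordinate-based selection (the file name and the exact coordinate spellings are assumptions based on the conventions above):

import xarray as xr

ds = xr.open_dataset("mig_prev.nc")  # file name is an assumption

# Select the mean estimated flow for one year and country pair
flow = ds["mean"].sel({"Year": 2000, "Origin ISO": "MEX", "Destination ISO": "USA"})
print(float(flow))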
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Test datasets for use with xmitgcm. These data were generated by running MITgcm in different configurations. Each tar archive contains a folder of MDS *.data / *.meta files.
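A minimal sketch of loading one extracted archive with xmitgcm (the folder name is an assumption):

from xmitgcm import open_mdsdataset

# Point at the folder of *.data / *.meta files extracted from one tar archive
ds = open_mdsdataset("./global_oce_latlon")  # folder name is an assumption
print(ds)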
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Data and code for "Spin-filtered measurements of Andreev Bound States"
van Driel, David; Wang, Guanzhong; Dvir, Tom
This folder contains the raw data and code used to generate the plots for the paper Spin-filtered measurements of Andreev Bound States (arXiv: ??).
To run the Jupyter notebook, install Anaconda and execute:
conda env create -f environment.yml
followed by:
conda activate spinABS
Finally,
jupyter notebook
to launch the notebook called 'zenodo_notebook.ipynb'.
Raw data are stored in netCDF (.nc) format. The files are exported by the data acquisition package QCoDeS and can be read as an xarray Dataset.
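A single measurement file can also be inspected directly, for example (the file name is an assumption):

import xarray as xr

ds = xr.open_dataset("measurement.nc")  # file name is an assumption
print(ds.data_vars)  # measured parameters exported by QCoDeS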
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Test data for ASTE Release 1 integration with ECCOv4-py.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This folder contains the raw data and code used to generate the plots for the paper Singlet and triplet Cooper pair splitting in hybrid superconducting nanowires (arXiv: 2205.03458).
To run the Jupyter notebooks, install Anaconda and execute:
conda env create -f cps-exp.yml
followed by:
conda activate cps-exp
for the experiment data, or
conda env create -f cps-theory.yml
and similarly
conda activate cps-theory
for the theory plots. Finally,
jupyter notebook
to launch the corresponding notebook.
Raw data are stored in netCDF (.nc) format. The files are directly exported by the data acquisition package QCoDeS and can be read as an xarray Dataset.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This dataset includes the model output data that was used to create the figures of the study "The effects of a storm surge event on salt intrusion: Insights from the Rhine-Meuse Delta".
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This is a dataset of Sentinel-1 radiometric terrain corrected (RTC) imagery processed by the Alaska Satellite Facility, covering a region within the Central Himalaya. It accompanies a tutorial demonstrating how to access and work with Sentinel-1 RTC imagery using xarray and other open-source Python packages.
Renamed the "Unindexed dimensions" section in the Dataset and DataArray repr (added in v0.9.0) to "Dimensions without coordinates".
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Testing files for the xesmf remapping package.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
onemil1_1.nc is the train dataset.
onemil1_2.nc is the validation dataset.
onemil2.nc, p240.nc, and p390.nc are the test datasets.
These files are in .nc format; use xarray with Python to interface with them.
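A minimal sketch of loading the splits:

import xarray as xr

train_ds = xr.open_dataset("onemil1_1.nc")
val_ds = xr.open_dataset("onemil1_2.nc")
test_ds = xr.open_dataset("onemil2.nc")  # likewise for p240.nc and p390.nc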
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.
A description of this dataset, including the methodology and validation results, is available at:
Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: an independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data, 17, 4305–4329, https://doi.org/10.5194/essd-17-4305-2025, 2025.
ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but gaps can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they can obscure relevant events within a study area and are incompatible with (machine learning) software that often relies on gap-free inputs.
Since the requirement for a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such a gap-filling method is that it relies only on the original observational record, without the need for ancillary variables or model-based information. Owing to this intrinsic challenge, no global, long-term, univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments that fill satellite-like gaps introduced into GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gap-filling performance.
You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following command will download and extract the complete data set to the local directory ~/Downloads on Linux or macOS systems.
#!/bin/bash

# Set download directory
DOWNLOAD_DIR=~/Downloads
base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

# Loop through years 1991 to 2023: download, extract, clean up
for year in {1991..2023}; do
    echo "Downloading $year.zip..."
    wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
    unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
    rm "$DOWNLOAD_DIR/$year.zip"
done
The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY); each yearly subdirectory contains one netCDF image file per day (DD) and month (MM) on a 2-dimensional (longitude, latitude) grid (CRS: WGS84). The file names follow the convention:
ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude, and time stamp), as well as several data variables; additional information for each variable is given in the netCDF attributes.
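For example, a daily image can be opened and the nearest grid cell selected (a sketch; the coordinate and variable names are assumptions, so check the netCDF attributes for the actual names):

import xarray as xr

# File name follows the convention above
ds = xr.open_dataset("ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-20200101000000-fv09.1r1.nc")

# Select the grid cell nearest to 16.37 E, 48.21 N
point = ds.sel(lon=16.37, lat=48.21, method="nearest")
print(point["sm"])  # "sm" is an assumed variable name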
Changes in v09.1r1 (previous version: v09.1):
These data can be read by any software that supports the Climate and Forecast (CF) metadata conventions for netCDF files.
The following records are all part of the ESA CCI Soil Moisture science data record community:

ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
SHNITSEL
The Surface Hopping Nested Instances Training Set for Excited-State Learning (SHNITSEL) is a comprehensive data repository designed to support the development and benchmarking of excited-state dynamics methods.
Configuration Space
SHNITSEL contains datasets for nine organic molecules that represent a diverse range of photochemical behaviors. The following molecules are included in the dataset:
Alkenes: ethene (A01), propene (A02), 2-butene (A03)
Ring structures: fulvene (R01), 1,3-cyclohexadiene (R02), tyrosine (R03)
Other molecules: methylenimmonium cation (I01), methanethione (T01), diiodomethane (H01)
Property Space
These datasets provide key electronic properties for singlet and triplet states, including energies, forces, dipole moments, transition dipole moments, nonadiabatic couplings, and spin-orbit couplings, computed at the multi-reference ab initio level. The data is categorized into static and dynamic data, based on its origin and purpose.
Static data (147,169 data points in total) consists of sampled molecular structures without time-dependent information, covering the relevant vibrational and conformational spaces. These datasets are provided for eight molecules: A01, A02, A03, R01, R03, I01, T01, and H01.
Dynamic data (444,581 data points in total) originates from surface hopping simulations and captures the evolution of molecular structures and properties over time as they propagate on potential energy surfaces according to Newton's equations of motion. These datasets are provided for five molecules: A01, A02, A03, R02, and I01.
Data Structure and Workflow
The data is stored as xarray.Dataset objects for efficient handling of multidimensional data. Key dimensions include electronic states, couplings, atoms, and, for dynamic data, time frames. The datasets are scalable and stored in the NetCDF4 (HDF5-based) format for optimal performance. Tools for data processing, visualization, and integration into machine learning workflows are provided by the shnitsel Python package (shnitsel-tools), published on GitHub (https://github.com/SHNITSEL/shnitsel-tools).
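Since the files are plain NetCDF4, they can also be inspected without shnitsel-tools; a minimal sketch (the file name is an assumption):

import xarray as xr

ds = xr.open_dataset("A01_dynamic.nc")  # file name is an assumption
print(ds.dims)  # e.g. state, atom, and time-frame dimensions per the description above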
An overview of the molecular structures and visualizations of key properties (from trajectory data) are compiled on the SHNITSEL webpage (https://shnitsel.github.io/).
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
QLKNN11D training set
This dataset contains a large-scale run of ~1 billion flux calculations of the quasilinear gyrokinetic transport model QuaLiKiz. QuaLiKiz is applied in numerous tokamak integrated modelling suites, and is openly available at https://gitlab.com/qualikiz-group/QuaLiKiz/. This dataset was generated with the 'QLKNN11D-hyper' tag of QuaLiKiz, equivalent to 2.8.1 apart from the negative magnetic shear filter being disabled. See https://gitlab.com/qualikiz-group/QuaLiKiz/-/tags/QLKNN11D-hyper for the in-repository tag.
The dataset is appropriate for the training of learned surrogates of QuaLiKiz, e.g. with neural networks. See https://doi.org/10.1063/1.5134126 for a Physics of Plasmas publication illustrating the development of a learned surrogate (QLKNN10D-hyper) of an older version of QuaLiKiz (2.4.0) with a 300 million point 10D dataset. The paper is also available on arXiv https://arxiv.org/abs/1911.05617 and the older dataset on Zenodo https://doi.org/10.5281/zenodo.3497066. For an application example, see Van Mulders et al 2021 https://doi.org/10.1088/1741-4326/ac0d12, where QLKNN10D-hyper was applied for ITER hybrid scenario optimization. For any learned surrogates developed for QLKNN11D, the effective addition of the alphaMHD input dimension through rescaling the input magnetic shear (s) by s = s - alpha_MHD/2, as carried out in Van Mulders et al., is recommended.
Related repositories:
General QuaLiKiz documentation https://qualikiz.com
QuaLiKiz/QLKNN input/output variables naming scheme https://qualikiz.com/QuaLiKiz/Input-and-output-variables
Training, plotting, filtering, and auxiliary tools https://gitlab.com/Karel-van-de-Plassche/QLKNN-develop
QuaLiKiz related tools https://gitlab.com/qualikiz-group/QuaLiKiz-pythontools
FORTRAN QLKNN implementation with wrapper for Python and MATLAB https://gitlab.com/qualikiz-group/QLKNN-fortran
Weights and biases of 'hyperrectangle style' QLKNN https://gitlab.com/qualikiz-group/qlknn-hype
Data exploration
The data is provided in 43 netCDF files. We advise opening single datasets using xarray, or multiple datasets out-of-core using dask (a short sketch follows the timing table below). For reference, the table below gives the load times and in-memory sizes of a single variable that depends only on the scan-size dimension dimx. This was tested single-core on an Intel Xeon 8160 CPU at 2.1 GHz with 192 GB of DDR4 RAM. Note that during loading, more memory is needed than the final number.
Timing of dataset loading:

Number of datasets | Final in-RAM memory (GiB) | Loading time, single variable (M:SS)
1  | 10.3          | 0:09
5  | 43.9          | 1:00
10 | 63.2          | 2:01
16 | 98.0          | 3:25
17 | Out of memory | x:xx
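A sketch of out-of-core loading with dask (the file pattern, combine strategy, and variable name are assumptions):

import xarray as xr

# Lazily open several of the 43 netCDF files at once; chunking along the
# scan dimension dimx keeps memory bounded.
ds = xr.open_mfdataset("qlknn11d_*.nc", combine="by_coords", chunks={"dimx": 1_000_000})
mean_flux = ds["efe_GB"].mean().compute()  # variable name is an assumption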
Full dataset
The full dataset of QuaLiKiz input and output data is available on request. Note that this is 2.2 TiB of netCDF files!
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This xarray dataset ('mvomstein_ds_vibrotactile_spatial_acuity.nc') contains high-resolution vibrotactile spatial acuity data collected from 33 participants across five large-area body sites (forearm, abdomen, lower back near-spine area (NSA), lower back peripheral-spine area (PSA), and thigh). The data were obtained using a fully automated experimental setup that employs Bayesian adaptive parameter estimation to generate continuous psychometric functions. The dataset supports the findings of the study "Data-driven design guide for vibrotactile display layouts by continuous mapping", which is currently under review at Scientific Reports (Nature Research).
Additionally, the supplementary files ('mvomstein_SI_[...].xlsx') provide statistical analyses for the figures presented in the manuscript, including detailed results for comparisons across body sites, anisotropy effects, and regional sensitivity gradients.
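The dataset can be opened directly with xarray:

import xarray as xr

ds = xr.open_dataset("mvomstein_ds_vibrotactile_spatial_acuity.nc")
print(ds)  # inspect dimensions, coordinates, and data variables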
Simulation Data

The waveplate.hdf5 file stores the results of the FDTD simulation that are visualized in Fig. 3 b)-d). The simulation was performed using the Tidy3D Python library, which also provides the methods used for data visualization. The following snippet can be used to visualize the data:

import tidy3d as td
import matplotlib.pyplot as plt

sim_data: td.SimulationData = td.SimulationData.from_file("waveplate.hdf5")
fig, axs = plt.subplots(1, 2, tight_layout=True, figsize=(12, 5))
for fn, ax in zip(("Ex", "Ey"), axs):
    sim_data.plot_field("field_xz", field_name=fn, val="abs^2", ax=ax).set_aspect(1 / 10)
    ax.set_xlabel(r"x [$\mu$m]")
    ax.set_ylabel(r"z [$\mu$m]")
fig.show()

Measurement Data

Signal data used for plotting Figs. 4-6. The data is stored in NetCDF, a self-describing data format that is easy to manipulate using the Xarray Python library, specifically by calling xarray.open_dataset(). Three datasets are provided and structured as follows:

The electric_fields.nc dataset contains the data displayed in Fig. 4. It has 3 data variables, corresponding to the signals themselves as well as the estimated Rabi frequencies and electric fields. The freq dimension is the x-axis and contains coordinates for the probe field detuning in MHz. The n dimension labels different configurations of the applied electric field, with the 0th one having no EHF field.

The detune.nc dataset contains the data displayed in Fig. 6. It has 2 data variables, corresponding to the signals themselves as well as the estimated peak separations, multiplied by the coupling factor. The freq dimension is the same, while the detune dimension labels different EHF field detunings, from -100 to 100 MHz in steps of 10.

The waveplates.nc dataset contains the data displayed in Fig. 5. It contains estimated Rabi frequencies calculated for different waveplate positions. The angles are stored in radians. There is a quarter- and a half-waveplate to choose from.

Usage examples

Opening the datasets:

import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

electric_fields_ds = xr.open_dataset("data/electric_fields.nc")
detuned_ds = xr.open_dataset("data/detune.nc")
waveplates_ds = xr.open_dataset("data/waveplates.nc")
sigmas_da = xr.open_dataarray("data/sigmas.nc")
peak_heights_da = xr.open_dataarray("data/peak_heights.nc")

Plotting the Fig. 4 signals and printing parameters:

fig, ax = plt.subplots()
electric_fields_ds["signals"].plot.line(x="freq", hue="n", ax=ax)
print(f"Rabi frequencies [Hz]: {electric_fields_ds['rabi_freqs'].values}")
print(f"Electric fields [V/m]: {electric_fields_ds['electric_fields'].values}")
fig.show()

Plotting the Fig. 5 data:

(waveplates_ds["rabi_freqs"] ** 2).plot.scatter(x="angle", col="waveplate")

Plotting the Fig. 6 signals for chosen detunings:

fig, ax = plt.subplots()
detuned_ds["signals"].sel(detune=[-100, -70, -40, 40, 70, 100]).plot.line(x="freq", hue="detune", ax=ax)
fig.show()

Plotting the Fig. 6 inset plot:

fig, ax = plt.subplots()
detuned_ds["separations"].plot.scatter(x="detune", ax=ax)
ax.plot(
    detuned_ds.detune,
    np.sqrt(detuned_ds.detune**2 + detuned_ds["separations"].sel(detune=0) ** 2),
)
fig.show()

Plotting the Fig. 7 calculated peak widths:

sigmas_da.plot.scatter()

Plotting the Fig. 8 calculated heights of the smaller detuned peaks:

peak_heights_da.plot.scatter()
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This data accompanies the paper entitled "Stabilizing or Destabilizing: Simulations of Chymotrypsin Inhibitor 2 under Crowding Reveal Existence of a Crossover Temperature" (https://dx.doi.org/10.1021/acs.jpclett.0c03626).

CI2_REST2.zip: The zip archive includes REST2 trajectories for the three systems investigated in the paper: dilute conditions, crowding by BSA, and crowding by lysozyme. The trajectories are saved in the GROMACS XTC file format, separately for each temperature (i = 0, ..., 23). Given the large trajectory sizes, only protein coordinates (CI2 + crowder(s)) are reported, and the output frequency is reduced to 100 ps. A starting geometry (in the Gromos87 GRO format) after a short relaxation is provided for each REST2 simulation (conf_prot.gro). Moreover, for each REST2 simulation, an xarray (http://xarray.pydata.org) dataset, saved in the netCDF file format, is included with the following observables computed for CI2: fraction of native contacts relative to the crystal structure, radius of gyration, secondary-structure content, and the fraction of native contacts evaluated separately for the alpha helix and the two beta strands.
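The per-simulation observables can be loaded with xarray (the file name is an assumption):

import xarray as xr

obs = xr.open_dataset("observables.nc")  # file name is an assumption
print(obs.data_vars)  # native contacts, radius of gyration, secondary structure, ...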
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Supporting data for the paper "Satellite derived SO2 emissions from the relatively low-intensity, effusive 2021 eruption of Fagradalsfjall, Iceland" by Esse et al. The data files are in netCDF4 format, created using the Python xarray library. Each is a separate xarray Dataset.
2021-05-02_18403_Fagradalsfjall_results.nc contains the analysis results for TROPOMI orbit 18403 shown in Figure 2.
Fagradalsfjall_2021_emission_intensity.nc contains the SO2 emission intensity data shown in Figures 3, 4 and 5.
cloud_effective_altitude_difference.nc contains the daily cloud effective altitude difference shown in Figure 6.
These datasets are from tidal resource characterization measurements collected on the Terrasond High Energy Oceanographic Mooring (THEOM) from 1 July 2021 to 30 August 2021 (60 days) in Cook Inlet, Alaska. The lander was deployed at 60.7207031 N, 151.4294998 W in ~50 m of water. The dataset contains raw and processed data from the following two instruments:

A Nortek Signature 500 kHz acoustic Doppler current profiler (ADCP). Data were recorded at 4 Hz in the beam coordinate system from all 5 beams. Processed data have been averaged into 5-minute bins and converted to the East-North-Up (ENU) coordinate system.

A Nortek Vector acoustic Doppler velocimeter (ADV). Data were recorded at 8 Hz in the beam coordinate system. Processed data have been averaged into 5-minute bins and converted to the Streamwise - Cross-stream - Vertical (Principal) coordinate system. Turbulence statistics were calculated from 5-minute bins, with an FFT length equal to the bin length, and saved in the processed dataset.

Data were read and analyzed using the DOLfYN (version 1.0.2) Python package and saved in MATLAB (.mat) and netCDF (.nc) file formats. Files containing analyzed data (".b1") were standardized using the TSDAT (version 0.4.2) Python package. NetCDF files can be opened using DOLfYN (e.g., dat = dolfyn.load("*.nc")) or the xarray Python package (e.g., dat = xarray.open_dataset("*.nc")). All distances are in meters (e.g., depth, range), and all velocities in m/s. See the DOLfYN documentation linked in the submission, and/or the Nortek documentation, for additional details.
This dataset was created by TAG.