34 datasets found
  1. (HS 2) Automate Workflows using Jupyter notebook to create Large Extent...

    • search.dataone.org
    • hydroshare.org
    Updated Oct 19, 2024
    + more versions
    Cite
    Young-Don Choi (2024). (HS 2) Automate Workflows using Jupyter notebook to create Large Extent Spatial Datasets [Dataset]. http://doi.org/10.4211/hs.a52df87347ef47c388d9633925cde9ad
    Dataset updated
    Oct 19, 2024
    Dataset provided by
    Hydroshare
    Authors
    Young-Don Choi
    Description

    We implemented automated workflows using Jupyter notebooks for each state. The GIS processing, crucial for merging, extracting, and projecting GeoTIFF data, was performed using ArcPy, a Python package for geographic data analysis, conversion, and management within ArcGIS (Toms, 2015). After generating state-scale LES (large extent spatial) datasets in GeoTIFF format, we used the xarray and rioxarray Python packages to convert GeoTIFF to NetCDF. xarray is a Python package for working with multi-dimensional arrays, and rioxarray is the rasterio extension for xarray. Rasterio is a Python library for reading and writing GeoTIFF and other raster formats. xarray facilitated data manipulation and metadata addition in the NetCDF file, while rioxarray was used to save GeoTIFF as NetCDF. These procedures resulted in three HydroShare resources (HS 3, HS 4 and HS 5) for sharing state-scale LES datasets. Because ArcGIS Pro is commercial GIS software with licensing constraints, the Jupyter notebooks were developed on Windows.

  2. xmitgcm test datasets

    • figshare.com
    application/gzip
    Updated Jun 1, 2023
    Cite
    Ryan Abernathey (2023). xmitgcm test datasets [Dataset]. http://doi.org/10.6084/m9.figshare.4033530.v1
    Available download formats: application/gzip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Ryan Abernathey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Test datasets for use with xmitgcm. These data were generated by running MITgcm in different configurations. Each tar archive contains a folder full of mds *.data / *.meta files.

  3. Data from: Deep learning four decades of human migration: datasets

    • zenodo.org
    csv, nc
    Updated Oct 13, 2025
    Cite
    Thomas Gaskin; Guy Abel (2025). Deep learning four decades of human migration: datasets [Dataset]. http://doi.org/10.5281/zenodo.17344747
    Available download formats: csv, nc
    Dataset updated
    Oct 13, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Thomas Gaskin; Guy Abel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.

    Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection.

    Each dataset uses the following coordinate conventions:

    • Year: 1990–2023
    • Birth ISO: Country of birth (UN ISO3)
    • Origin ISO: Country of origin (UN ISO3)
    • Destination ISO: Destination country (UN ISO3)
    • Country ISO: Used for net migration data (UN ISO3)

    The following data files are provided:

    • T.nc: Full table of flows disaggregated by country of birth. Dimensions: Year, Birth ISO, Origin ISO, Destination ISO
    • flows.nc: Total origin-destination flows (equivalent to T summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO
    • net_migration.nc: Net migration data by country. Dimensions: Year, Country ISO
    • stocks.nc: Stock estimates for each country pair. Dimensions: Year, Origin ISO (corresponding to Birth ISO), Destination ISO
    • test_flows.nc: Flow estimates on a randomly selected set of test edges, used for model validation

    Additionally, two CSV files are provided for convenience:

    • mig_unilateral.csv: Unilateral migration estimates per country, comprising:
      • imm: Total immigration flows
      • emi: Total emigration flows
      • net: Net migration
      • imm_pop: Total immigrant population (non-native-born)
      • emi_pop: Total emigrant population (living abroad)
    • mig_bilateral.csv: Bilateral flow data, comprising:
      • mig_prev: Total origin-destination flows
      • mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth

    Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate).

    An ISO3 conversion table is also provided.
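
    As a rough sketch of the coordinate-based selection mentioned above, the following builds a tiny in-memory stand-in for T.nc and sums it over country of birth to obtain origin-destination totals. The dimension names follow the list above, but their exact spelling inside the files, the two ISO codes, and the random values are assumptions made for illustration only. With the real files one would presumably start from xarray.open_dataset("T.nc") instead.

```python
import numpy as np
import xarray as xr

# Hypothetical miniature of T.nc: 2 years x 2 birth x 2 origin x 2 destination.
# Dimension names follow the description; the real files may spell them differently.
years = [1990, 1991]
iso = ["DEU", "FRA"]
rng = np.random.default_rng(0)

T = xr.Dataset(
    {
        "mean": (
            ("Year", "Birth ISO", "Origin ISO", "Destination ISO"),
            rng.uniform(0, 100, size=(2, 2, 2, 2)),
        ),
    },
    coords={
        "Year": years,
        "Birth ISO": iso,
        "Origin ISO": iso,
        "Destination ISO": iso,
    },
)

# Summing the mean estimate over country of birth gives total
# origin-destination flows (the quantity stored in flows.nc).
flows = T["mean"].sum("Birth ISO")

# Coordinate-based selection: flow from DEU to FRA in 1991.
# The dict form of .sel() is needed because the dimension names contain spaces.
deu_to_fra_1991 = flows.sel(
    {"Year": 1991, "Origin ISO": "DEU", "Destination ISO": "FRA"}
)
print(float(deu_to_fra_1991))
```

    Only the mean variable is summed here; combining the std variable across birth countries would need an assumption about how the estimates are correlated, which the description does not state.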

  4. ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture...

    • researchdata.tuwien.ac.at
    • researchdata.tuwien.at
    zip
    Updated Sep 5, 2025
    + more versions
    Cite
    Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo (2025). ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture from merged multi-satellite observations [Dataset]. http://doi.org/10.48436/3fcxr-cde10
    Available download formats: zip
    Dataset updated
    Sep 5, 2025
    Dataset provided by
    TU Wien
    Authors
    Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/

    This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.

    Dataset Paper (Open Access)

    A description of this dataset, including the methodology and validation results, is available at:

    Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: an independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data, 17, 4305–4329, https://doi.org/10.5194/essd-17-4305-2025, 2025.

    Abstract

    ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
    However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
    Since the requirement for a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such gap-filling methods is that they rely only on the original observational record, without the need for ancillary variables or model-based information. Due to the intrinsic challenge, no global, long-term univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments to fill satellite-like gaps introduced to GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gap-filling performance.

    Summary

    • Gap-filled global estimates of volumetric surface soil moisture from 1991-2023 at 0.25° sampling
    • Fields of application (partial): climate variability and change, land-atmosphere interactions, global biogeochemical cycles and ecology, hydrological and land surface modelling, drought applications, and meteorology
    • Method: Modified version of DCT-PLS (Garcia, 2010) interpolation/smoothing algorithm, linear interpolation over periods of frozen soils. Uncertainty estimates are provided for all data points.
    • More information: See Preimesberger et al. (2025) and the ESA CCI SM Algorithm Theoretical Baseline Document [Chapter 7.2.9] (Dorigo et al., 2023), https://doi.org/10.5281/zenodo.8320869

    Programmatic Download

    You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following script downloads and extracts the complete data set to the local directory ~/Downloads on Linux or macOS systems.

    #!/bin/bash

    # Set download directory (created if it does not exist yet)
    DOWNLOAD_DIR=~/Downloads
    mkdir -p "$DOWNLOAD_DIR"

    base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

    # Loop through years 1991 to 2023: download, extract, then delete each archive
    for year in {1991..2023}; do
        echo "Downloading $year.zip..."
        wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
        unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
        rm "$DOWNLOAD_DIR/$year.zip"
    done

    Data details

    The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD), month (MM) in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). The file name has the following convention:

    ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc
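
    The naming convention above can be turned into a small helper for locating or parsing daily files. This is only a sketch: the function names are hypothetical, and it assumes the extracted archives produce one subdirectory per year (YYYY) as described, which may differ from the actual layout after unzipping.

```python
from datetime import date, datetime
from pathlib import PurePosixPath

# Fixed parts of the file name, taken from the convention stated above.
PREFIX = "ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-"
SUFFIX = "000000-fv09.1r1.nc"


def daily_file(day: date, root: str = ".") -> PurePosixPath:
    """Return the expected path <root>/YYYY/<file name> for one daily image."""
    name = f"{PREFIX}{day:%Y%m%d}{SUFFIX}"
    return PurePosixPath(root) / f"{day:%Y}" / name


def file_date(name: str) -> date:
    """Recover the date from a file name that follows the convention."""
    if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
        raise ValueError(f"unexpected file name: {name}")
    stamp = name[len(PREFIX):len(name) - len(SUFFIX)]
    return datetime.strptime(stamp, "%Y%m%d").date()


path = daily_file(date(2003, 7, 15), root="Downloads")
print(path)
print(file_date(path.name))
```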

    Data Variables

    Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:

    • sm: (float) The Soil Moisture variable reflects estimates of daily average volumetric soil moisture content (m3/m3) in the soil surface layer (~0-5 cm) over a whole grid cell (0.25 degree).
    • sm_uncertainty: (float) The Soil Moisture Uncertainty variable reflects the uncertainty (random error) of the original satellite observations and of the predictions used to fill observation data gaps.
    • sm_anomaly: Soil moisture anomalies (reference period 1991-2020) derived from the gap-filled values (`sm`)
    • sm_smoothed: Contains DCT-PLS predictions used to fill data gaps in the original soil moisture field. These values are also provided for cases where an observation was initially available (compare `gapmask`). In this case, they provide a smoothed version of the original data.
    • gapmask: (0 | 1) Indicates grid cells where a satellite observation is available (1), and where the interpolated (smoothed) values are used instead (0) in the 'sm' field.
    • frozenmask: (0 | 1) Indicates grid cells where ERA5 soil temperature is <0 °C. In this case, a linear interpolation over time is applied.

    Additional information for each variable is given in the netCDF attributes.

    Version Changelog

    Changes in v09.1r1 (previous version was v09.1):

    • This version uses a novel uncertainty estimation scheme as described in Preimesberger et al. (2025).

    Software to open netCDF files

    These data can be read by any software that supports the Climate and Forecast (CF) metadata conventions for netCDF files.

    References

    • Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: an independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data, 17, 4305–4329, https://doi.org/10.5194/essd-17-4305-2025, 2025.
    • Dorigo, W., Preimesberger, W., Stradiotti, P., Kidd, R., van der Schalie, R., van der Vliet, M., Rodriguez-Fernandez, N., Madelon, R., & Baghdadi, N. (2023). ESA Climate Change Initiative Plus - Soil Moisture Algorithm Theoretical Baseline Document (ATBD) Supporting Product Version 08.1 (version 1.1). Zenodo. https://doi.org/10.5281/zenodo.8320869
    • Garcia, D., 2010. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 54(4), pp.1167-1178. Available at: https://doi.org/10.1016/j.csda.2009.09.020
    • Rodell, M., Houser, P. R., Jambor, U., Gottschalck, J., Mitchell, K., Meng, C.-J., Arsenault, K., Cosgrove, B., Radakovich, J., Bosilovich, M., Entin, J. K., Walker, J. P., Lohmann, D., and Toll, D.: The Global Land Data Assimilation System, Bulletin of the American Meteorological Society, 85, 381 – 394, https://doi.org/10.1175/BAMS-85-3-381, 2004.

    Related Records

    The following records are all part of the ESA CCI Soil Moisture science data records community

    1. ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77

  5. RibonanzaNet-Drop Train, Val, and Test Data

    • kaggle.com
    zip
    Updated Feb 19, 2024
    Cite
    Hamish Blair (2024). RibonanzaNet-Drop Train, Val, and Test Data [Dataset]. https://www.kaggle.com/datasets/hmblair/ribonanzanet-drop-train-val-and-test-data
    Available download formats: zip (402567233 bytes)
    Dataset updated
    Feb 19, 2024
    Authors
    Hamish Blair
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    onemil1_1.nc is the train dataset. onemil1_2.nc is the validation dataset. onemil2.nc, p240.nc, and p390.nc are the test datasets.

    These files are in .nc format; use xarray with Python to interface with them.

  6. Storage and Transit Time Data and Code

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 12, 2024
    + more versions
    Cite
    Andrew Felton (2024). Storage and Transit Time Data and Code [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8136816
    Dataset updated
    Jun 12, 2024
    Dataset provided by
    Montana State University
    Authors
    Andrew Felton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Author: Andrew J. Felton
    Date: 5/5/2024

    This R project contains the primary code and data (following pre-processing in python) used for data production, manipulation, visualization, and analysis and figure production for the study entitled:

    "Global estimates of the storage and transit time of water through vegetation"

    Please note that 'turnover' and 'transit' are used interchangeably in this project.

    Data information:

    The data folder contains key data sets used for analysis. In particular:

    "data/turnover_from_python/updated/annual/multi_year_average/average_annual_turnover.nc" contains a global array summarizing five-year (2016-2020) averages of annual transit, storage, canopy transpiration, and number of months of data. This is the core dataset for the analysis; however, each folder has much more data, including a dataset for each year of the analysis. Data are also available in separate .csv files for each land cover type. Other data for the minimum, monthly, and seasonal transit time can be found in their respective folders. These data were produced using the Python code found in the "supporting_code" folder, given the ease of working with .nc files and the EASE grid in the xarray Python module. R was used primarily for data visualization purposes. The remaining files in the "data" and "data/supporting_data" folders primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here.

    Code information

    Python scripts can be found in the "supporting_code" folder.

    Each R script in this project has a particular function:

    01_start.R: This script loads the R packages used in the analysis, sets the directory, and imports custom functions for the project. You can also load in the main transit time (turnover) datasets here using the source() function.

    02_functions.R: This script contains the custom functions for this analysis, primarily to work with importing the seasonal transit data. Load this using the source() function in the 01_start.R script.

    03_generate_data.R: This script is not necessary to run and is primarily for documentation. The main role of this code was to import and wrangle the data needed to calculate ground-based estimates of aboveground water storage.

    04_annual_turnover_storage_import.R: This script imports the annual turnover and storage data for each landcover type. You load in these data from the 01_start.R script using the source() function.

    05_minimum_turnover_storage_import.R: This script imports the minimum turnover and storage data for each landcover type. Minimum is defined as the lowest monthly estimate. You load in these data from the 01_start.R script using the source() function.

    06_figures_tables.R: This is the main workhorse for figure/table production and supporting analyses. This script generates the key figures and summary statistics used in the study, which are then saved in the manuscript_figures folder. Note that all maps were produced using Python code found in the "supporting_code" folder.

  7. Data from: Tidal Energy Resource Characterization, Bottom Lander...

    • catalog.data.gov
    Updated Jan 20, 2025
    + more versions
    Cite
    National Renewable Energy Laboratory (2025). Tidal Energy Resource Characterization, Bottom Lander Measurements, Cook Inlet, AK, 2021 [Dataset]. https://catalog.data.gov/dataset/tidal-energy-resource-characterization-bottom-lander-measurements-cook-inlet-ak-2021-7c225
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    National Renewable Energy Laboratory
    Area covered
    Cook Inlet
    Description

    These datasets are from tidal resource characterization measurements collected on the Terrasond High Energy Oceanographic Mooring (THEOM) from 1 July 2021 to 30 August 2021 (60 days) in Cook Inlet, Alaska. The lander was deployed at 60.7207031 N, 151.4294998 W in ~50 m of water. The dataset contains raw and processed data from the following two instruments:

    • A Nortek Signature 500 kHz acoustic Doppler current profiler (ADCP). Data were recorded at 4 Hz in the beam coordinate system from all 5 beams. Processed data have been averaged into 5-minute bins and converted to the East-North-Up (ENU) coordinate system.
    • A Nortek Vector acoustic Doppler velocimeter (ADV). Data were recorded at 8 Hz in the beam coordinate system. Processed data have been averaged into 5-minute bins and converted to the Streamwise - Cross-stream - Vertical (Principal) coordinate system.

    Turbulence statistics were calculated from 5-minute bins, with an FFT length equal to the bin length, and saved in the processed dataset. Data were read and analyzed using the DOLfYN (version 1.0.2) Python package and saved in MATLAB (.mat) and netCDF (.nc) file formats. Files containing analyzed data (".b1") were standardized using the TSDAT (version 0.4.2) Python package. NetCDF files can be opened using DOLfYN (e.g., `dat = dolfyn.load("*.nc")`) or the xarray Python package (e.g., `dat = xarray.open_dataset("*.nc")`). All distances are in meters (e.g., depth, range, etc.), and all velocities are in m/s. See the DOLfYN documentation linked in the submission, and/or the Nortek documentation for additional details.
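
    The 5-minute averaging described above amounts to grouping fixed-size runs of samples: at 4 Hz one bin holds 4 × 300 = 1200 samples, and at 8 Hz it holds 2400. The sketch below illustrates that arithmetic in plain Python on a synthetic record. It is not the DOLfYN implementation; the function names are hypothetical, and dropping a trailing partial bin is an assumption.

```python
from statistics import fmean


def samples_per_bin(sample_rate_hz: float, bin_minutes: float = 5.0) -> int:
    """Number of samples that fall into one averaging bin."""
    return int(round(sample_rate_hz * bin_minutes * 60))


def bin_average(samples, sample_rate_hz, bin_minutes=5.0):
    """Average consecutive, non-overlapping bins; a trailing partial bin is dropped."""
    n = samples_per_bin(sample_rate_hz, bin_minutes)
    full_bins = len(samples) // n
    return [fmean(samples[i * n:(i + 1) * n]) for i in range(full_bins)]


# Synthetic 4 Hz record: 10 minutes at 1.0 m/s followed by 2.5 minutes at 3.0 m/s.
record = [1.0] * 2400 + [3.0] * 600
print(samples_per_bin(4), samples_per_bin(8))
print(bin_average(record, sample_rate_hz=4))
```

    The last 600 samples cover only half a bin, so they do not contribute to the output in this sketch.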

  8. Data and code used in "Distinct Lithologies in Jezero Crater, Mars"

    • zenodo.org
    zip
    Updated Apr 19, 2021
    Cite
    Allison Zastrow; Timothy Glotch (2021). Data and code used in "Distinct Lithologies in Jezero Crater, Mars" [Dataset]. http://doi.org/10.5281/zenodo.4408577
    Available download formats: zip
    Dataset updated
    Apr 19, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Allison Zastrow; Timothy Glotch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the datasets and code used in the Zastrow and Glotch manuscript entitled "Distinct Lithologies in Jezero Crater, Mars".

    Abstract

    Jezero crater is the landing site for the Mars 2020 Perseverance rover. The Noachian-aged crater has undergone several periods of fluvial and lacustrine activity, and many phyllosilicate- and carbonate-bearing rocks formed as a result. It also contains a large portion of the regional Nili Fossae olivine-carbonate unit. In this work, we performed spectral mixture analysis of visible/near-infrared hyperspectral imagery over Jezero. We modeled carbonate abundances up to ~35% and identified three distinct units containing different carbonate phases. Our work also suggests that the olivine in the regional unit is largely restricted to aeolian deposits overlying the carbonate-bearing rocks. The diversity of carbonate phases in Jezero points to multiple periods of carbonate formation under varying conditions.

    Description of Repository Datasets

    code_unmixing.zip:

    • This folder contains the necessary files to unmix CRISM data using davinci's sma() function. The model is run by launching davinci and entering source('ssa_unmixing_040ff'). To switch the CRISM image, edit line 10 in ssa_unmixing_040ff.
    • Unmixing code: ssa_unmixing_040ff
    • Davinci functions: unmixing_fxns.dvrc
    • Spectral library: n_k_jezeropaper.hdf

    data_input.zip:

    • This folder contains ENVI image files that have been DISORT-processed and are ready to be modeled using the unmixing code. Each of the 3 CRISM images has two data files and ENVI headers: 1) the full uncut image cube and 2) the cut image cube.

    data_models.zip:

    • This folder contains NetCDF files with the output of the unmixing model, registered to the HiRISE base image. Files contain the input CRISM measured spectra, output modeled spectra, normalized mineral concentration maps, and RMS error maps. Data were written to file using the xarray Python package and can be read into Python using the xarray.open_dataset() function.
    • No Siderite, Added Kaolinite, Full Spectrum
      • These folders contain unregistered (but projected) model output for these three additional models.

    data_roi.zip:

    • This folder contains the shapefile for the ROI mapping.

  9. Data for hybrid quantum memory leveraging slow-light and gradient-echo...

    • danebadawcze.uw.edu.pl
    nc
    Updated Sep 4, 2025
    Cite
    Kurzyna, Stanisław (2025). Data for hybrid quantum memory leveraging slow-light and gradient-echo duality experiment [Dataset]. http://doi.org/10.58132/VJVN4P
    Available download formats: nc (4824944), nc (1116), nc (194056), nc (1996), nc (173328)
    Dataset updated
    Sep 4, 2025
    Dataset provided by
    Dane Badawcze UW
    Authors
    Kurzyna, Stanisław
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    Ministerstwo Nauki i Szkolnictwa Wyższego
    National Science Centre (Poland)
    Foundation for Polish Science
    Description

    For this experiment we measured all of the data using heterodyne detection.

    • File "fig2" contains the data for the experiment performed by storing the light in the GEM and then reading it out with EIT, and Gaussian functions fitted to the collected time traces.
    • Files "fig3_a" and "fig3_c_sim" contain the parameters of the Gaussian functions fitted to the experimental time traces and simulated time traces, respectively.
    • File "fig4" contains Fourier transforms of the experimentally measured time traces for impulses stored in EIT and read out in GEM.
    • File "fig5" contains time traces for an impulse with two frequencies stored in GEM and read out in EIT with fitted Gaussian functions, and the Fourier transform of time traces for two impulses stored in EIT and read out in GEM with fitted Gaussian functions.

    Files were generated with the Xarray (v2025.3.1) Python library. Files are in the HDF5 format. Files can be loaded using the Xarray function "xarray.load_dataset". We analyzed the data using the Python programming language. Data for the theoretical model were created using Python.

    The “Quantum Optical Technologies” (FENG.02.01-IP.05-0017/23) project is carried out within the Measure 2.1 International Research Agendas programme of the Foundation for Polish Science, co-financed by the European Union under the European Funds for Smart Economy 2021--2027 (FENG). This research was funded in whole or in part by the National Science Centre, Poland, grant no. 2024/53/B/ST2/04040. Publication co-financed from the state budget funds (Poland), awarded by the Minister of Science under the “Perły Nauki II” program, project No. PN/02/0027/2023, co-financing amount PLN 239,998.00, total project value PLN 239,998.00.

  10. Data from: Community Earth System Model v2 Large Ensemble (CESM2 LENS) Zarr...

    • gdex.ucar.edu
    • ckanprod.data-commons.k8s.ucar.edu
    • +1 more
    Updated Nov 11, 2024
    + more versions
    Cite
    Gokhan Danabasoglu; Clara Deser; Keith Rodgers; Axel Timmermann (2024). Community Earth System Model v2 Large Ensemble (CESM2 LENS) Zarr Subset [Dataset]. https://gdex.ucar.edu/datasets/d010092/
    Dataset updated
    Nov 11, 2024
    Dataset provided by
    National Science Foundation (http://www.nsf.gov/)
    Authors
    Gokhan Danabasoglu; Clara Deser; Keith Rodgers; Axel Timmermann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1850 - Dec 31, 2014
    Description

    The US National Center for Atmospheric Research partnered with the IBS Center for Climate Physics in South Korea to generate the CESM2 Large Ensemble, which consists of 100 ensemble members at 1 degree spatial resolution covering the period 1850-2100 under CMIP6 historical and SSP370 future radiative forcing scenarios. Data sets from this ensemble were made downloadable via the Climate Data Gateway on June 14, 2021. NCAR has copied a subset (currently ~500 TB) of CESM2 LENS data to Amazon S3 as part of the AWS Public Datasets Program. To optimize for large-scale analytics, we have represented the data as ~275 Zarr stores, a format accessible through the Python xarray library. Each Zarr store contains a single physical variable for a given model run type and temporal frequency (monthly, daily).

  11. Selkie GIS Techno-Economic Tool input datasets

    • data.niaid.nih.gov
    Updated Nov 8, 2023
    Cite
    Cullinane, Margaret (2023). Selkie GIS Techno-Economic Tool input datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10083960
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    University College Cork
    Authors
    Cullinane, Margaret
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data was prepared as input for the Selkie GIS-TE tool. This GIS tool aids site selection, logistics optimization and financial analysis of wave or tidal farms in the Irish and Welsh maritime areas. Read more here: https://www.selkie-project.eu/selkie-tools-gis-technoeconomic-model/

    This research was funded by the Science Foundation Ireland (SFI) through MaREI, the SFI Research Centre for Energy, Climate and the Marine and by the Sustainable Energy Authority of Ireland (SEAI). Support was also received from the European Union's European Regional Development Fund through the Ireland Wales Cooperation Programme as part of the Selkie project.

    File Formats

    Results are presented in three file formats:

    • tif: Can be imported into GIS software (such as ArcGIS)
    • csv: Human-readable text format, which can also be opened in Excel
    • png: Image files that can be viewed in standard desktop software and give a spatial view of results

    Input Data

    All calculations use open-source data from the Copernicus store and the open-source software Python. The Python xarray library is used to read the data.

    Hourly Data from 2000 to 2019

    • Wind - Copernicus ERA5 dataset 17 by 27.5 km grid
      10m wind speed

    • Wave - Copernicus Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis dataset 3 by 5 km grid

    Accessibility

    The maximum limits for Hs and wind speed are applied when mapping the accessibility of a site.
    The Accessibility layer shows the percentage of time the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5) are below these limits for the month.

    Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if
    the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month.

    Environmental data is from the Copernicus data store (https://cds.climate.copernicus.eu/). Wave hourly data is from the 'Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis' dataset.
    Wind hourly data is from the ERA 5 dataset.

    Availability

    A device's availability to produce electricity depends on the device's reliability and the time to repair any failures. The repair time depends on weather
    windows and other logistical factors (for example, the availability of repair vessels and personnel). A 2013 study by O'Connor et al. determined the
    relationship between the accessibility and availability of a wave energy device. The resulting graph (see Fig. 1 of their paper) shows the correlation between accessibility at Hs of 2m and wind speed of 15.0m/s and availability. This graph is used to calculate the availability layer from the accessibility layer.

    The input value, accessibility, measures how accessible a site is for installation or operation and maintenance activities. It is the percentage time the
    environmental conditions, i.e. the Hs (Atlantic -Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5), are below operational limits.
    Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined
    by checking if the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total
    number of hours for the month. Once the accessibility was known, the percentage availability was calculated using the O'Connor et al. graph of the relationship between the two. A mature technology reliability was assumed.

    Weather Window

    The weather window availability is the percentage of possible x-duration windows where weather conditions (Hs, wind speed) are below maximum limits for the
    given duration for the month.

    The resolution of the wave dataset (0.05° × 0.05°) is higher than that of the wind dataset
    (0.25° x 0.25°), so the nearest wind value is used for each wave data point. The weather window layer is at the resolution of the wave layer.
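
    One way to attach the nearest wind value to each wave grid point is xarray's reindex with method="nearest". The sketch below uses tiny synthetic grids; the coordinate names "latitude" and "longitude" are assumptions, and the original processing may have used a different call.

```python
import numpy as np
import xarray as xr

# Coarse wind grid (0.25 degree spacing); values are synthetic
wind = xr.DataArray(
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    coords={"latitude": [50.0, 50.25], "longitude": [-10.0, -9.75]},
    dims=("latitude", "longitude"),
)

# A few points of a finer wave grid
wave_lat = [50.0, 50.05, 50.2]
wave_lon = [-10.0, -9.8]

# For every wave grid point, take the value of the closest wind grid cell
wind_on_wave = wind.reindex(latitude=wave_lat, longitude=wave_lon, method="nearest")
```

    The result carries the wave grid coordinates, so wave and wind arrays can then be compared element by element.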

    The first step in calculating the weather window for a particular set of inputs (Hs, wind speed and duration) is to calculate the accessibility at each timestep.
    The accessibility is based on a simple boolean evaluation: are the wave and wind conditions within the required limits at the given timestep?

    Once the time series of accessibility is calculated, the next step is to look for periods of sustained favourable environmental conditions, i.e. the weather
    windows. Here all possible operating periods with a duration matching the required weather-window value are assessed to see if the weather conditions remain
    suitable for the entire period. The percentage availability of the weather window is calculated based on the percentage of x-duration windows with suitable
    weather conditions for their entire duration. The weather window availability can be considered as the probability of having the required weather window available
    at any given point in the month.
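
    The window search can be sketched with a sliding sum over the boolean accessibility series. This is a minimal NumPy illustration under the assumption of one value per hour; the function name is hypothetical.

```python
import numpy as np


def weather_window_availability(accessible, window_hours):
    """Percentage of all possible windows of the given length that are fully accessible."""
    flags = np.asarray(accessible, dtype=int)
    # Sliding sum over every possible start hour; a window is usable only if
    # all of its hours are accessible, i.e. the sum equals the window length
    hours_ok = np.convolve(flags, np.ones(window_hours, dtype=int), mode="valid")
    return 100.0 * np.mean(hours_ok == window_hours)


# Six hourly flags with one inaccessible hour in the middle
accessible = [True, True, True, False, True, True]
share = weather_window_availability(accessible, window_hours=2)
```

    A single inaccessible hour removes every window that overlaps it, which is why weather window availability drops faster than accessibility as the required duration grows.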

    Extreme Wind and Wave

    The Extreme wave layers show the highest significant wave height expected to occur during the given return period. The Extreme wind layers show the highest wind speed expected to occur during the given return period.

    To predict extreme values, we use Extreme Value Analysis (EVA). EVA focuses on the extreme part of the data and seeks to determine a model to fit this reduced
    portion accurately. EVA consists of three main stages. The first stage is the selection of extreme values from a time series. The next step is to fit a model
    that best approximates the selected extremes by determining the shape parameters for a suitable probability distribution. The model then predicts extreme values
    for the selected return period. All calculations use the Python pyextremes library. Two methods are used: Block Maxima and Peaks Over Threshold.

    The Block Maxima method selects the annual maxima and fits a Generalised Extreme Value (GEV) distribution.

    The Peaks Over Threshold method has two calculation parameters. The first is the quantile above which values are selected as extreme (0.9 or 0.998). The second is the minimum time separation between extreme values for them to be considered independent (3 days). A Generalised Pareto Distribution is fitted to the selected
    extremes and used to calculate the extreme value for the selected return period.
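
    The Block Maxima stage can be illustrated with a short sketch. Note that this uses scipy.stats.genextreme as a stand-in rather than the pyextremes API used for the dataset, and the annual maxima are synthetic; the function name is hypothetical.

```python
import numpy as np
from scipy.stats import genextreme


def block_maxima_return_level(annual_maxima, return_period_years):
    """Fit a GEV distribution to annual maxima and return the T-year level."""
    shape, loc, scale = genextreme.fit(annual_maxima)
    # The T-year level is exceeded with probability 1/T in any given year
    return genextreme.ppf(1.0 - 1.0 / return_period_years, shape, loc=loc, scale=scale)


# Synthetic annual maxima of Hs in metres for 20 years (not real data)
rng = np.random.default_rng(0)
annual_maxima = rng.gumbel(loc=8.0, scale=1.0, size=20)

level_10 = block_maxima_return_level(annual_maxima, 10)
level_50 = block_maxima_return_level(annual_maxima, 50)
```

    The Peaks Over Threshold variant follows the same pattern, with a Generalised Pareto Distribution fitted to declustered exceedances instead of annual maxima.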

  12. g

    cmomy: A python package to calculate and manipulate Central (co)moments. |...

    • gimi9.com
    + more versions
    Cite
    cmomy: A python package to calculate and manipulate Central (co)moments. | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_cmomy-a-python-package-to-calculate-and-manipulate-central-comoments/
    Explore at:
    Description

    cmomy is a Python package to calculate central moments and co-moments in a numerically stable and direct way. Behind the scenes, cmomy makes use of Numba to rapidly calculate moments. cmomy provides utilities to calculate central moments from individual samples, precomputed central moments, and precomputed raw moments. It also provides routines to perform bootstrap resampling based on raw data or precomputed moments. cmomy has NumPy array and xarray DataArray interfaces.

  13. d

    Cape EGS and Utah FORGE: Empirical 3D Seismic Velocity Model

    • catalog.data.gov
    Updated Nov 18, 2025
    Cite
    Lawrence Berkeley National Laboratory (2025). Cape EGS and Utah FORGE: Empirical 3D Seismic Velocity Model [Dataset]. https://catalog.data.gov/dataset/cape-egs-and-utah-forge-empirical-3d-seismic-velocity-model
    Explore at:
    Dataset updated
    Nov 18, 2025
    Dataset provided by
    Lawrence Berkeley National Laboratory
    Description

    This dataset provides an empirical three-dimensional P- and S-wave velocity model covering a 30 x 30 km area and extending to 10 km depth around the Cape Modern EGS and Utah FORGE sites. It incorporates three-dimensional topography and a sediment/basement contact derived from geophysical and geological datasets collected by Utah FORGE or Fervo Energy. Basin velocities were estimated from a logarithmic fit to borehole velocity logs, while basement velocities were assigned constant values of 5.8 km/s for Vp and 3.392 km/s for Vs. No geophysical data inversion was performed in constructing this model. The dataset includes a manuscript describing the methods used to develop the model. The velocity model is provided in NetCDF format. Users will need software capable of reading NetCDF files. Common scientific libraries support this format without requiring proprietary tools. We recommend the Xarray package for Python, where methods like open_dataset() and .sel().plot() can be used to read and plot the data.
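
    As a rough illustration of the .sel() workflow mentioned above, the sketch below builds a small in-memory stand-in for the model instead of reading the real file. The variable name "vp" and the dimension names "depth", "y" and "x" are hypothetical; the actual names should be checked after calling open_dataset() on the downloaded NetCDF file.

```python
import numpy as np
import xarray as xr

# Stand-in for the result of xr.open_dataset(...) on the real file;
# all names and values here are placeholders, not the published model
model = xr.Dataset(
    {"vp": (("depth", "y", "x"), np.full((3, 2, 2), 5.8))},
    coords={"depth": [0.0, 5.0, 10.0], "y": [0.0, 30.0], "x": [0.0, 30.0]},
)

# Pick the depth level closest to 4 km; the result is a 2-D horizontal slice
depth_slice = model["vp"].sel(depth=4.0, method="nearest")

# depth_slice.plot() would draw the slice as a map (requires matplotlib)
```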

  14. Dataset for "Adjoint Waveform Tomography for Crustal and Upper Mantle...

    • zenodo.org
    bin, csv, nc
    Updated Aug 4, 2023
    Cite
    Arthur Rodgers (2023). Dataset for "Adjoint Waveform Tomography for Crustal and Upper Mantle Structure the Middle East and Southwest Asia for Improved Waveform Simulations Using Openly Available Broadband Data" [Dataset]. http://doi.org/10.5281/zenodo.8212589
    Explore at:
    Available download formats: csv, bin, nc
    Dataset updated
    Aug 4, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Arthur Rodgers
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Middle East
    Description

    This dataset contains the MESWA (Middle East and Southwest Asia) seismic model and auxiliary data used in the creation of the model (Rodgers, 2023). MESWA is a three-dimensional model of the seismic properties of the crust and upper mantle of the Middle East and Southwest Asia. The MESWA model is provided in NetCDF format (readable by, for example, xarray; Hoyer & Hamman, 2017) and in HDF5 format for viewing with ParaView (Ahrens et al., 2005) and interaction with Salvus (Afanasiev et al., 2019).

    Also included are the earthquake source parameters for all 327 Global Centroid Moment Tensor events considered in this study in ASCII text format. Also included are lists of the selected 192 inversion events and 66 validation events in ASCII text format. Lastly, we include a list of all receivers used in the creation and validation of MESWA. This is a simple ASCII file with the event name and receiver name (composed of the network_code and station_code).

    The following table provides a listing of the files in the dataset:

    File                             Description
    MESWA.nc                         MESWA model in NetCDF format
    MESWA.h5                         MESWA model in HDF5 format, used by Salvus
    MESWA.xmdf                       Auxiliary file for MESWA.h5, used to import model into ParaView
    events_project.csv               Table of event source parameters for all 327 events considered in the project
    inversion_events_192.csv         Table of 192 inversion events (ASCII comma separated value)
    validation_events_66.csv         Table of 66 validation events (ASCII comma separated value)
    events_receivers_inversion.csv   Table of waveform (event-receiver-channel) data used in the inversion (ASCII comma separated value)
    events_receivers_validation.csv  Table of waveform (event-receiver-channel) data used in the validation (ASCII comma separated value)

    References

    Afanasiev, M., Boehm, C., van Driel, M., Krischer, L., Rietmann, M., May, D. A., Knepley, M. G., & Fichtner, A. (2019). Modular and flexible spectral-element waveform modelling in two and three dimensions. Geophysical Journal International, 216(3), 1675–1692. https://doi.org/10.1093/gji/ggy469

    Ahrens, J., Geveci, B., & Law, C. (2005). Paraview: An end-user tool for large data visualization. The Visualization Handbook, 717(8). https://doi.org/10.1016/b978-012387582-2/50038-1

    Hoyer, S., & Hamman, J. (2017). Xarray: N-D labeled arrays and datasets in Python. Journal of Open Research Software, 5(1). https://doi.org/10.5334/jors.148

    Rodgers, A. (2023). Adjoint Waveform Tomography for Crustal and Upper Mantle Structure the Middle East and Southwest Asia for Improved Waveform Simulations Using Openly Available Broadband Data, technical report, LLNL-TR-851939.

    Acknowledgements

    This project was supported by Lawrence Livermore National Laboratory’s Laboratory Directed Research and Development project 20-ERD-008 and the National Nuclear Security Administration. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-MI-852402

  15. Z

    Cropland Data Layer Data for the Snake River Basin, USA, 2010-2017

    • data.niaid.nih.gov
    Updated Jul 28, 2020
    Cite
    Alejandro N. Flores; Kendra E. Kaiser; Vicken Hillis (2020). Cropland Data Layer Data for the Snake River Basin, USA, 2010-2017 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3958226
    Explore at:
    Dataset updated
    Jul 28, 2020
    Dataset provided by
    Boise State University
    Authors
    Alejandro N. Flores; Kendra E. Kaiser; Vicken Hillis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Snake River, United States
    Description

    Cropland Data Layer (CDL) data from the US Department of Agriculture's National Agricultural Statistics Service (NASS), subset spatially to cover the Snake River Basin, USA for years 2010-2017, inclusive. This data is the raw data used to support initialization of the Janus agent based model of land use land cover change. It was developed by downloading CDL data from the USDA NASS site for an area of interest encompassing the Snake River Basin for individual years from 2010-2017. Data were converted to a georeferenced GeoTiff format using the Geospatial Data Abstraction Library (GDAL) command line interface. They were then concatenated into a single dataset using the rioxarray python library and saved as a CF-compliant NetCDF4 file using the xarray python library. Note that this file is saved with zlib compression level 1 and, therefore, users may experience a slowdown upon initial reading of the file.
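
    The concatenate-and-compress step can be sketched as follows. Synthetic rasters stand in for the yearly GeoTIFFs (which in practice could be opened with rioxarray.open_rasterio), and the variable name "cdl" and both function names are assumptions for illustration, not the authors' code.

```python
import numpy as np
import xarray as xr


def stack_years(layers):
    """Concatenate {year: 2-D DataArray} into one Dataset along a 'year' dimension."""
    years = sorted(layers)
    stacked = xr.concat([layers[y] for y in years], dim="year")
    stacked = stacked.assign_coords(year=years)
    return stacked.to_dataset(name="cdl")


def write_compressed(ds, path):
    """Write a NetCDF4 file with zlib compression level 1 on every variable."""
    encoding = {name: {"zlib": True, "complevel": 1} for name in ds.data_vars}
    ds.to_netcdf(path, encoding=encoding)


def fake_layer(value):
    """Tiny synthetic raster standing in for one yearly GeoTIFF."""
    return xr.DataArray(
        np.full((2, 3), value, dtype="uint8"),
        coords={"y": [1.0, 0.0], "x": [0.0, 1.0, 2.0]},
        dims=("y", "x"),
    )


ds = stack_years({2010: fake_layer(1), 2011: fake_layer(5)})
```

    Calling write_compressed(ds, path) would then produce the compressed file; it is not invoked here because it needs a NetCDF4 backend and a writable path.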

  16. D

    India DroughtSet: A village-level drought dataset for the past 43 years

    • phys-techsciences.datastations.nl
    • ssh.datastations.nl
    application/netcdf +5
    Updated Nov 9, 2023
    Cite
    T Pareek (2023). India DroughtSet: A village-level drought dataset for the past 43 years [Dataset]. http://doi.org/10.17026/DANS-XFT-EPRJ
    Explore at:
    Available download formats: zip(20083), mid(163767407), mid(682839633), mif(1721580), pdf(204411), application/netcdf(79177037), application/netcdf(100224077), csv(266091148), mid(362607737), csv(563572003), mif(786781), mif(3215211)
    Dataset updated
    Nov 9, 2023
    Dataset provided by
    DANS Data Station Physical and Technical Sciences
    Authors
    T Pareek
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    This database consists of a high-resolution village-level drought dataset for major Indian states for the past 43 years (1981 – 2022) for each month. It was created by utilising the CHIRPS precipitation and GLEAM evapotranspiration datasets. The GLEAM dataset is based on the well-recognised Priestley-Taylor equation, which estimates potential evapotranspiration (PET) from observations of surface net radiation and near-surface air temperature. The SPEI was calculated on spatial grids of 5x5 km at the 3-month time scale, which is suitable for agricultural drought monitoring. This high-resolution SPEI dataset was integrated with Indian village boundaries and the associated census attribute dataset. This allows researchers to perform multi-disciplinary investigations, e.g., climate migration modelling, drought hazards, and exposure assessment. The dataset was developed with potential users in mind. Therefore, the dataset can be integrated into a GIS system for visualization (using .mid/.mif format) and into Python programming for modelling and analysis (using .csv). For advanced analysis, I have also provided it in netCDF format, which can be read in Python using xarray or the netcdf4 library. More details are in the README.pdf file. Date Submitted: 2023-11-07 Issued: 2023-11-07

  17. d

    Daily histograms of wind speed (100m), wind direction (100m) and atmospheric...

    • data.dtu.dk
    zip
    Updated Feb 28, 2025
    Cite
    Marc Imberger (2025). Daily histograms of wind speed (100m), wind direction (100m) and atmospheric stability derived from ERA5 [Dataset]. http://doi.org/10.11583/DTU.27930399.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 28, 2025
    Dataset provided by
    Technical University of Denmark
    Authors
    Marc Imberger
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains daily histograms of wind speed at 100 m ("WS100"), wind direction at 100 m ("WD100") and an atmospheric stability proxy ("STAB") derived from the ERA5 hourly data on single levels [1] accessed via the Copernicus Climate Change Climate Data Store [2]. The dataset covers six geographical regions (illustrated in regions.png) on a reduced 0.5 x 0.5 degrees regular grid and covers the period 1994 to 2023 (both years included). The dataset is packaged as a zip folder per region which contains a range of monthly zip folders following the convention of zarr ZipStores (more details here: https://zarr.readthedocs.io/en/stable/api/storage.html). Thus, the monthly zip folders are intended to be used in connection with the xarray python package (no unzipping of the monthly files needed).

    Wind speed and wind direction are derived from the U- and V-components. The stability metric makes use of a 5-class classification scheme [3] based on the Obukhov length, whereby the required Obukhov length was computed using [4]. The following bins (left edges) have been used to create the histograms:

    • Wind speed: [0, 40) m/s (bin width 1 m/s)
    • Wind direction: [0, 360) deg (bin width 15 deg)
    • Stability: 5 discrete stability classes (1: very unstable, 2: unstable, 3: neutral, 4: stable, 5: very stable)

    Main Purpose: The dataset serves as minimum input data for the CLIMatological REPresentative PERiods (climrepper) python package (https://gitlab.windenergy.dtu.dk/climrepper/climrepper, in preparation for public release).

    References:

    [1] Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., Thépaut, J-N. (2023): ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), DOI: 10.24381/cds.adbb2d47 (Accessed Nov. 2024)
    [2] Copernicus Climate Change Service, Climate Data Store, (2023): ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), DOI: 10.24381/cds.adbb2d47 (Accessed Nov. 2024)
    [3] Holtslag, M. C., Bierbooms, W. A. A. M., & Bussel, G. J. W. van. (2014). Estimating atmospheric stability from observations and correcting wind shear models accordingly. In Journal of Physics: Conference Series (Vol. 555, p. 012052). IOP Publishing. https://doi.org/10.1088/1742-6596/555/1/012052
    [4] Copernicus Knowledge Base, ERA5: How to calculate Obukhov Length, URL: https://confluence.ecmwf.int/display/CKB/ERA5:+How+to+calculate+Obukhov+Length, last accessed: Nov 2024
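
    The derivation of wind speed and direction from the U- and V-components, followed by binning, can be sketched as below. This is an illustration with synthetic values, not the dataset's processing code; it assumes the meteorological convention (direction the wind blows from, measured clockwise from north), and the function name is hypothetical.

```python
import numpy as np


def wind_speed_direction(u, v):
    """Speed (m/s) and meteorological direction (degrees the wind blows from)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    speed = np.hypot(u, v)
    direction = (180.0 + np.degrees(np.arctan2(u, v))) % 360.0
    return speed, direction


# Synthetic hourly components for one grid cell (not ERA5 data)
u = np.array([-10.0, 10.0, 0.0, 3.0])
v = np.array([0.0, 0.0, 5.0, 4.0])
speed, direction = wind_speed_direction(u, v)

# Bin edges following the description: 1 m/s and 15 degree widths.
# np.histogram closes the last bin on the right, a small difference
# from the half-open [0, 40) and [0, 360) ranges stated above.
speed_counts, _ = np.histogram(speed, bins=np.arange(0, 41, 1))
direction_counts, _ = np.histogram(direction, bins=np.arange(0, 361, 15))
```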

  18. Data from: Supporting data for "Satellite derived SO2 emissions from the...

    • figshare.com
    • produccioncientifica.ugr.es
    hdf
    Updated May 2, 2023
    Cite
    Ben Esse; Mike Burton; Catherine Hayer; Melissa Pfeffer; Sara Barsotti; Nicolas Theys; Talfan Barnie; Manuel Titos (2023). Supporting data for "Satellite derived SO2 emissions from the relatively low-intensity, effusive 2021 eruption of Fagradalsfjall, Iceland" [Dataset]. http://doi.org/10.6084/m9.figshare.22303435.v1
    Explore at:
    Available download formats: hdf
    Dataset updated
    May 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ben Esse; Mike Burton; Catherine Hayer; Melissa Pfeffer; Sara Barsotti; Nicolas Theys; Talfan Barnie; Manuel Titos
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Fagradalsfjall
    Description

    Supporting data for the paper "Satellite derived SO2 emissions from the relatively low-intensity, effusive 2021 eruption of Fagradalsfjall, Iceland" by Esse et al. The data files are in netCDF4 format, created using the Python xarray library. Each is a separate xarray Dataset.

    2021-05-02_18403_Fagradalsfjall_results.nc contains the analysis results for TROPOMI orbit 18403 shown in Figure 2.

    Fagradalsfjall_2021_emission_intensity.nc contains the SO2 emission intensity data shown in Figures 3, 4 and 5.

    cloud_effective_altitude_difference.nc contains the daily cloud effective altitude difference shown in Figure 6.

  19. GEMINI output used to develop volumetric reconstruction technique for EISCAT...

    • zenodo.org
    bin, nc +1
    Updated Jan 24, 2024
    Cite
    Jone Peter Reistad; Matthew Zettergren (2024). GEMINI output used to develop volumetric reconstruction technique for EISCAT 3D [Dataset]. http://doi.org/10.5281/zenodo.10561479
    Explore at:
    Available download formats: bin, nc, text/x-python
    Dataset updated
    Jan 24, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jone Peter Reistad; Matthew Zettergren
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is used as a "ground truth" for investigating the performance of a volumetric reconstruction technique for electric current densities, intended to be applied to the EISCAT 3D radar system. The technique is outlined in a manuscript in preparation, to be referred to here once submitted. The volumetric reconstruction code can be found here: https://github.com/jpreistad/e3dsecs

    This dataset contains four files:

    1) Dataset file 'gemini_dataset.nc'. This is a dump from the end of a GEMINI model run driven with a pair of up/down FAC above the region around the EISCAT 3D facility. Details of the GEMINI model can be found here: https://doi.org/10.5281/zenodo.3528915 . This is a NetCDF file, intended to be opened with xarray in Python:

    import xarray

    dataset = xarray.open_dataset('gemini_dataset.nc')

    2) Grid file 'gemini_grid.h5'. This file is needed to get information about the grid that the values from GEMINI are represented in. The E3DSECS library (https://github.com/jpreistad/e3dsecs) has the necessary code to open this file and put it into the dictionary structure used in that package.

    3) The GEMINI simulation config file 'config.nml' used to produce the simulation.

    4) The GEMINI boundary file 'fac_said.py' used to produce the boundary conditions for the simulation

    Together files 3 and 4 could be used to reproduce the full simulation of the GEMINI model, which is freely available at https://github.com/gemini3d

    The configuration files for this particular run are also available at this location:

    https://github.com/gemini3d/gemini-examples/tree/main/init/aurora_curv

  20. Z

    QLKNN11D training set

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    Updated Jun 8, 2023
    Cite
    Karel Lucas van de Plassche; Jonathan Citrin (2023). QLKNN11D training set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8011147
    Explore at:
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    DIFFER
    Authors
    Karel Lucas van de Plassche; Jonathan Citrin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    QLKNN11D training set

    This dataset contains a large-scale run of ~1 billion flux calculations of the quasilinear gyrokinetic transport model QuaLiKiz. QuaLiKiz is applied in numerous tokamak integrated modelling suites, and is openly available at https://gitlab.com/qualikiz-group/QuaLiKiz/. This dataset was generated with the 'QLKNN11D-hyper' tag of QuaLiKiz, equivalent to 2.8.1 apart from the negative magnetic shear filter being disabled. See https://gitlab.com/qualikiz-group/QuaLiKiz/-/tags/QLKNN11D-hyper for the in-repository tag.

    The dataset is appropriate for the training of learned surrogates of QuaLiKiz, e.g. with neural networks. See https://doi.org/10.1063/1.5134126 for a Physics of Plasmas publication illustrating the development of a learned surrogate (QLKNN10D-hyper) of an older version of QuaLiKiz (2.4.0) with a 300 million point 10D dataset. The paper is also available on arXiv https://arxiv.org/abs/1911.05617 and the older dataset on Zenodo https://doi.org/10.5281/zenodo.3497066. For an application example, see Van Mulders et al 2021 https://doi.org/10.1088/1741-4326/ac0d12, where QLKNN10D-hyper was applied for ITER hybrid scenario optimization. For any learned surrogates developed for QLKNN11D, the effective addition of the alphaMHD input dimension through rescaling the input magnetic shear (s) by s = s - alpha_MHD/2, as carried out in Van Mulders et al., is recommended.

    Related repositories:

    General QuaLiKiz documentation https://qualikiz.com

    QuaLiKiz/QLKNN input/output variables naming scheme https://qualikiz.com/QuaLiKiz/Input-and-output-variables

    Training, plotting, filtering, and auxiliary tools https://gitlab.com/Karel-van-de-Plassche/QLKNN-develop

    QuaLiKiz related tools https://gitlab.com/qualikiz-group/QuaLiKiz-pythontools

    FORTRAN QLKNN implementation with wrapper for Python and MATLAB https://gitlab.com/qualikiz-group/QLKNN-fortran

    Weights and biases of 'hyperrectangle style' QLKNN https://gitlab.com/qualikiz-group/qlknn-hype

    Data exploration

    The data is provided in 43 netCDF files. We advise opening single datasets using xarray or multiple datasets out-of-core using dask. For reference, we give below the load times and in-memory sizes of a single variable that depends only on the scan size dimx. This was tested single-core on an Intel Xeon 8160 CPU at 2.1 GHz with 192 GB of DDR4 RAM. Note that during loading, more memory is needed than the final number.

    Timing of dataset loading

    Number of datasets  Final in-RAM memory (GiB)  Loading time single var (M:SS)
    1                   10.3                       0:09
    5                   43.9                       1:00
    10                  63.2                       2:01
    16                  98.0                       3:25
    17                  Out Of Memory              x:xx

    Full dataset

    The full dataset of QuaLiKiz in-and-output data is available on request. Note that this is 2.2 TiB of netCDF files!

