License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A curated list of preprocessed Human Activity Recognition datasets, ready to use in under a minute.
All the datasets are preprocessed into HDF5 format, created using the h5py Python library. The scripts used for data preprocessing are provided as well (Load.ipynb and load_jordao.py).
Each HDF5 file contains at least the keys:
x: a single array of size [sample count, temporal length, sensor channel count] containing the actual sensor data. Its metadata contains the names of the individual sensor channels. All samples are zero-padded to a constant length within the file; the original lengths before padding are available under the meta keys.
y: a single array of size [sample count] with zero-based integer values for the target classes. Its metadata contains the names of the target classes.
meta: contains various metadata, depending on the dataset (original length before padding, subject no., trial no., etc.)
Usage example
import h5py
with h5py.File('data/waveglove_multi.h5', 'r') as h5f:
    x = h5f['x']
    y = h5f['y']['class']
    print(f'WaveGlove-multi: {x.shape[0]} samples')
    print(f'Sensor channels: {h5f["x"].attrs["channels"]}')
    print(f'Target classes: {h5f["y"].attrs["labels"]}')
    first_sample = x[0]
Current list of datasets:
WaveGlove-single (waveglove_single.h5)
WaveGlove-multi (waveglove_multi.h5)
uWave (uwave.h5)
OPPORTUNITY (opportunity.h5)
PAMAP2 (pamap2.h5)
SKODA (skoda.h5)
MHEALTH (non-overlapping windows) (mhealth.h5)
Six datasets with all four predefined train/test folds, as preprocessed by Jordao et al. in the original WearableSensorData repository (FNOW-, LOSO-, LOTO- and SNOW-prefixed .h5 files).
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Study information

Design ideation study (N = 24) using eye-tracking technology. Participants solved a total of twelve design problems while receiving inspirational stimuli on a monitor. Their task was to generate as many solutions to each problem as possible and to explain each solution briefly by thinking aloud. The study allows for further insight into how inspirational stimuli improve idea fluency during design ideation. This dataset features processed data from the experiment. Eye-tracking data includes gaze data, fixation data, blink data, and pupillometry data for all participants.

The study is based on the following research paper and follows the same experimental setup: Goucher-Lambert, K., Moss, J., & Cagan, J. (2019). A neuroimaging investigation of design ideation with and without inspirational stimuli—understanding the meaning of near and far stimuli. Design Studies, 60, 1-38. DOI

Dataset

Most files in the dataset are saved as CSV files or other human-readable file formats. Large files are saved in Hierarchical Data Format (HDF5/H5) to allow for smaller file sizes and higher compression. All data is described thoroughly in 00_ReadMe.txt. The following processed data is included in the dataset:

Concatenated annotations file of experimental flow for all participants (CSV).
All eye-tracking raw data in concatenated files, annotated with only participant ID (CSV/HDF5).
Annotated eye-tracking data for ideation routines only; a subset of the files above (CSV/HDF5).
Audio transcriptions from the Google Cloud Speech-to-Text API of each recording, with annotations (CSV).
Raw API responses for each transcription; these files include the time offset for each word in a recording (JSON).
Data for questionnaire feedback and ideas generated during the experiment (CSV).
Data for the post-experiment survey, including demographic information (TSV).

Python code used for the open-source experimental setup and dataset construction is hosted on GitHub. The repository also includes code showing how the dataset has been further processed.
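As a rough starting point for the HDF5 eye-tracking files, they could be opened along these lines (a hedged sketch only; the file name and the assumption that the files were written via pandas are mine, not the dataset's, so consult 00_ReadMe.txt for the real layout):

import pandas as pd

# Hypothetical file name; if the HDF5 files were not written with pandas.HDFStore,
# fall back to h5py to inspect the raw groups instead.
gaze = pd.read_hdf('eyetracking_concatenated.h5')
print(gaze.columns)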
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset is derived from the ISIC Archive with the following changes:
If the "benign_malignant" column is null and the "diagnosis" column is "vascular lesion", the target is set to null.
DISCLAIMER: I'm not a dermatologist and I'm not affiliated with ISIC in any way. I don't know if my approach to setting the target value is acceptable by the ISIC competition. Use at your own risk.
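For illustration, the target rule above could be applied to the metadata along these lines (a hedged sketch; the CSV name follows the file layout used further below, and the exact column handling is an assumption):

import numpy as np
import pandas as pd

df = pd.read_csv('train-metadata.csv')
# Rows with a null "benign_malignant" and diagnosis "vascular lesion" get a null target
mask = df['benign_malignant'].isna() & (df['diagnosis'] == 'vascular lesion')
df.loc[mask, 'target'] = np.nan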
import os
import multiprocessing as mp
from PIL import Image, ImageOps
import glob
from functools import partial
def list_jpg_files(folder_path):
    # Ensure the folder path ends with a slash
    if not folder_path.endswith('/'):
        folder_path += '/'
    # Use glob to find all .jpg files in the specified folder (non-recursive)
    jpg_files = glob.glob(folder_path + '*.jpg')
    return jpg_files
def resize_image(image_path, destination_folder):
    # Open the image file
    with Image.open(image_path) as img:
        # Get the original dimensions
        original_width, original_height = img.size
        # Calculate the aspect ratio
        aspect_ratio = original_width / original_height
        # Determine the new dimensions based on the aspect ratio
        if aspect_ratio > 1:
            # Width is larger, so we will crop the width
            new_width = int(256 * aspect_ratio)
            new_height = 256
        else:
            # Height is larger, so we will crop the height
            new_width = 256
            new_height = int(256 / aspect_ratio)
        # Resize the image while maintaining the aspect ratio
        img = img.resize((new_width, new_height))
        # Calculate the crop box to center the image
        left = (new_width - 256) / 2
        top = (new_height - 256) / 2
        right = (new_width + 256) / 2
        bottom = (new_height + 256) / 2
        # Crop the image if it results in shrinking
        if new_width > 256 or new_height > 256:
            img = img.crop((left, top, right, bottom))
        else:
            # Add black edges if it results in scaling up
            img = ImageOps.expand(img, border=(int(left), int(top), int(left), int(top)), fill='black')
        # Resize the image to the final dimensions
        img = img.resize((256, 256))
        img.save(os.path.join(destination_folder, os.path.basename(image_path)))
source_folder = ""
destination_folder = ""
images = list_jpg_files(source_folder)
with mp.Pool(processes=12) as pool:
    images = pool.map(partial(resize_image, destination_folder=destination_folder), images)
print("All images resized")
This code will shrink (down-sample) the image if it is larger than 256x256. But if the image is smaller than 256x256, it will add either vertical or horizontal black edges after scaling up the image. In both scenarios, it will keep the center of the input image in the center of the output image.
The HDF5 file is created using the following code:
import os
import pandas as pd
from PIL import Image
import h5py
import io
import numpy as np
# File paths
base_folder = "./isic-2020-256x256"
csv_file_path = 'train-metadata.csv'
image_folder_path = 'train-image/image'
hdf5_file_path = 'train-image.hdf5'
# Read the CSV file
df = pd.read_csv(os.path.join(base_folder, csv_file_path))
# Open an HDF5 file
with h5py.File(os.path.join(base_folder, hdf5_file_path), 'w') as hdf5_file:
    for index, row in df.iterrows():
        isic_id = row['isic_id']
        image_file_path = os.path.join(base_folder, image_folder_path, f'{isic_id}.jpg')
        if os.path.exists(image_file_path):
            # Open the image file
            with Image.open(image_file_path) as img:
                # Convert the image to a byte buffer
                img_byte_arr = io.BytesIO()
                img.save(img_byte_arr, format=img.format)
                img_byte_arr = img_byte_arr.getvalue()
                hdf5_file.create_dataset(isic_id, data=np.void(img_byte_arr))
        else:
            print(f"Image file for {isic_id} not found.")
print("HDF5 file created successfully.")
To read the hdf5 file, use the following code:
import h5py
from PIL import Image
with h...
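Since the snippet above is truncated, here is a hedged reading sketch based on the writer code earlier in this entry (one byte-string dataset per isic_id; the file name matches hdf5_file_path above):

import io
import h5py
from PIL import Image

with h5py.File('train-image.hdf5', 'r') as hdf5_file:
    isic_id = next(iter(hdf5_file))                # first image ID in the file
    img_bytes = hdf5_file[isic_id][()].tobytes()   # np.void -> raw JPEG bytes
    img = Image.open(io.BytesIO(img_bytes))
    print(isic_id, img.size, img.format)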
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data repository provides the underlying data and neural network training scripts associated with the manuscript titled "A Transformer Network for High-Throughput Materials Characterization with X-ray Photoelectron Spectroscopy" by Simperl and Werner, published in the Journal of Applied Physics (https://doi.org/10.1063/5.0296600) (2025).
All data files are released under the Creative Commons Attribution 4.0 International (CC-BY) license, while all code files are distributed under the MIT license.
The repository contains simulated X-ray photoelectron spectroscopy (XPS) spectra stored as hdf5 files in the zipped (h5_files.zip) folder, which was generated using the software developed by the authors. The NIST Standard Reference Database 100 – Simulation of Electron Spectra for Surface Analysis (SESSA) is freely available at https://www.nist.gov/srd/nist-standard-reference-database-100.
The neural network architecture is implemented using the PyTorch Lightning framework and is fully available within the attached materials as Transformer_SimulatedSpectra.py contained in the python_scripts.zip.
The trained model and the list of materials for the train, test and validation sets are contained in the models.zip folder.
The repository contains all the data necessary to replot the figures from the manuscript. These data are available in the form of .csv files or .h5 files for the spectra. In addition, the repository also contains a Python script (Plot_Data_Manuscript.ipynb) which is contained in the python_scripts.zip file.
The dataset and accompanying Python code files included in this repository were used to train a transformer-based neural network capable of directly inferring chemical concentrations from simulated survey X-ray photoelectron spectroscopy (XPS) spectra of bulk compounds.
The spectral dataset provided here represents the raw output from the SESSA software (version 2.2.2), prior to the normalization procedure described in the associated manuscript. This step of normalisation is of paramount importance for the effective training of the neural network.
The repository contains the Python scripts utilised to execute the spectral simulations and the neural network training on the Vienna Scientific Cluster (VSC5) which is part of the Austrian Scientific Computing Infrastructure (ASC). In order to obtain guidance on the proper configuration of the Command Line Interface (CLI) tools required for SESSA, users are advised to consult the official SESSA manual, which is available at the following address: https://nvlpubs.nist.gov/nistpubs/NSRDS/NIST.NSRDS.100-2024.pdf.
To run the neural network training we provided the requirements_nn_training.txt file that contains all the necessary python packages and version numbers. All other python scripts can be run locally with the python libraries listed in requirements_data_analysis.txt.
HDF5 (in zip folder): As described in the manuscript, we simulate X-ray photoelectron spectra for each of the 7,587 inorganic [1] and organic [2] materials in our dataset. To reflect realistic experimental conditions, each simulated spectrum was augmented by systematically varying parameters such as peak width, peak shift, and peak type—all configurable within the SESSA software—as well as by applying statistical Poisson noise to simulate varying signal-to-noise ratios. These modifications account for experimentally observed and material-specific spectral broadening, peak shifts, and detector-induced noise. Each material is represented by an individual HDF5 (.h5) file, named according to its chemical formula and mass density (in g/cm³). For example, the file for SiO2 with a density of 2.196 g/cm³ is named SiO2_2.196.h5. For more complex chemical formulas, such as Co(ClO4)2 with a density of 3.33 g/cm³, the file is named Co_ClO4_2_3.33.h5. Within each HDF5 file, the metadata for each spectrum is stored alongside a fixed energy axis and the corresponding intensity values. The spectral data are organized hierarchically by augmentation parameters in the following directory structure, e.g. for Ac_10.0.h5 we have SNR_0/WIDTH_0.3/SHIFT_-3.0/PEAK_gauss/Ac_10.0/. These files can be easily inspected with H5Web in Visual Studio Code, with h5py in Python, or with any other HDF5-capable program.
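For quick inspection in Python, such a file could be opened along these lines (a hedged sketch; the path follows the example group structure above, while the names of the objects inside the group are left to the reader to discover):

import h5py

with h5py.File('Ac_10.0.h5', 'r') as f:
    grp = f['SNR_0/WIDTH_0.3/SHIFT_-3.0/PEAK_gauss/Ac_10.0']
    # Print every object stored for this spectrum (energy axis, intensities, metadata)
    grp.visititems(lambda name, obj: print(name, getattr(obj, 'shape', '')))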
Session Files: The .ses files are SESSA-specific input files that can be directly loaded into SESSA to specify certain input parameters for the initialization (ini), the geometry (geo) and the simulation parameters (sim_para), and are required by the python script Simulation_Script_VSC_json.py to run the simulation on the cluster.
Json Files: The two json files (MaterialsListVSC_gauss.json, MaterialsListVSC_lorentz.json) are used as the input files to the Python script Simulation_Script_VSC_json.py. These files contain all the material specific information for the SESSA simulation.
csv files: The csv files are used to generate the plots from the manuscript described in the section "Plotting Scripts".
npz files: The two .npz files (element_counts.npz, single_elements.npz) are python arrays that are needed by the Transformer_SimulatedSpectra.py script and contain the number of each single element in the dataset and an array of each single element present, respectively.
There is one python file that sets the communication with SESSA:
Simulation_Script_VSC_json.py: This script uses the functions of the VSC_function.py script (therefore needs to be placed in the same directory as this script) and can be called with the following command:
python3 Simulation_Script_VSC_json.py MaterialsListVSC_gauss.json 0
It simulates the spectrum for the material at index 0 in the .json file and with the corresponding parameters specified in the .json file.
Before running this script, the following paths need to be specified:
To run SESSA on a computing cluster it is important to have a working Xvfb (virtual frame buffer) or a similar tool available to which any graphical output from SESSA can be written to.
Before running the training script it is important to normalize the data such that the squared integral of the spectrum is 1 (as described in the manuscript) and shown in the code: normalize_spectra.py
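Equivalently, each spectrum is divided by the square root of the integral of its square; a minimal sketch of that step (the authoritative implementation is normalize_spectra.py):

import numpy as np

def normalize(intensity, energy):
    # Scale the spectrum so that the integral of its square equals 1
    norm = np.sqrt(np.trapz(intensity**2, x=energy))
    return intensity / norm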
For the neural network training we use the Transformer_SimulatedSpectra.py where the external functions used are specified in external_functions.py. This script contains the full description of the neural network architecture, the hyperparameter tuning and the Wandb logging.
In the models.zip folder the fully trained network final_trained_model.ckpt presented in the manuscript is available as well as the list of training, validation and testing materials (test_materials_list.pt, train_materials_list.pt, val_materials_list.pt) where the corresponding spectra are extracted from the hdf5 files. The file types .ckpt and .pt can be read in by using the pytorch specific load functions in Python, e.g.
torch.load('train_materials_list.pt')
normalize_spectra.py: To run this script properly it is important to set up a python environment with the necessary libraries specified in the requirements_data_analysis.txt file. Then it can be called with
python3 normalize_spectra.py
where it is important to specify the path to the .h5 files containing the unnormalized spectra.
Transformer_SimulatedSpectra.py: To run this script properly on the cluster it is important to set up a python environment with the necessary libraries specified in the requirements_nn_training.txt file. This script also relies on external_functions.py, single_elements.npz and element_counts.npz, which should be placed in the same directory as the python script. This is important for creating the datasets for training, validation and testing, and ensures that all the single elements appear in the testing set. You can call this script (on the cluster) within a slurm script to start the GPU training.
python3 Transformer_SimulatedSpectra.py
Before running this script, the following paths need to be specified:
The goal of introducing the Rescaled CIFAR-10 dataset is to provide a dataset that contains scale variations (up to a factor of 4), to evaluate the ability of networks to generalise to scales not present in the training data.
The Rescaled CIFAR-10 dataset was introduced in the paper:
[1] A. Perzanowski and T. Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, Journal of Mathematical Imaging and Vision, 67(29), https://doi.org/10.1007/s10851-025-01245-x.
with a pre-print available at arXiv:
[2] Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, arXiv preprint arXiv:2409.11140.
Importantly, the Rescaled CIFAR-10 dataset contains substantially more natural textures and patterns than the MNIST Large Scale dataset, introduced in:
[3] Y. Jansson and T. Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536, https://doi.org/10.1007/s10851-022-01082-2
and is therefore significantly more challenging.
The Rescaled CIFAR-10 dataset is provided on the condition that you provide proper citation for the original CIFAR-10 dataset:
[4] Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny images. Tech. rep., University of Toronto.
and also for this new rescaled version, using the reference [1] above.
The data set is made available on request. If you would be interested in trying out this data set, please make a request in the system below, and we will grant you access as soon as possible.
The Rescaled CIFAR-10 dataset is generated by rescaling 32×32 RGB images of animals and vehicles from the original CIFAR-10 dataset [4]. The scale variations are up to a factor of 4. To ensure that all test images have the same resolution, mirror extension is used to extend the images to size 64x64. The imresize() function in Matlab was used for the rescaling, with default anti-aliasing turned on, and bicubic interpolation overshoot removed by clipping to the [0, 255] range. The details of how the dataset was created can be found in [1].
There are 10 distinct classes in the dataset: “airplane”, “automobile”, “bird”, “cat”, “deer”, “dog”, “frog”, “horse”, “ship” and “truck”. In the dataset, these are represented by integer labels in the range [0, 9].
The dataset is split into 40 000 training samples, 10 000 validation samples and 10 000 testing samples. The training dataset is generated using the initial 40 000 samples from the original CIFAR-10 training set. The validation dataset, on the other hand, is formed from the final 10 000 image batch of that same training set. For testing, all test datasets are built from the 10 000 images contained in the original CIFAR-10 test set.
The training dataset file (~5.9 GB) for scale 1, which also contains the corresponding validation and test data for the same scale, is:
cifar10_with_scale_variations_tr40000_vl10000_te10000_outsize64-64_scte1p000_scte1p000.h5
Additionally, for the Rescaled CIFAR-10 dataset, there are 9 datasets (~1 GB each) for testing scale generalisation at scales not present in the training set. Each of these datasets is rescaled using a different image scaling factor, 2^(k/4), with k being integers in the range [-4, 4]:
cifar10_with_scale_variations_te10000_outsize64-64_scte0p500.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte0p595.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte0p707.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte0p841.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p000.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p189.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p414.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p682.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte2p000.h5
These dataset files were used for the experiments presented in Figures 9, 10, 15, 16, 20 and 24 in [1].
The datasets are saved in HDF5 format, with the partitions in the respective h5 files named as
('/x_train', '/x_val', '/x_test', '/y_train', '/y_test', '/y_val'); which ones exist depends on which data split is used.
The training dataset can be loaded in Python as:
import h5py
import numpy as np

with h5py.File('cifar10_with_scale_variations_tr40000_vl10000_te10000_outsize64-64_scte1p000_scte1p000.h5', 'r') as f:
    x_train = np.array(f["/x_train"], dtype=np.float32)
    x_val = np.array(f["/x_val"], dtype=np.float32)
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_train = np.array(f["/y_train"], dtype=np.int32)
    y_val = np.array(f["/y_val"], dtype=np.int32)
    y_test = np.array(f["/y_test"], dtype=np.int32)
We also need to permute the data, since Pytorch uses the format [num_samples, channels, width, height], while the data is saved as [num_samples, width, height, channels]:
x_train = np.transpose(x_train, (0, 3, 1, 2))
x_val = np.transpose(x_val, (0, 3, 1, 2))
x_test = np.transpose(x_test, (0, 3, 1, 2))
The test datasets can be loaded in Python as:
with h5py.File('cifar10_with_scale_variations_te10000_outsize64-64_scte1p000.h5', 'r') as f:  # or any other test file listed above
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_test = np.array(f["/y_test"], dtype=np.int32)
The test datasets can be loaded in Matlab as:
x_test = h5read('cifar10_with_scale_variations_te10000_outsize64-64_scte1p000.h5', '/x_test');
The images are stored as [num_samples, x_dim, y_dim, channels] in HDF5 files. The pixel intensity values are not normalised, and are in a [0, 255] range.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This dataset brings you the Iris Dataset in several data formats (see more details in the next sections).
You can use it to test the ingestion of data in all these formats using Python or R libraries. We also prepared a Python Jupyter Notebook and an R Markdown report that read all these formats:
Iris Dataset was created by R. A. Fisher and donated by Michael Marshall.
Repository on UCI site: https://archive.ics.uci.edu/ml/datasets/iris
Data Source: https://archive.ics.uci.edu/ml/machine-learning-databases/iris/
The file downloaded is iris.data and is formatted as a comma delimited file.
This small data collection was created to help you test your skills with ingesting various data formats.
This file was processed to convert the data in the following formats:
* csv - comma separated values format
* tsv - tab separated values format
* parquet - parquet format
* feather - feather format
* parquet.gzip - compressed parquet format
* h5 - hdf5 format
* pickle - Python binary object file - pickle format
* xlsx - Excel format
* npy - Numpy (Python library) binary format
* npz - Numpy (Python library) binary compressed format
* rds - Rds (R specific data format) binary format
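As a quick illustration, several of these formats can be ingested with pandas and NumPy along these lines (a hedged sketch; the iris.* file names are assumptions about how the converted files are named):

import numpy as np
import pandas as pd

df_csv = pd.read_csv('iris.csv')
df_tsv = pd.read_csv('iris.tsv', sep='\t')
df_parquet = pd.read_parquet('iris.parquet')
df_feather = pd.read_feather('iris.feather')
df_h5 = pd.read_hdf('iris.h5')      # needs a key argument if the store holds several objects
df_pickle = pd.read_pickle('iris.pickle')
df_xlsx = pd.read_excel('iris.xlsx')
arr = np.load('iris.npy')           # .npz analogously via np.load('iris.npz')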
I would like to acknowledge the work of the creator of the dataset - R. A. Fisher and of the donor - Michael Marshall.
Use these data formats to test your skills in ingesting data in various formats.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Abstract: This code has been used for the numerical experiments in the thesis "On dynamical low-rank integrators for matrix differential equations" by Stefan Schrammer, see https://www.doi.org/10.5445/IR/1000148853.

TechnicalRemarks:

#### Instructions

The scripts inside the subfolders are intended to reproduce the figures from the thesis "On dynamical low-rank integrators for matrix differential equations" by Stefan Schrammer.

We provide two different versions of the code:
- Code_Prom_wo_ref.zip provides the scripts for computing and plotting the data for all numerical experiments.
- Code_Prom_incl_ref.zip additionally provides the reference solutions to all considered problems as hdf5 files.

Requirements: The codes are tested with Ubuntu 20.04.2 LTS and Python 3.8.5 and the following versions of its modules: numpy 1.19.2, scipy 1.5.2, numba 0.51.2, colorama 0.4.4, h5py 2.10.0, matplotlib 3.3.2, tikzplotlib 0.9.6.

Generation of figures (tex files containing the data are also created):

In the folder fracginz, open a console and run the commands
1. to create the data for Figures (7.1) and (7.2): python3 fgl.py
2. to create Figures (7.1) and (7.2): python3 fgl_results.py

In the folder fracschr, open a console and run the commands
3. to create the data for Figure (7.3): python3 fsr.py
4. to create Figure (7.3): python3 fsr_results.py

In the folder laserplasma, open a console and run the commands
5. to create the data for Figure (7.4): python3 lpi_hom.py
6. to create Figure (7.4): python3 lpi_hom_plots.py
7. to create the data for Figures (7.5), (7.6), and (7.7): python3 lpi.py
8. to create Figures (7.5), (7.6), and (7.7): python3 lpi_globalerr.py, python3 lpi_1d_plot, python3 lpi_svals_maxint.py

In the folder sinegordon, open a console and run the commands
9. to create the data for Figures (7.8) and (7.9): python3 sineg.py
10. to create Figures (7.8) and (7.9): python3 sineg_globalerr_ranks.py

If the reference solutions shall be recomputed, uncomment the line
The aim of this dataset is to provide a simple way to get started with 3D computer vision problems such as 3D shape recognition.
Accurate 3D point clouds can (easily and cheaply) be acquired nowadays from different sources:
However there is a lack of large 3D datasets (you can find a good one here based on triangular meshes); it's especially hard to find datasets based on point clouds (which is the raw output from every 3D sensing device).
This dataset contains 3D point clouds generated from the original images of the MNIST dataset, to bring a familiar introduction to 3D to people used to working with 2D datasets (images).
In the 3D_from_2D notebook you can find the code used to generate the dataset.
You can use the code in the notebook to generate a bigger 3D dataset from the original.
The entire dataset is stored as 4096-D vectors obtained from the voxelization (x:16, y:16, z:16) of all the 3D point clouds.
In addition to the original point clouds, it contains randomly rotated copies with noise.
The full dataset is split into arrays:
Example python code reading the full dataset:
import h5py

with h5py.File("../input/train_point_clouds.h5", "r") as hf:
    X_train = hf["X_train"][:]
    y_train = hf["y_train"][:]
    X_test = hf["X_test"][:]
    y_test = hf["y_test"][:]
5000 (train), and 1000 (test) 3D point clouds stored in HDF5 file format. The point clouds have zero mean and a maximum dimension range of 1.
Each file is divided into HDF5 groups
Each group is named as its corresponding array index in the original mnist dataset and it contains:
x, y, z coordinates of each 3D point in the point cloud.
nx, ny, nz components of the unit normal associated with each point.
Example python code reading 2 digits and storing some of the group content in tuples:
with h5py.File("../input/train_point_clouds.h5", "r") as hf:
    a = hf["0"]
    b = hf["1"]
    digit_a = (a["img"][:], a["points"][:], a.attrs["label"])
    digit_b = (b["img"][:], b["points"][:], b.attrs["label"])
Simple Python class that generates a grid of voxels from the 3D point cloud. Check kernel for use.
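As a rough illustration of the idea (a sketch, not the dataset's own class), a 16x16x16 occupancy grid and the corresponding flattened 4096-D vector can be computed with numpy:

import numpy as np

def voxelize(points, n=16):
    # points: [num_points, 3]; the clouds are zero-mean with a maximum range of 1 (see above)
    edges = np.linspace(-0.5, 0.5, n + 1)
    hist, _ = np.histogramdd(points, bins=(edges, edges, edges))
    return hist.reshape(-1)  # flattened 16*16*16 = 4096-D vector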
Module with functions to plot point clouds and voxelgrids inside a Jupyter notebook. You have to run this locally due to the Kaggle notebook's lack of support for rendering IFrames. See the GitHub issue here
Functions included:
array_to_color
Converts a 1D array to RGB values, for use as the color kwarg in plot_points()
plot_points(xyz, colors=None, size=0.1, axis=False)
plot_voxelgrid(v_grid, cmap="Oranges", axis=False)
To better understand the heat production, electricity generation performance, and economic viability of closed-loop geothermal systems in hot-dry rock, the Closed-Loop Geothermal Working Group -- a consortium of several national labs and academic institutions -- has tabulated time-dependent numerical solutions and levelized cost results of two popular closed-loop heat exchanger designs (u-tube and co-axial). The heat exchanger designs were evaluated for two working fluids (water and supercritical CO2) while varying seven continuous independent parameters of interest (mass flow rate, vertical depth, horizontal extent, borehole diameter, formation gradient, formation conductivity, and injection temperature). The corresponding numerical solutions (approximately 1.2 million per heat exchanger design) are stored as multi-dimensional HDF5 datasets and can be queried at off-grid points using multi-dimensional linear interpolation. A Python script was developed to query this database and estimate time-dependent electricity generation using an organic Rankine cycle (for water) or direct turbine expansion cycle (for CO2) and perform a cost assessment. This document aims to give an overview of the HDF5 database file and highlights how to read, visualize, and query quantities of interest (e.g., levelized cost of electricity, levelized cost of heat) using the accompanying Python scripts. Details regarding the capital, operation, and maintenance costs and the levelized cost calculation using the techno-economic analysis script are provided.
This data submission will contain results from the Closed Loop Geothermal Working Group study that are within the public domain, including publications, simulation results, databases, and computer codes.
GeoCLUSTER is a Python-based web application created using Dash, an open-source framework built on top of Flask that streamlines the building of data dashboards. GeoCLUSTER provides users with a collection of interactive methods for streamlining the exploration and visualization of an HDF5 dataset. The GeoCLUSTER app and database are contained in the compressed file geocluster_vx.zip, where the "x" refers to the version number. For example, geocluster_v1.zip is Version 1 of the app. This zip file also contains installation instructions.
To use the GeoCLUSTER app in the cloud, click the link to "GeoCLUSTER on AWS" in the Resources section below. To use the GeoCLUSTER app locally, download the geocluster_vx.zip to your computer and uncompress this file. When uncompressed, this file comprises two directories and the geocluster_installation.pdf file. The geo-data directory contains the HDF5 database in condensed format, and the GeoCLUSTER directory contains the GeoCLUSTER app in the subdirectory dash_app, as app.py. The geocluster_installation.pdf file provides instructions on installing Python and the needed Python modules, and then executing the app.
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset is derived from the ISIC Archive with the following changes:
If the "benign_malignant" column is null and the "diagnosis" column is "vascular lesion", the target is set to null.
DISCLAIMER: I'm not a dermatologist and I'm not affiliated with ISIC in any way. I don't know if my approach to setting the target value is acceptable by the ISIC competition. Use at your own risk.
import os
import multiprocessing as mp
from PIL import Image, ImageOps
import glob
from functools import partial
def list_jpg_files(folder_path):
    # Ensure the folder path ends with a slash
    if not folder_path.endswith('/'):
        folder_path += '/'
    # Use glob to find all .jpg files in the specified folder (non-recursive)
    jpg_files = glob.glob(folder_path + '*.jpg')
    return jpg_files
def resize_image(image_path, destination_folder):
    # Open the image file
    with Image.open(image_path) as img:
        # Get the original dimensions
        original_width, original_height = img.size
        # Calculate the aspect ratio
        aspect_ratio = original_width / original_height
        # Determine the new dimensions based on the aspect ratio
        if aspect_ratio > 1:
            # Width is larger, so we will crop the width
            new_width = int(256 * aspect_ratio)
            new_height = 256
        else:
            # Height is larger, so we will crop the height
            new_width = 256
            new_height = int(256 / aspect_ratio)
        # Resize the image while maintaining the aspect ratio
        img = img.resize((new_width, new_height))
        # Calculate the crop box to center the image
        left = (new_width - 256) / 2
        top = (new_height - 256) / 2
        right = (new_width + 256) / 2
        bottom = (new_height + 256) / 2
        # Crop the image if it results in shrinking
        if new_width > 256 or new_height > 256:
            img = img.crop((left, top, right, bottom))
        else:
            # Add black edges if it results in scaling up
            img = ImageOps.expand(img, border=(int(left), int(top), int(left), int(top)), fill='black')
        # Resize the image to the final dimensions
        img = img.resize((256, 256))
        img.save(os.path.join(destination_folder, os.path.basename(image_path)))
source_folder = ""
destination_folder = ""
images = list_jpg_files(source_folder)
with mp.Pool(processes=12) as pool:
    images = pool.map(partial(resize_image, destination_folder=destination_folder), images)
print("All images resized")
This code will shrink (down-sample) the image if it is larger than 256x256. But if the image is smaller than 256x256, it will add either vertical or horizontal black edges after scaling up the image. In both scenarios, it will keep the center of the input image in the center of the output image.
The HDF5 file is created using the following code:
import os
import pandas as pd
from PIL import Image
import h5py
import io
import numpy as np
# File paths
base_folder = "./isic-2018-task-3-256x256"
csv_file_path = 'train-metadata.csv'
image_folder_path = 'train-image/image'
hdf5_file_path = 'train-image.hdf5'
# Read the CSV file
df = pd.read_csv(os.path.join(base_folder, csv_file_path))
# Open an HDF5 file
with h5py.File(os.path.join(base_folder, hdf5_file_path), 'w') as hdf5_file:
    for index, row in df.iterrows():
        isic_id = row['isic_id']
        image_file_path = os.path.join(base_folder, image_folder_path, f'{isic_id}.jpg')
        if os.path.exists(image_file_path):
            # Open the image file
            with Image.open(image_file_path) as img:
                # Convert the image to a byte buffer
                img_byte_arr = io.BytesIO()
                img.save(img_byte_arr, format=img.format)
                img_byte_arr = img_byte_arr.getvalue()
                hdf5_file.create_dataset(isic_id, data=np.void(img_byte_arr))
        else:
            print(f"Image file for {isic_id} not found.")
print("HDF5 file created successfully.")
To read the hdf5 file, use the following code:
import h5py
from PIL import Image
...
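The truncated snippet follows the same pattern as the reading sketch given for the ISIC 2020 file earlier in this document: open the HDF5 file with h5py, read each byte-string dataset, and decode it with PIL via io.BytesIO.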
The goal of introducing the Rescaled Fashion-MNIST dataset is to provide a dataset that contains scale variations (up to a factor of 4), to evaluate the ability of networks to generalise to scales not present in the training data.
The Rescaled Fashion-MNIST dataset was introduced in the paper:
[1] A. Perzanowski and T. Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, Journal of Mathematical Imaging and Vision, 67(29), https://doi.org/10.1007/s10851-025-01245-x.
with a pre-print available at arXiv:
[2] Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, arXiv preprint arXiv:2409.11140.
Importantly, the Rescaled Fashion-MNIST dataset is more challenging than the MNIST Large Scale dataset, introduced in:
[3] Y. Jansson and T. Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536, https://doi.org/10.1007/s10851-022-01082-2.
The Rescaled Fashion-MNIST dataset is provided on the condition that you provide proper citation for the original Fashion-MNIST dataset:
[4] Xiao, H., Rasul, K., and Vollgraf, R. (2017) “Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms”, arXiv preprint arXiv:1708.07747
and also for this new rescaled version, using the reference [1] above.
The data set is made available on request. If you would be interested in trying out this data set, please make a request in the system below, and we will grant you access as soon as possible.
The Rescaled FashionMNIST dataset is generated by rescaling 28×28 gray-scale images of clothes from the original FashionMNIST dataset [4]. The scale variations are up to a factor of 4, and the images are embedded within black images of size 72x72, with the object in the frame always centred. The imresize() function in Matlab was used for the rescaling, with default anti-aliasing turned on, and bicubic interpolation overshoot removed by clipping to the [0, 255] range. The details of how the dataset was created can be found in [1].
There are 10 different classes in the dataset: “T-shirt/top”, “trouser”, “pullover”, “dress”, “coat”, “sandal”, “shirt”, “sneaker”, “bag” and “ankle boot”. In the dataset, these are represented by integer labels in the range [0, 9].
The dataset is split into 50 000 training samples, 10 000 validation samples and 10 000 testing samples. The training dataset is generated using the initial 50 000 samples from the original Fashion-MNIST training set. The validation dataset, on the other hand, is formed from the final 10 000 images of that same training set. For testing, all test datasets are built from the 10 000 images contained in the original Fashion-MNIST test set.
The training dataset file (~2.9 GB) for scale 1, which also contains the corresponding validation and test data for the same scale, is:
fashionmnist_with_scale_variations_tr50000_vl10000_te10000_outsize72-72_scte1p000_scte1p000.h5
Additionally, for the Rescaled FashionMNIST dataset, there are 9 datasets (~415 MB each) for testing scale generalisation at scales not present in the training set. Each of these datasets is rescaled using a different image scaling factor, 2^(k/4), with k being integers in the range [-4, 4]:
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p500.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p595.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p707.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p841.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p000.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p189.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p414.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p682.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte2p000.h5
These dataset files were used for the experiments presented in Figures 6, 7, 14, 16, 19 and 23 in [1].
The datasets are saved in HDF5 format, with the partitions in the respective h5 files named as
('/x_train', '/x_val', '/x_test', '/y_train', '/y_test', '/y_val'); which ones exist depends on which data split is used.
The training dataset can be loaded in Python as:
import h5py
import numpy as np

with h5py.File('fashionmnist_with_scale_variations_tr50000_vl10000_te10000_outsize72-72_scte1p000_scte1p000.h5', 'r') as f:
    x_train = np.array(f["/x_train"], dtype=np.float32)
    x_val = np.array(f["/x_val"], dtype=np.float32)
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_train = np.array(f["/y_train"], dtype=np.int32)
    y_val = np.array(f["/y_val"], dtype=np.int32)
    y_test = np.array(f["/y_test"], dtype=np.int32)
We also need to permute the data, since Pytorch uses the format [num_samples, channels, width, height], while the data is saved as [num_samples, width, height, channels]:
x_train = np.transpose(x_train, (0, 3, 1, 2))
x_val = np.transpose(x_val, (0, 3, 1, 2))
x_test = np.transpose(x_test, (0, 3, 1, 2))
The test datasets can be loaded in Python as:
with h5py.File('fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p000.h5', 'r') as f:  # or any other test file listed above
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_test = np.array(f["/y_test"], dtype=np.int32)
The test datasets can be loaded in Matlab as:
x_test = h5read('fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p000.h5', '/x_test');
The images are stored as [num_samples, x_dim, y_dim, channels] in HDF5 files. The pixel intensity values are not normalised, and are in a [0, 255] range.
There is also a closely related Fashion-MNIST with translations dataset, which in addition to scaling variations also comprises spatial translations of the objects.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contains a zip file which includes the h5 file needed to reproduce the figures for arXiv:2507.02458, sample Python codes to generate the figures, and signal injection codes to generate the TD signal and PSD.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contents:
bee_yazeed-20231001T170032.h5 - SXCT scan of a wasp performed at beamline ID10-BEATS of SESAME.
SESAME_wasp_yazeed.avi - 3D video rendering of the phase-contrast CT reconstruction of bee_yazeed-20231001T170032. The dataset was reconstructed using alrecon. The video was created using ORS Dragonfly.

H5 dataset information: Raw experimental data (sinogram, flat fields and dark fields) and metadata are stored in a common .H5 file. The HDF5 file is organized hierarchically following the Scientific Data Exchange (DXfile) community standard.

How to reconstruct:
You can use Silx to read and explore the .H5 dataset. The file can be read within Python using the DXChange package. See the ID10-BEATS beamline user guide for a detailed description on how to process and reconstruct the scan.
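For a first look in Python, the standard Data Exchange layout suggests something along these lines (a hedged sketch; the /exchange/* paths follow the DXfile convention referenced above, not a verified listing of this particular file):

import h5py

with h5py.File('bee_yazeed-20231001T170032.h5', 'r') as f:
    proj = f['/exchange/data']        # projections
    flat = f['/exchange/data_white']  # flat fields
    dark = f['/exchange/data_dark']   # dark fields
    print(proj.shape, flat.shape, dark.shape)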
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Abstract: This dataset contains the DMFT/QMC results for the example of the two-orbital Hubbard model shown in the article "Thermodynamic Stability at the Two-Particle Level". It contains parameters, one-particle Green's functions, and observables in w2dynamics output format, as well as patches for relevant functionality not contained in current versions of w2dynamics at the time of publication, and scripts used for post-processing of the data and the creation of some of the graphs. For size reasons, the data files containing the corresponding two-particle Green's functions are split into multiple subdatasets whose identifiers are listed above.

TechnicalRemarks: The data files are contained in directories named beta35 and beta50 for the inverse temperature used in the respective calculations, with files containing the two-particle Green's functions contained in the subdatasets listed above and indicated by names containing 'G2'. All calculations were performed for two-orbital Hubbard models on a Bethe lattice with density-density interaction with fixed ratios between the interaction coefficients. The individual file names contain the inverse temperature, e.g. 'b35' for beta=35, the Hubbard-U interaction strength, e.g. 'U1.44' for U=1.44, and usually the chemical potential μ, e.g. 'mu1.26000' for μ=1.26. The file name segment 'ma...' present in some file names redundantly gives the difference of the used chemical potential from that necessary for half-filling.

In the coexistence region, the phase of the solution depends on the procedure, which is indicated by the name segment 'upward' / 'downward' / 'instable' (also sometimes shortened to just the initial letter), indicating the insulating or strongly correlated metallic phase, the weakly correlated metallic phase, and the unstable phase, respectively. For some of the files containing unstable solutions, the targeted value of the quasiparticle weight Z, calculated from the self-energy value at the first Matsubara frequency, is given in the 'Ztarget...' segment instead of an approximate value of the chemical potential (which is not preset as a fixed parameter for calculating unstable solutions). File names of files containing two-particle Green's functions additionally contain 's...', indicating separate calculations differing only in the used PRNG seed, which allow further statistical post-processing beyond that done automatically by w2dynamics.

n(mu) plots as shown in Figs. 2 and 3 of the article can be created using the script 'kappa_2band_create_mu_n_plot.py' by calling it with the appropriate arguments, e.g. using a command like

python kappa_2band_create_mu_n_plot.py -r "kappa_2band_bethe_dens_b35_U([0-9.]*)_([muZtarget0-9.]*).*hdf.*" --axisgroup 1 -k '$U/D = {grp[0]}$' --imsiwsort --nmin 2.0 --nmax 2.08 --mumin 0.0 --mumax 0.15 --nmu --onecolsize *.hdf5.zst

in the beta35 directory to create a plot like in Fig. 2, and

python kappa_2band_create_mu_n_plot.py -r "kappa_2band_bethe_dens_b50_U([0-9.]*)_([muZtarget0-9.]*).*hdf.*" --axisgroup 1 -k '$U/D = {grp[0]}$' --imsiwsort --nmin 2.0 --nmax 2.14 --mumin 0.0 --mumax 0.22 --nmu --onecolsize *.hdf5.zst

in the beta50 directory to create a plot like in Fig. 3.

The script 'chi_d_orblt_diagonalize.py' can be used to compute and diagonalize the generalized susceptibility by passing a data file with the one-particle Green's function as the argument after '--onepfile' and one with the corresponding two-particle Green's function after '--twopfile'. From the created .npz files, a plot like in Fig. 1 of the supplemental material can be created using the script 'chi_eigenbasis_multi_barcontribs.py' by calling it with the appropriate arguments, e.g.

python chi_eigenbasis_multi_barcontribs.py --force-centrosymm-contribs --onecolsize --bargraph 2 --beta 50 --hopping 0.5 --contrib real --barorder contrib kappa_2band_bethe_dens_b50_U1.4910_mu1.4924_u_chi_orblt.npz kappa_2band_bethe_dens_b50_U1.4915_mu1.4937_u_chi_orblt.npz kappa_2band_bethe_dens_b50_U1.4920_mu1.49510_u_chi_orblt.npz kappa_2band_bethe_dens_b50_U1.4930_mu1.49780_u_chi_orblt.npz kappa_2band_bethe_dens_b50_U1.50_mu1.51740_u_chi_orblt.npz --tickstrings '$U/D = 1.4910$' '$U/D = 1.4915$' '$U/D = 1.4920$' '$U/D = 1.4930$' '$U/D = 1.5000$'

to create a similar plot showing the same data, after the listed .npz files with the generalized susceptibility data have been created.

Patches in the patch directory can be applied to w2dynamics 1.1.5 as published on GitHub to add functionality that allows performing calculations converging toward unstable solutions like those contained in this data set. This information is also contained in the markdown-formatted file README.md contained in the datasets.

Other: We are grateful for funding support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy through the Würzburg-Dresden Cluster of Excellence on Complexity and Topology in Quantum Matter ct.qmat (EXC 2147, Project ID 390858490), as well as through the Collaborative Research Center SFB 1170 ToCoTronics (Project ID 258499086).
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This deposition contains the results from a simulation of reconstructions of undersampled atomic force microscopy (AFM) images. The reconstructions were obtained using weighted iterative thresholding compressed sensing algorithms.
The deposition consists of:
The HDF5 database is licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/) . Since the CC BY 4.0 license is not well suited for source code, the Python script is licensed under the BSD 2-Clause license (http://opensource.org/licenses/BSD-2-Clause) .
The files are provided as-is with no warranty, as detailed in the above-mentioned licenses.
The database is split into ten parts:
These ten parts must be concatenated before the database can be extracted from the tar.xz archive. On Unix-like systems this may be done using:
$ cat weighted_it_reconstructions.hdf5.tar.xz.part-* > weighted_it_reconstructions.hdf5.tar.xz
after which the archive may be extracted, e.g., using:
$ tar xfJ weighted_it_reconstructions.hdf5.tar.xz
WARNING: The extracted HDF5 database has a size of 114 GiB.
The simulation results in the database are based on "Atomic Force Microscopy Images of Cell Specimens" and "Atomic Force Microscopy Images of Various Specimens" by Christian Rankl licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). The original images are available at http://dx.doi.org/10.5281/zenodo.17573 and http://dx.doi.org/10.5281/zenodo.60434. The original images are provided as-is without warranty of any kind. Both the original images as well as adapted images are part of the dataset.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This kit includes an additional revised master file, lyso009a_0087.JF07T32V01_master_rev.h5 that provides compliance with the October 2019 NXmx specification as proposed in https://github.com/HDRMX/definitions.git
To create a new NeXus master file, assuming DIALS is installed in the folder $DIALS, use this command:
libtbx.python $DIALS/modules/cctbx_project/xfel/swissfel/jf16m_cxigeom2nexus.py unassembled_file=lyso009a_0087.JF07T32V01.h5 geom_file=16M_bernina_backview_optimized_adu_quads.geom wavelength=1.368479 detector_distance=97.830 mask_file=lyso009a_0087.JF07T32V01.mask.h5
The geometry file is in CrystFEL format but has been realigned to group the modules hierarchically into quadrants.
View the data using DIALS: dials.image_viewer lyso009a_0087.JF07T32V01_master.h5
Process the data using DIALS, treating the images as stills, assuming 64 cores available on the system: dials.stills_process mp.nproc=64 lyso009a_0087.JF07T32V01_master.h5 dispersion.gain=10 known_symmetry.space_group=P43212 known_symmetry.unit_cell=77,77,37,90,90,90 refinement_protocol.d_min_start=2.5
Download DIALS at dials.github.io.
After the DIALS run, for full NXmx compliance you will need the jungfrau portions of the script that was used to generate lyso009a_0087.JF07T32V01_master_rev.h5
cp Therm_6_2.nxs Therm_6_2_rev.nxs
cp Therm_6_2_master.h5 Therm_6_2_master_rev.h5
cp jungfrau/lyso009a_0087.JF07T32V01_master.h5 jungfrau/lyso009a_0087.JF07T32V01_master_rev.h5
export curdat=$(date +%FT%T.%3N)
export LD_LIBRARY_PATH=$HOME/lib
export HDF5_PLUGIN_PATH=$HOME/lib
export PATH=$HOME/bin:$PATH
h5copy -i Therm_6_2_rev.nxs -o Therm_6_2_master_rev.h5 -s /entry/instrument/name -d /entry/instrument/name -f ref
h5copy -i Therm_6_2_rev.nxs -o Therm_6_2_master_rev.h5 -s /entry/instrument/source -d /entry/source -f ref
h5copy -i Therm_6_2_rev.nxs -o Therm_6_2_rev.nxs -s /entry/instrument/source -d /entry/source -f ref
h5copy -i jungfrau/lyso009a_0087.JF07T32V01_master.h5 -o jungfrau/lyso009a_0087.JF07T32V01_master_rev.h5 -s /entry/sample/beam -d /entry/instrument/beam -f ref
export end_time=$(h5dump -d "/entry/end_time" Therm_6_2_master.h5 | grep ":" | sed 's/^.........//' | sed 's/.$//')
echo "end_time: $end_time"
python << 'EOL'
import h5py as h5
import numpy as np
import os
end_time=os.environ['end_time']
curdat=os.environ['curdat']
fvds = h5.File('Therm_6_2_rev.nxs','r+')
fmaster = h5.File('Therm_6_2_master_rev.h5','r+')
jungfrau= h5.File('jungfrau/lyso009a_0087.JF07T32V01_master_rev.h5','r+')
fvds_keys=fvds.keys()
fmaster_keys=fmaster.keys()
jungfrau_keys=jungfrau.keys()
fvds_entry=fvds['entry']
fmaster_entry=fmaster['entry']
jungfrau_entry=jungfrau['entry']
fvds_entry_keys=fvds_entry.keys()
fmaster_entry_keys=fmaster_entry.keys()
jungfrau_entry_keys=jungfrau_entry.keys()
fvds_entry_instrument=fvds['entry']['instrument']
fmaster_entry_instrument=fmaster['entry']['instrument']
jungfrau_entry_instrument=jungfrau['entry']['instrument']
fvds_entry_instrument_keys=fvds_entry_instrument.keys()
fmaster_entry_instrument_keys=fmaster_entry_instrument.keys()
jungfrau_entry_instrument_keys=jungfrau_entry_instrument.keys()
fvds_entry_instrument_name=(fvds['entry']['instrument']['name'])
fmaster_entry_instrument_name=(fmaster['entry']['instrument']['name'])
jungfrau['entry']['instrument'].create_dataset("name", data=np.string_("Paul Scherrer Institute SwissFEL Aramis 1 (Alvra)"))
jungfrau_entry_instrument_name=(jungfrau['entry']['instrument']['name'])
fvds_entry_instrument_short_name=fvds_entry_instrument.attrs['short_name']
fmaster_entry_instrument_short_name=fmaster_entry_instrument.attrs['short_name']
jungfrau_entry_instrument_name.attrs.modify('short_name',np.string_("Alvra"))
jungfrau_entry_instrument_short_name=jungfrau_entry_instrument_name.attrs['short_name']
zero_offset=fmaster_entry_instrument['detector']['module']['fast_pixel_direction'].attrs['offset']
fmaster_det_z=fmaster_entry_instrument['transformations']['det_z']
fvds_det_z=fvds_entry_instrument['transformations']['det_z']
print('fvds_keys: ',fvds_keys)
print('fmaster_keys: ',fmaster_keys)
print('jungfrau_keys: ',jungfrau_keys)
print('fvds_entry_keys: ',fvds_entry_keys)
print('fmaster_entry_keys: ',fmaster_entry_keys)
print('jungfrau_entry_keys: ',jungfrau_entry_keys)
print('fvds_entry_instrument_keys: ',fvds_entry_instrument_keys)
print('fmaster_entry_instrument_keys: ',fmaster_entry_instrument_keys)
print('jungfrau_entry_instrument_keys: ',jungfrau_entry_instrument_keys)
print('fvds_entry_instrument_name: ',fvds_entry_instrument_name)
print('fmaster_entry_instrument_name: ',fmaster_entry_instrument_name)
print('jungfrau_entry_instrument_name: ',jungfrau_entry_instrument_name)
print('fvds_entry_instrument_short_name: ',fvds_entry_instrument_short_name)
print('fmaster_entry_instrument_short_name: ',fmaster_entry_instrument_short_name)
print('jungfrau_entry_instrument_short_name: ',jungfrau_entry_instrument_short_name)
print('fmaster_entry_instrument_detector_module_fast_pixel_direction_offset: ',zero_offset)
print('fmaster_entry_instrument_detector_detector_z_det_z: ',fmaster_det_z)
print('fmaster_entry_end_time: ',end_time)
fmaster.attrs.modify('file_time',np.string_(end_time))
fmaster.attrs.modify('file_name',np.string_('Therm_6_2_master_rev.h5'))
fmaster.attrs.modify('HDF5_Version',np.string_('hdf5-1.8.18'))
fvds.attrs.modify('file_time',np.string_(end_time))
fvds.attrs.modify('file_name',np.string_('Therm_6_2_master_rev.h5'))
fvds.attrs.modify('HDF5_Version',np.string_('hdf5-1.10.5'))
jungfrau.attrs.modify('file_time',np.string_(curdat))
jungfrau.attrs.modify('file_name',np.string_('lyso009a_0087.JF07T32V01_master.h5'))
jungfrau.attrs.modify('HDF5_Version',np.string_('hdf5-1.10.5'))
fvds_entry_instrument_name.attrs.modify('short_name',np.string_(fvds_entry_instrument.attrs['short_name']))
fmaster_entry_instrument_name.attrs.modify('short_name',np.string_(fmaster_entry_instrument.attrs['short_name']))
fmaster_entry_instrument['attenuator']['attenuator_transmission'].attrs.modify('units',np.string_(""))
fmaster_entry_instrument['detector']['count_time'].attrs.modify('units',np.string_("s"))
fvds_entry_instrument_name.attrs.modify('short_name',np.string_(fvds_entry_instrument.attrs['short_name']))
fvds_entry_instrument['attenuator']['attenuator_transmission'].attrs.modify('units',np.string_(""))
fvds_entry_instrument['detector']['count_time'].attrs.modify('units',np.string_("s"))
fmaster_det_z.attrs.modify('offset',zero_offset)
fvds_det_z.attrs.modify('offset',zero_offset)
fmaster_entry['sample']['transformations']['phi'].attrs.modify('offset',zero_offset)
fmaster_entry['sample']['transformations']['chi'].attrs.modify('offset',zero_offset)
fmaster_entry['sample']['transformations']['sam_x'].attrs.modify('offset',zero_offset)
fmaster_entry['sample']['transformations']['sam_y'].attrs.modify('offset',zero_offset)
fmaster_entry['sample']['transformations']['sam_z'].attrs.modify('offset',zero_offset)
fmaster_entry['sample']['transformations']['omega'].attrs.modify('offset',zero_offset)
fvds_entry['sample']['transformations']['phi'].attrs.modify('offset',zero_offset)
fvds_entry['sample']['transformations']['chi'].attrs.modify('offset',zero_offset)
fvds_entry['sample']['transformations']['sam_x'].attrs.modify('offset',zero_offset)
fvds_entry['sample']['transformations']['sam_y'].attrs.modify('offset',zero_offset)
fvds_entry['sample']['transformations']['sam_z'].attrs.modify('offset',zero_offset)
fvds_entry['sample']['transformations']['omega'].attrs.modify('offset',zero_offset)
print(fmaster['entry']['instrument']['name'].attrs['short_name'])
print(fmaster['entry']['instrument']['name'].attrs['short_name'].shape)
print(fmaster['entry']['instrument']['name'].attrs['short_name'].dtype)
print("/entry/instrument/ELE_D0/pixel_mask_applied :",jungfrau_entry_instrument['ELE_D0']['pixel_mask_applied'])
del jungfrau_entry_instrument['ELE_D0']['pixel_mask_applied']
jungfrau_entry_instrument['ELE_D0'].create_dataset("pixel_mask_applied",dtype='int8', data=1)
print("/entry/instrument/ELE_D0/pixel_mask_applied :",jungfrau_entry_instrument['ELE_D0']['pixel_mask_applied'])
jungfrau_entry_source=jungfrau_entry.create_group('source')
jungfrau_entry_source=jungfrau_entry['source']
jungfrau_entry_source.attrs.modify('NX_class',np.string_("NXsource"))
jungfrau_entry_source.create_dataset("name",data=np.string_("Paul Scherrer Institute SwissFEL"))
jungfrau_entry_source['name'].attrs.modify('short_name',np.string_("SwissFEL"))
jungfrau_entry_instrument['beam'].create_dataset('total_flux',dtype='float64',data=1000000000000.)
jungfrau_entry_instrument['beam']['total_flux'].attrs.modify('units',np.string_('/pulse'))
del jungfrau_entry['sample']['beam']
del fvds_entry_instrument.attrs['short_name']
del fmaster_entry_instrument.attrs['short_name']
del fmaster_entry_instrument['source']
fvds.close()
fmaster.close()
jungfrau.close()
quit()
EOL
$HOME/bin/nxvalidate -a NXmx -l /home/yaya/hdrmx_rev_29Sep19/hdrmx/definitions Therm_6_2_master_rev.h5
$HOME/bin/nxvalidate -a NXmx -l /home/yaya/hdrmx_rev_29Sep19/hdrmx/definitions Therm_6_2_rev.nxs
$HOME/bin/nxvalidate -a NXmx -l /home/yaya/hdrmx_rev_29Sep19/hdrmx/definitions jungfrau/lyso009a_0087.JF07T32V01_master_rev.h5
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
- UED_RAW_sorted: During the experiment, we scanned along the delay stage to vary the time delay between the pump and the probe. Each h5 file contains two diffraction patterns (pump on and pump off) of one second of exposure at a given delay. We merged the h5 files with the same delay with the home-made software 'data_explorer' in https://github.com/remiclaude/UED_interface and extracted a pickle file containing the diffraction patterns along the delay for imgON (with the pump) and imgOFF (without the pump), together with the metadata.
- UED_PROCESSED: The Jupyter notebook 'treat_pickle.ipynb' in https://github.com/remiclaude/UED_processing uses the pickle files in the folder 'RAW_sorted' and processes them: it removes hot pixels, shifts images to keep the unscattered beam at the same position, and averages the diffraction map along the symmetry axis.
The Python files used to process the data and create the figures shown in the publication are listed in the section "Related work".
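A merged pickle file can then be loaded roughly as follows (a hedged sketch; the file name is hypothetical, and the imgON/imgOFF keys follow the description above rather than a verified schema):

import pickle

with open('delay_scan.pickle', 'rb') as f:  # hypothetical file name
    data = pickle.load(f)
print(type(data))  # expected to hold imgON/imgOFF diffraction patterns per delay plus metadata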