Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a subsampled version of the STEAD dataset, specifically tailored for training our CDiffSD model (Cold Diffusion for Seismic Denoising). It consists of four HDF5 files, which can be opened with Python's `h5py` library.
The dataset includes the following files:
Each file is structured to support the training and evaluation of seismic denoising models.
The HDF5 files with `noise` in their name contain two main datasets, `traces` and `metadata`.
Similarly, the train and test files, which contain earthquake data, include the same traces and metadata datasets, but also feature two additional datasets:
To load these files in a Python environment, use the following approach:
```python
import h5py
import numpy as np

# Open the HDF5 file in read mode
with h5py.File('train_noise.hdf5', 'r') as file:
    # Print all the main keys in the file
    print("Keys in the HDF5 file:", list(file.keys()))

    if 'traces' in file:
        # Access the dataset
        data = file['traces'][:10]  # Load the first 10 traces

    if 'metadata' in file:
        # Access the dataset
        trace_name = file['metadata'][:10]  # Load the first 10 metadata entries
```
Ensure that the path to the file is correctly specified relative to your Python script.
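One way to make the path independent of the current working directory is to build it from a known base folder. The sketch below uses only the standard library; the helper name `dataset_path` is hypothetical and not part of the dataset or of `h5py`.

```python
from pathlib import Path


def dataset_path(filename, base_dir=None):
    """Return an absolute path to `filename`, resolved against `base_dir`.

    `dataset_path` is an illustrative helper, not part of the dataset.
    If `base_dir` is omitted, the current working directory is used.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / filename).resolve()


# Inside a script, pass the script's own folder so the result does not
# depend on where Python was started from:
#     path = dataset_path('train_noise.hdf5', Path(__file__).parent)
path = dataset_path('train_noise.hdf5')
if not path.is_file():
    print(f"File not found: {path}")
```

Checking `is_file()` before calling `h5py.File` gives a clearer message than the error raised when opening a missing file.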
To use this dataset, ensure you have Python installed along with the NumPy and h5py libraries, which can be installed via pip if not already available:
```bash
pip install numpy
pip install h5py
```
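Once both libraries are installed, a quick way to check what a file contains is to walk it and list every dataset with its shape and dtype. The following is a minimal sketch that runs against a small in-memory file so it works without the real data; the shapes and contents of the stand-in `traces` and `metadata` datasets are invented for illustration and do not reflect the actual files, and `describe` is a hypothetical helper.

```python
import h5py
import numpy as np


def describe(h5_file):
    """Collect (name, shape, dtype) for every dataset in an open HDF5 file."""
    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset in the file
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    h5_file.visititems(visit)
    return found


# In-memory stand-in file; nothing is written to disk.
# Shapes below are assumptions made only for this demonstration.
with h5py.File('demo.hdf5', 'w', driver='core', backing_store=False) as f:
    f.create_dataset('traces', data=np.zeros((4, 3, 100), dtype=np.float32))
    f.create_dataset('metadata', data=np.arange(4))
    layout = describe(f)

for name, shape, dtype in layout:
    print(name, shape, dtype)
```

With the real data, the same `describe` call can be applied to a file opened in read mode, for example `h5py.File('train_noise.hdf5', 'r')`.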
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A curated list of preprocessed Human Activity Recognition datasets, ready to use in under a minute.
All the datasets are preprocessed into HDF5 format, created using the h5py Python library. The scripts used for data preprocessing are provided as well (Load.ipynb and load_jordao.py).
Each HDF5 file contains at least the following keys:
- `x`: a single array of size [sample count, temporal length, sensor channel count] containing the actual sensor data. Its metadata contains the names of the individual sensor channels. All samples are zero-padded to a constant length within the file; the original lengths before padding are available under the `meta` keys.
- `y`: a single array of size [sample count] with integer values for the target classes (zero-based). Its metadata contains the names of the target classes.
- `meta`: various metadata, depending on the dataset (original length before padding, subject no., trial no., etc.).

Usage example:
```python
import h5py

with h5py.File('data/waveglove_multi.h5', 'r') as h5f:
    x = h5f['x']
    y = h5f['y']['class']
    print(f'WaveGlove-multi: {x.shape[0]} samples')
    print(f'Sensor channels: {h5f["x"].attrs["channels"]}')
    print(f'Target classes: {h5f["y"].attrs["labels"]}')
    first_sample = x[0]

# Output:
# WaveGlove-multi: 10044 samples
# Sensor channels: ['acc1-x' 'acc1-y' 'acc1-z' 'gyro1-x' 'gyro1-y' 'gyro1-z' 'acc2-x'
#  'acc2-y' 'acc2-z' 'gyro2-x' 'gyro2-y' 'gyro2-z' 'acc3-x' 'acc3-y'
#  'acc3-z' 'gyro3-x' 'gyro3-y' 'gyro3-z' 'acc4-x' 'acc4-y' 'acc4-z'
#  'gyro4-x' 'gyro4-y' 'gyro4-z' 'acc5-x' 'acc5-y' 'acc5-z' 'gyro5-x'
#  'gyro5-y' 'gyro5-z']
# Target classes: ['null' 'hand swipe left' 'hand swipe right' 'pinch in' 'pinch out'
#  'thumb double tap' 'grab' 'ungrab' 'page flip' 'peace' 'metal']
```
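Because every sample is zero-padded to a common length, it can be useful to cut each one back to its original length before further processing. The sketch below shows the idea on synthetic NumPy arrays. With the real files, `x` would come from `h5f['x']` and the lengths from the `meta` keys, whose exact name is not given here and would need to be looked up per dataset. The helper `trim_padding` is hypothetical.

```python
import numpy as np


def trim_padding(x, lengths):
    """Return a list of arrays, one per sample, with trailing padding removed.

    x       : array of shape [sample count, temporal length, channel count]
    lengths : original length of each sample before padding
    """
    return [sample[:n] for sample, n in zip(x, lengths)]


# Synthetic stand-in: 3 samples padded to 5 time steps, 2 sensor channels.
x = np.zeros((3, 5, 2), dtype=np.float32)
lengths = np.array([5, 3, 2])  # assumed original lengths, for illustration
for i, n in enumerate(lengths):
    x[i, :n] = 1.0  # mark the "real" (unpadded) part of each sample

trimmed = trim_padding(x, lengths)
print([t.shape for t in trimmed])
# Output:
# [(5, 2), (3, 2), (2, 2)]
```

The result is a list rather than a single array, since the trimmed samples no longer share one temporal length.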
Current list of datasets: