Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A curated list of preprocessed Human Activity Recognition datasets that are ready to use in under a minute.
All datasets are preprocessed into HDF5 format using the h5py Python library. The scripts used for preprocessing are provided as well (Load.ipynb and load_jordao.py).
Each HDF5 file contains at least the keys:
x: a single array of size [sample count, temporal length, sensor channel count] containing the actual sensor data. Its metadata contains the names of the individual sensor channels. All samples are zero-padded to a constant length within the file; the original lengths before padding are available under the meta keys.
y: a single array of size [sample count] with integer values for the target classes (zero-based). Its metadata contains the names of the target classes.
meta: contains various metadata, depending on the dataset (original length before padding, subject no., trial no., etc.)
Usage example
import h5py
with h5py.File('data/waveglove_multi.h5', 'r') as h5f:
    x = h5f['x']
    y = h5f['y']['class']
    print(f'WaveGlove-multi: {x.shape[0]} samples')
    print(f'Sensor channels: {h5f["x"].attrs["channels"]}')
    print(f'Target classes: {h5f["y"].attrs["labels"]}')
    first_sample = x[0]
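Because all samples are zero-padded to a constant length, it is often useful to cut each sample back to its original length before further processing. The following is a minimal sketch on synthetic data; the `lengths` array is a hypothetical stand-in for the original lengths stored under the meta keys, whose exact key name depends on the dataset and is not assumed here.

```python
import numpy as np

def unpad(x, lengths):
    """Return a list of samples with the trailing zero-padding removed."""
    return [sample[:n] for sample, n in zip(x, lengths)]

# Synthetic stand-in for the 'x' array: 3 samples, 5 time steps, 2 channels.
x = np.zeros((3, 5, 2))
# Hypothetical original lengths (in a real file these come from the meta keys).
lengths = np.array([5, 3, 4])
for i, n in enumerate(lengths):
    x[i, :n] = 1.0  # mark the non-padded part of each sample

samples = unpad(x, lengths)
shapes = [s.shape for s in samples]
print(shapes)
```

The result is a list rather than an array, since the unpadded samples no longer share a common temporal length.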
Current list of datasets:
WaveGlove-single (waveglove_single.h5)
WaveGlove-multi (waveglove_multi.h5)
uWave (uwave.h5)
OPPORTUNITY (opportunity.h5)
PAMAP2 (pamap2.h5)
SKODA (skoda.h5)
MHEALTH (non-overlapping windows) (mhealth.h5)
Six datasets with all four predefined train/test folds as preprocessed by Jordao et al. originally in WearableSensorData (FNOW, LOSO, LOTO and SNOW prefixed .h5 files)
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains 2D and 3D complex field sinograms for optical diffraction tomography, created with finite-difference time-domain simulations using Meep (http://ab-initio.mit.edu/wiki/index.php?title=Meep). The entire complex electric field data from the simulations would have amounted to more than 2 TB, which is not easy to share and also contains a lot of information that is redundant and uninteresting (for tomography, at least). Most of these data were used in the ODTbrain paper (https://dx.doi.org/10.1186/s12859-015-0764-0).
Data Structure
Each dataset is an HDF5 file (https://www.hdfgroup.org) that contains the simulation structure (the cell phantom), a background field (simulation without phantom), and a field for each rotational position of the phantom (sinogram). The fields are slices through the complex electric field in the original simulation volume behind the phantom at the end of the simulation (supposedly steady state). The slice position is written as the HDF5 attribute “extraction focus distance [px]”. The slice position is important for the reconstruction, because the fields must be numerically refocused to the center of the simulation volume before reconstruction. The perfectly matched layer (PML) has already been cropped from the fields. Alongside each field, the source code of the Meep simulation and the standard output of the compiled simulation are stored. You can also find the simulation templates in the ODTbrain repository at https://github.com/RI-imaging/ODTbrain/tree/master/misc. I recommend exploring the files using HDFView (https://www.hdfgroup.org/downloads/hdfview/).
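To illustrate what numerical refocusing of a field slice means, here is a minimal sketch of the angular spectrum method for a 1D complex field (as in the 2D simulations). It is not the code used by ODTbrain or the supplied reconstruction scripts. The function name, the sign convention for the propagation direction, and the interpretation of the wavelength as the vacuum wavelength in pixels are assumptions.

```python
import numpy as np

def refocus_1d(field, distance_px, wavelength_px, n_medium):
    """Propagate a 1D complex field by distance_px (angular spectrum method)."""
    field = np.asarray(field, dtype=complex)
    # Wave number in the medium [rad/px]; wavelength_px is assumed to be
    # the vacuum wavelength in pixels.
    k_m = 2 * np.pi * n_medium / wavelength_px
    # Lateral spatial frequencies [rad/px].
    k_x = 2 * np.pi * np.fft.fftfreq(field.size)
    # Complex square root keeps evanescent components (|k_x| > k_m) defined.
    k_z = np.sqrt((k_m**2 - k_x**2).astype(complex))
    return np.fft.ifft(np.fft.fft(field) * np.exp(1j * k_z * distance_px))

# A plane wave at normal incidence only picks up a constant phase factor.
plane = np.ones(64, dtype=complex)
shifted = refocus_1d(plane, distance_px=10, wavelength_px=13, n_medium=1.333)

# Refocusing forward and then backward recovers the original field.
grid = np.arange(64)
bump = np.exp(-((grid - 32) ** 2) / 50.0).astype(complex)
there = refocus_1d(bump, 2, 13, 1.333)
back = refocus_1d(there, -2, 13, 1.333)
```

Whether the extraction focus distance has to be applied with a positive or a negative sign depends on the propagation direction and the Fourier convention, so it should be checked against the supplied reconstruction scripts.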
Naming Scheme
I adopted the naming scheme of the original simulations.
- The first part of the file name determines the dimension of the simulation. The larger “phantom_3d” files contain the 3D simulation sinograms.
- “A” is the total number of angles for which simulations were performed.
- “R” is the resolution (number of pixels per wavelength).
- “T” is the total number of simulation steps performed.
- “Nmed” is the refractive index (RI) of the medium surrounding the cell phantom.
- “Ncyt” is the RI of the phantom’s cytoplasm.
- “Nnuc” is the RI of the phantom’s nucleus.
- “Nleo” is the RI of the phantom’s nucleolus.
The final part of the file name indicates to which type of study the simulation belongs:
- “angles”: varying the total number of acquisition angles
- “step-count”: varying the total number of time steps
- “refractive-index”: varying the internal RI values of the cell phantom
- “size”: varying the size of the phantom
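The naming scheme can be turned into a small parser. The sketch below assumes that the tokens are separated by underscores and that each key is directly followed by its value; the example file name is hypothetical and only illustrates the scheme, so the pattern may need adjusting for the actual files.

```python
import re

# Hypothetical file name that follows the naming scheme described above.
name = ("phantom_3d_A0200_R13_T00015000_Nmed1.333_Ncyt1.365"
        "_Nnuc1.360_Nleo1.387_angles.h5")

def parse_name(name):
    """Extract the simulation parameters encoded in a file name."""
    stem = name[:-3] if name.endswith(".h5") else name
    # Key/value tokens such as "_A0200" or "_Nmed1.333".
    tokens = dict(re.findall(r"_(A|R|T|Nmed|Ncyt|Nnuc|Nleo)([0-9.]+)", stem))
    info = {
        "dimension": re.match(r"phantom_(\dd)", stem).group(1),
        "study": stem.rsplit("_", 1)[1],
        "angles": int(tokens["A"]),
        "resolution": float(tokens["R"]),
        "steps": int(tokens["T"]),
    }
    # Refractive indices of medium, cytoplasm, nucleus, and nucleolus.
    for key in ("Nmed", "Ncyt", "Nnuc", "Nleo"):
        info[key] = float(tokens[key])
    return info

info = parse_name(name)
print(info)
```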
Getting Started
I added two Python scripts “recon_2d.py” and “recon_3d.py” (tested with Python 3.9 on Ubuntu 22.04) that will allow you to obtain RI reconstructions from the 2D and 3D sinograms. For this to work, you will have to install the Python libraries imported in those scripts. Note that for the 3D data you can also use the graphical tool CellReel (https://github.com/RI-imaging/CellReel).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General information on the data set
The dataset was generated with two different measurement systems at the ZeMA testbed for electromechanical cylinders.
All relevant information can be found within the hdf5 file.
Example for reading out the metadata of the hdf5 file in MATLAB:
% available structures inside file
dataset = 'axis11_2kHz_ZeMA_PTB_SI.h5';
h5disp(dataset)
% general attributes about file
attr = h5info(dataset).Attributes;
project = jsondecode(attr(1,1).Value)
person = jsondecode(attr(2,1).Value)
publication = jsondecode(attr(3,1).Value)
experiment = jsondecode(attr(4,1).Value)
Example for reading out the metadata of the hdf5 file in Python:
import h5py
import json
# open file
h5file = h5py.File("axis11_2kHz_ZeMA_PTB_SI.h5", "r")
# general attributes about file
for key in h5file.attrs:
    print(key)
    val = json.loads(h5file.attrs[key])
    for subkey, subval in val.items():
        print(" ", subkey, " : ", subval)
# available structures inside file
h5file.visit(print)
# proper exit
h5file.close()
Metadata output of the hdf5 file:
HDF5 axis11_2kHz_ZeMA_PTB_SI.h5
Group '/'
Attributes:
'Project': '{
"fullTitle":"Metrology for the Factory of the Future",
"acronym":"Met4FoF",
"websiteLink":"www.met4fof.eu",
"fundingSource":"European Commission (EC)",
"fundingAdministrator":"EURAMET",
"funding programme":"EMPIR",
"fundingNumber":"17IND12",
"acknowledgementText":"This work has received funding within the project 17IND12 Met4FoF from the EMPIR program co-financed by the Participating States and from the European Union's Horizon 2020 research and innovation program. The authors want to thank Clifford Brown, Daniel Hutzschenreuter, Holger Israel, Giacomo Lanza, Bj\u00f6rn Ludwig, and Julia Neumann fromPhysikalisch-Technische Bundesanstalt (PTB) for their helpful suggestions and support."
}'
'Person': '{
"dc:author":[
"Tanja Dorst",
"Maximilian Gruber",
"Anupam Prasad Vedurmudi"
],
"e-mail":[
"t.dorst@zema.de",
"maximilian.gruber@ptb.de",
"anupam.vedurmudi@ptb.de"
],
"affiliation":[
"ZeMA gGmbH",
"Physikalisch-Technische Bundesanstalt",
"Physikalisch-Technische Bundesanstalt"
]
}'
'Publication': '{
"dc:identifier":"10.5281/zenodo.5185953",
"dc:license":"Creative Commons Attribution 4.0 International (CC-BY-4.0)",
"dc:title":"Sensor data set of one electromechanical cylinder at ZeMA testbed (ZeMA DAQ and Smart-Up Unit)",
"dc:description":"The data set was generated with two different measurement systems at the ZeMA testbed. The ZeMA DAQ unit consists of 11 sensors and the SmartUp-Unit has 13 differentsignals. A typical working cycle lasts 2.8s and consists of a forward stroke, a waiting time and a return stroke of the electromechanical cylinder. The data set does not consist of the entire working cycles. Only one second of the return stroke of every 100rd working cycle is included. The dataset consists of 4776 cycles. One row represents one second of the return stroke of one working cycle.",
"dc:subject":[
"dynamic measurement",
"measurement uncertainty",
"sensor network",
"digital sensors",
"MEMS",
"machine learning",
"European Union (EU)",
"Horizon 2020",
"EMPIR"
],
"dc:SizeOrDuration":"24 sensors, 4776 cycles and 2000 datapoints each",
"dc:type":"Dataset",
"dc:issued":"2021-09-10",
"dc:bibliographicCitation":"T. Dorst, M. Gruber and A. P. Vedurmudi : Sensor data set of one electromechanical cylinder at ZeMA testbed (ZeMA DAQ and Smart-Up Unit), Zenodo [data set], https://doi.org/10.5281/zenodo.5185953, 2021."
}'
'Experiment': '{
"date":"2021-03-29/2021-04-15",
"DUT":"Festo ESBF cylinder",
"identifier":"axis11",
"label":"Electromechanical cylinder no. 11"
}'
HDF5 axis11_2kHz_ZeMA_PTB_SI.h5
Group '/PTB_SUU'
Group '/PTB_SUU/BMA_280'
Group '/PTB_SUU/BMA_280/Acceleration'
Attributes:
'qudt:hasQuantityKind': '[
"qudt:Acceleration",
"qudt:Acceleration",
"qudt:Acceleration"
]'
'misc': '{
"interpolation_scheme":"cubic"
}'
'si:unit': '"\\metre\\second\\tothe{-2}"'
'sosa:madeBySensor': '"BMA 280"'
'rdf:type': '"qudt:Quantity"'
Dataset 'qudt:standardUncertainty'
Size: 4766x1000x3
MaxSize: 4766x1000x3
Datatype: H5T_IEEE_F64LE (double)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'si:label': '[
"X acceleration uncertainty",
"Y acceleration uncertainty",
"Z acceleration uncertainty"
]'
Dataset 'qudt:value'
Size: 4766x1000x3
MaxSize: 4766x1000x3
Datatype: H5T_IEEE_F64LE (double)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'si:label': '[
"X acceleration",
"Y acceleration",
"Z acceleration"
]'
Group '/ZeMA_DAQ'
Group '/ZeMA_DAQ/Pressure'
Attributes:
'qudt:hasQuantityKind': '"qudt:Pressure"'
'sosa:madeBySensor': '"Festo VPPM"'
'si:unit': '"\\pascal"'
'rdf:type': '"qudt:Quantity"'
Dataset 'qudt:standardUncertainty'
Size: 4766x2000
MaxSize: 4766x2000
Datatype: H5T_IEEE_F64LE (double)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'si:label': '"Pneumatic pressure uncertainty"'
Dataset 'qudt:value'
Size: 4766x2000
MaxSize: 4766x2000
Datatype: H5T_IEEE_F64LE (double)
ChunkSize: []
Filters: none
FillValue: 0.000000
Attributes:
'si:label': '"Pneumatic pressure"'
'misc': '{
"raw_data":false,
"comment":"Converted from ADC values based on appropriate conversion."
}'
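The group and dataset attributes in the listing above are JSON-encoded strings, just like the file-level attributes. The following sketch decodes such attributes into Python objects. It uses a plain dictionary as a stand-in for the attributes of the group '/PTB_SUU/BMA_280/Acceleration' (values copied from the listing) so that it runs without the data file; `decode_attrs` is a hypothetical helper name, and passing `h5file["/PTB_SUU/BMA_280/Acceleration"].attrs` instead of the dictionary is assumed to work the same way.

```python
import json

def decode_attrs(attrs):
    """Decode JSON-encoded attribute values into Python objects."""
    decoded = {}
    for key, value in attrs.items():
        # Attribute values may arrive as bytes or as str.
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        decoded[key] = json.loads(value)
    return decoded

# Stand-in for the group attributes, copied from the metadata listing.
attrs = {
    "qudt:hasQuantityKind":
        '["qudt:Acceleration", "qudt:Acceleration", "qudt:Acceleration"]',
    "misc": '{"interpolation_scheme": "cubic"}',
    "si:unit": r'"\\metre\\second\\tothe{-2}"',
    "sosa:madeBySensor": '"BMA 280"',
    "rdf:type": '"qudt:Quantity"',
}

decoded = decode_attrs(attrs)
for key, value in decoded.items():
    print(key, ":", value)
```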
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of acoustic impulse responses (AIRs) measured at Hottinger Brüel & Kjær in Virum, Denmark. The AIRs were captured using a linear array of 11 microphones, which was shifted away from a volume velocity source (a Brüel & Kjær Omni-source placed inside a car tyre) in regular steps of 0.1 m, resulting in a total of 99 measurement positions. The source was also moved to 17 equiangular positions around the tyre, and the AIRs were captured using the Maximum Length Sequence (MLS) method with a pink noise signal. Finally, a reference microphone was used to capture the far-field contributions of the source; its recordings can potentially be used for validation. The layout of the HBK measurement can be seen in the layout figure (measurment_layout.png).
The data are available as an HDF5 (Hierarchical Data Format) file with the following structure:
Dataset "HBK data - HDF5 datastructure"
Dataset.attributes
├── disambiguate_measurement_data (1) 'Nsources x Npositions x Nmics x Ntaps'
└── fs (1) 8000
├── measurement_data (17x9x11x2000) float64
├── mic_coords (9x11x2) float64
├── ref_pos (2x1) float64
├── reference_data (17x2000) float64
└── src_pos (2x17) float64
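As a sketch of how the axis order 'Nsources x Npositions x Nmics x Ntaps' translates into array indexing, the following example uses a zero-filled NumPy array with the documented shape instead of the real data. The variable names and the chosen indices are illustrative only.

```python
import numpy as np

fs = 8000  # sampling rate from the file attributes [Hz]
# Zero-filled stand-in with the documented shape:
# sources x array positions x microphones x taps.
measurement_data = np.zeros((17, 9, 11, 2000))

n_src, n_pos, n_mics, n_taps = measurement_data.shape
# Time axis of a single impulse response [s].
t = np.arange(n_taps) / fs
# Impulse response for source 0, array position 2, microphone 5.
air = measurement_data[0, 2, 5]
# Merge the array-position and microphone axes: 9 * 11 = 99 positions.
flat = measurement_data.reshape(n_src, n_pos * n_mics, n_taps)

print(flat.shape, t[-1])
```

Each impulse response therefore covers 2000 / 8000 = 0.25 s.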
Use with Matlab: https://mathworks.com/help/matlab/hdf5-files.html
Use with Python: https://docs.h5py.org/en/stable/
This dataset is part of the Danish Sound Cluster project titled "Physics-informed Neural Networks for Sound Field Reconstruction" [url]. More datasets can be found on the project page.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This updated version uses larger ranges for the material parameters to account for a wider range of samples. Further, an erroneous boundary condition in the simulation model used to generate the data set in the previous version has been fixed.
This dataset contains the results of 301008 finite-element simulations of the complex, frequency-dependent electrical impedance of a piezoelectric ring with randomised material parameters. Each impedance consists of 2000 samples in the frequency domain up to 8 MHz. We assume the sample to be dielectric, thus the impedance at a frequency of 0 Hz is infinite. The piezoelectric ring has an outer radius of 6.35 mm, an inner radius of 2.6 mm, and a thickness of 1 mm. The transversely isotropic material parameters are sampled from independent uniform distributions with ranges that are intended to represent the behaviour of different piezoceramic materials. The parameters of the Rayleigh damping model (alpha_M and alpha_K) are sampled from a logarithmic distribution to account for the larger parameter range.
Parameter | Min | Max | Unit | Description |
c11 | 110 | 165 | GPa | Elastic stiffness |
c12 | 60 | 150 | GPa | Elastic stiffness |
c13 | 60 | 100 | GPa | Elastic stiffness |
c33 | 110 | 140 | GPa | Elastic stiffness |
c44 | 18 | 30 | GPa | Elastic stiffness |
eps11 | 3 | 12 | nF/m | Dielectric permittivity |
eps33 | 4 | 8 | nF/m | Dielectric permittivity |
e15 | 6 | 20 | C/m^2 | Piezoelectric coupling |
e31 | 1 | 7.5 | C/m^2 | Piezoelectric coupling |
e33 | 8 | 20 | C/m^2 | Piezoelectric coupling |
alpha_M | 0.2 | 150 | 1/ms | Mass-proportional damping |
alpha_K | 10 | 1000 | ps | Stiffness-proportional damping |
density | 7600 | 7850 | kg/m^3 | Density |
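The sampling scheme described above can be sketched as follows. This is not the code used to generate the dataset: it assumes that "logarithmic distribution" means a log-uniform distribution (uniform in the logarithm of the parameter), and the dictionary and function names are hypothetical. The ranges are taken from the table, in the units given there.

```python
import numpy as np

# Ranges from the table above (units as listed there).
uniform_ranges = {
    "c11": (110, 165), "c12": (60, 150), "c13": (60, 100),
    "c33": (110, 140), "c44": (18, 30),
    "eps11": (3, 12), "eps33": (4, 8),
    "e15": (6, 20), "e31": (1, 7.5), "e33": (8, 20),
    "density": (7600, 7850),
}
# Damping parameters, assumed here to be log-uniformly distributed.
log_ranges = {"alpha_M": (0.2, 150), "alpha_K": (10, 1000)}

def sample_parameters(rng):
    """Draw one random parameter set within the tabulated ranges."""
    params = {k: rng.uniform(lo, hi) for k, (lo, hi) in uniform_ranges.items()}
    for k, (lo, hi) in log_ranges.items():
        # Uniform in log space, then mapped back to the linear scale.
        params[k] = float(np.exp(rng.uniform(np.log(lo), np.log(hi))))
    return params

rng = np.random.default_rng(seed=0)
params = sample_parameters(rng)
print(params)
```

Sampling in log space spreads the draws evenly over the orders of magnitude covered by alpha_M and alpha_K, whereas a plain uniform distribution would concentrate them near the upper end of the range.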
The dataset contains the following files:
The dataset is stored as an HDF5 file, which can be opened with any library that supports the format, e.g. in Python using the h5py library:
import h5py
# Open the dataset in read mode.
file = h5py.File("dataset.hdf5", "r")
# Impedances as a 301008 x 2000 array of complex numbers.
impedances = file["impedances"]
# Material parameter values as a 301008 x 13 array of real numbers.
parameters = file["parameters"]
# Frequency vector of the impedance with length 2000.
frequencies = file["meta"]["frequencies"]
# 13 strings with the identifiers of the material parameters.
parameter_labels = file["meta"]["parameter_labels"]
To generate a result for the electrical impedance using the supplied simulation files, download and install openCFS and call the executable with the simulation.xml, but omit the file extension, e.g.:
cfsbin.exe simulation
The path to the executable of openCFS will depend on your operating system and installation. Running the simulation will result in the creation of several files and folders. Among those files will be the result for the electric charge on one of the electrodes of the sample, which will be placed in the 'history' subfolder. We can determine the current by taking the time derivative of the charge and already know the voltage because we excited the piezoceramic with an electric potential of 1 V in the simulation. Because the simulation is conducted in the frequency regime, all we have to do is to divide voltage by current to get the frequency dependent electrical impedance. The loading of the result and calculation of the impedance is implemented in the following Python script as an example:
import numpy as np
# Load result file for electric charge
result_path = 'history/simulation-elecCharge-surfRegion-ground.hist'
data = np.loadtxt(result_path)
frequency = data[:, 0]
# Convert polar representation from file to complex numbers.
charge = data[:, 1] * np.exp(1j * 2 * np.pi / 360 * data[:, 2])
# Excitation potential is 1 V in simulation.
potential = 1
# Determine impedance by applying Z = V / I = V / (j omega Q).
impedance = potential / (1j * 2 * np.pi * frequency * charge)