Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context: We share a large database containing electroencephalographic signals from 87 human participants, with more than 20,800 trials in total, representing about 70 hours of recording. It was collected during brain-computer interface (BCI) experiments and organized into 3 datasets (A, B, and C) that were all recorded following the same protocol: right- and left-hand motor imagery (MI) tasks during a single-day session. It includes the performance of the associated BCI users, detailed information about the users' demographic, personality, and cognitive profiles, and the experimental instructions and code (executed in the open-source platform OpenViBE). Such a database could prove useful for various studies, including but not limited to: 1) studying the relationships between BCI users' profiles and their BCI performance, 2) studying how EEG signal properties vary across users' profiles and MI tasks, 3) using the large number of participants to design cross-user BCI machine learning algorithms, or 4) incorporating users' profile information into the design of EEG signal classification algorithms.
Sixty participants (Dataset A) performed the first experiment, designed to investigate the impact of experimenters' and users' gender on MI-BCI user training outcomes, i.e., users' performance and experience (Pillette et al.). Twenty-one participants (Dataset B) performed the second one, designed to examine the relationship between users' online performance (i.e., classification accuracy) and the characteristics of the chosen user-specific Most Discriminant Frequency Band (MDFB) (Benaroch et al.). The only difference between the two experiments lies in the algorithm used to select the MDFB. Dataset C contains 6 additional participants who completed one of the two experiments described above. Physiological signals were measured using a g.USBAmp (g.tec, Austria), sampled at 512 Hz, and processed online using OpenViBE 2.1.0 (Dataset A) and OpenViBE 2.2.0 (Dataset B). For Dataset C, data from participants C83 and C85 were collected with OpenViBE 2.1.0 and from the remaining 4 participants with OpenViBE 2.2.0. Experiments were recorded at Inria Bordeaux Sud-Ouest, France.
Duration: Each participant's folder contains approximately 48 minutes of EEG recording: six 7-minute runs and one 6-minute baseline.
Documents
Instructions: checklist read by the experimenters during the experiments.
Questionnaires: the Mental Rotation test used, and the English and French versions of the 4 translated questionnaires, notably the Demographic and Social information questionnaire, the Pre- and Post-session questionnaires, and the Index of Learning Styles.
Performance: the online OpenViBE BCI classification performance obtained by each participant for each run, as well as the answers to all questionnaires.
Scenarios/scripts: the set of OpenViBE scenarios used to perform each step of the MI-BCI protocol, e.g., acquiring training data, calibrating the classifier, or running the online MI-BCI.
Database: raw signals
Dataset A: N=60 participants
Dataset B: N=21 participants
Dataset C: N=6 participants
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Traffic volume data across Dublin City from the SCATS traffic management system. The Sydney Coordinated Adaptive Traffic System (SCATS) is an intelligent transportation system used to manage the timing of signal phases at traffic signals. SCATS uses sensors at each traffic signal to detect vehicle presence in each lane and pedestrians waiting to cross at the local site. The vehicle sensors are generally inductive loops installed within the road.
3 resources are provided:
SCATS Traffic Volumes Data (Monthly): Contained in this report are traffic counts taken from the SCATS traffic detectors located at junctions. The primary function of these traffic detectors is traffic signal control, but the devices can also count general traffic volumes at defined locations on the approach to a junction. These devices are set at specific locations on approaches to the junction but may not be on all approaches to a junction. As there are multiple junctions on any one route, a vehicle can be expected to be counted multiple times as it progresses along the route. The traffic volume counts here are therefore best used to represent trends in vehicle movement, by selecting a specific junction on the route which best represents the overall traffic flows. Information provided:
End Time: time that the one-hour count period finishes.
Region: location of the detector site (e.g. North City, West City, etc.).
Site: can be matched with the SCATS Sites file to show the location.
Detector: the detectors/sensors at each site are numbered.
Sum volume: total traffic volume in the preceding hour.
Avg volume: average traffic volume per 5-minute interval in the preceding hour.
All Dates Traffic Volumes Data: This file contains daily totals of traffic flow at each site location.
SCATS Site Location Data: Contained in this report is the location data for the SCATS sites. The metadata provided includes the following:
Site id: a unique identifier for each junction on SCATS.
Site description (CAP): descriptive location of the junction containing the street name(s) of the intersecting streets.
Site description (lower): descriptive location of the junction containing the street name(s) of the intersecting streets.
Region: the area of the city, adjoining local authority, or region in which the site is located.
LAT/LONG: coordinates.
Disclaimer: the location files are regularly updated to represent the locations of SCATS sites under the control of Dublin City Council; however, site accuracy is not absolute, and LAT/LONG and region information may not be available for all sites. It is at the discretion of the user to link the files for analysis and to create further data. Furthermore, detector communication issues or faulty detectors could result in inaccurate values for a given period, so values should not be taken as absolute but can be used to indicate trends.
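As a quick illustration, the sketch below (assuming pandas is available) aggregates the monthly volume counts for a single junction to show a daily trend, following the recommendation above to pick one representative junction. The file name, the site id, and the exact column headers ("End Time", "Site", "Sum Volume") are assumptions taken from the field descriptions and may differ in the actual CSV.

```python
# Minimal sketch: plot hourly volume trends for one SCATS junction.
# Column names follow the field descriptions above; verify against the file.
import pandas as pd

volumes = pd.read_csv("scats_traffic_volumes_monthly.csv")  # hypothetical file name
volumes["End Time"] = pd.to_datetime(volumes["End Time"])

site_id = 183  # hypothetical SCATS site id chosen to represent the route
site_hourly = (volumes[volumes["Site"] == site_id]
               .groupby("End Time")["Sum Volume"]
               .sum())

# Daily totals indicate the overall trend rather than absolute flows.
daily_trend = site_hourly.resample("D").sum()
print(daily_trend.head())
```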
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
INTRODUCTION
The goal of these LPWAN datasets is to provide the global research community with a benchmark tool to evaluate fingerprint localization algorithms in large outdoor environments with various properties. An identical collection methodology was used for all datasets: during a period of three months, numerous devices containing a GPS receiver periodically obtained new location data, which was sent to a local data server via a Sigfox or LoRaWAN message. Together with network information such as the receiving time of the message, the base station IDs of all receiving base stations, and the Received Signal Strength Indicator (RSSI) per base station, this location data was stored in one of the three LPWAN datasets:
lorawan_dataset_antwerp.csv
130 430 LoRaWAN messages, obtained in the city center of Antwerp
sigfox_dataset_antwerp.csv
14 378 Sigfox messages, obtained in the city center of Antwerp
sigfox_dataset_rural.csv
25 638 Sigfox messages, obtained in a rural area between Antwerp and Ghent
As the rural and urban Sigfox datasets were recorded in adjacent areas, many base stations located at the border of these areas can be found in both datasets. However, they do not necessarily share the same identifier: e.g. ‘BS 1’ in the urban Sigfox dataset could be the same base station as ‘BS 36’ in the rural Sigfox dataset. If the user intends to combine both Sigfox datasets, the mapping of the IDs of these base stations can be found in the file:
sigfox_bs_mapping.csv
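A minimal sketch of how the mapping file could be used to combine the two Sigfox sets is shown below. The column layout (one 'BS x' RSSI column per base station) follows the description above, but the header names of sigfox_bs_mapping.csv ("urban_id", "rural_id") are hypothetical and should be checked before running.

```python
# Minimal sketch: rename rural base-station columns to their urban IDs using
# sigfox_bs_mapping.csv before concatenating the two Sigfox datasets.
# The mapping file's column names ("urban_id", "rural_id") are assumptions.
import pandas as pd

urban = pd.read_csv("sigfox_dataset_antwerp.csv")
rural = pd.read_csv("sigfox_dataset_rural.csv")
mapping = pd.read_csv("sigfox_bs_mapping.csv")

# Build a rename table such as {"BS 36": "BS 1"} for shared base stations.
rename = {f"BS {r}": f"BS {u}" for u, r in zip(mapping["urban_id"], mapping["rural_id"])}
rural = rural.rename(columns=rename)

combined = pd.concat([urban, rural], ignore_index=True, sort=False)
print(combined.shape)
```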
The collection methodology of the datasets, and the first results of a basic fingerprinting implementation are documented in the following journal paper: http://www.mdpi.com/2306-5729/3/2/13
UPDATES IN VERSION 1.2
In this version of the LPWAN dataset, only the LoRaWAN set has been updated. The Sigfox datasets remain identical to version 1.0 and 1.1. The main updates in the LoRaWAN set are the following:
New data: the LoRaWAN messages in the new set are collected 1 year after the previous dataset version. To be consistent with the previous versions, the new LoRaWAN set is uploaded in the same .CSV format as before. This upload can still be found in this repository as ‘lorawan_dataset_antwerp.csv’.
More gateways: Compared to the previous dataset, 4 gateways were added to the LoRaWAN network. The RSSI values of these gateways are shown in columns ‘BS 69’, ‘BS 70’, ‘BS 71’ and ‘BS 72’. All other ‘BS’ columns are in the same order as in previous dataset versions.
More metadata: In the previous LoRaWAN dataset, metadata was limited to 3 receiving gateways per message. In the new dataset version, metadata from all receiving gateways is included in every message. Moreover, some gateways provide a timestamp with nanosecond precision, which can be used to evaluate Time Difference of Arrival localization methods with LoRaWAN.
2 file formats: As more metadata becomes available, we find it important to share the dataset in a clearer overview. This also allows researchers to evaluate the performance of LoRaWAN in an urban environment. Therefore, we publish the new LoRaWAN dataset as a .CSV file as described above, but also as a .JSON file (lorawan_antwerp_2019_dataset.json.txt; the .txt file extension had to be appended, otherwise the file could not be uploaded to Zenodo). An example of one message in this JSON format can be seen below:
JSON format description:
HDOP: Horizontal Dilution of Precision
dev_addr: LoRaWAN device address
dev_eui: LoRaWAN device EUI
sf: Spreading factor
channel: TX channel (EU region)
payload: application payload
adr: Adaptive Data Rate (1 = enabled, 0= disabled)
counter: device uplink message counter
latitude: Groundtruth TX location latitude
longitude: Groundtruth TX location longitude
airtime: signal airtime (seconds)
gateways:
rssi: Received Signal Strength
esp: Estimated Signal Power
snr: Signal-to-Noise Ratio
ts_type: Timestamp type. If this says "GPS_RADIO", a nanosecond-precision timestamp is available
time: time of arrival at the gateway
id: gateway ID
JSON example
{ "hdop": 0.7, "dev_addr": "07000EFE", "payload": "008d000392d54c4284d18c403333333f04682aa9410500e8fd4106cabdbc420f00db0d470ce32ac93f0d582be93f0bfa3f8d3f", "adr": 1, "latitude": 51.20856475830078, "counter": 31952, "longitude": 4.400575637817383, "airtime": 0.112896, "gateways": [ { "rssi": -115, "esp": -115.832695, "snr": 6.75, "rx_time": { "ts_type": "None", "time": "2019-01-04T08:59:53.079+01:00" }, "id": "08060716" }, { "rssi": -116, "esp": -125.51497, "snr": -9.0, "rx_time": { "ts_type": "GPS_RADIO", "time": "2019-01-04T08:59:53.962029179+01:00" }, "id": "FF0178DF" } ], "dev_eui": "3432333853376B18", "sf": 7, "channel": 8 }
The International Google Trends dataset will provide critical signals that individual users and businesses alike can leverage to make better data-driven decisions. This dataset simplifies the manual interaction with the existing Google Trends UI by automating and exposing anonymized, aggregated, and indexed search data in BigQuery. This dataset includes the Top 25 stories and Top 25 Rising queries from Google Trends. It will be made available as two separate BigQuery tables, with a set of new top terms appended daily. Each set of Top 25 and Top 25 rising expires after 30 days, and will be accompanied by a rolling five-year window of historical data for each country and region across the globe, where data is available. This Google dataset is hosted in Google BigQuery as part of Google Cloud's Datasets solution and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery
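A minimal query sketch using the google-cloud-bigquery Python client is shown below. The public table name (commonly exposed as bigquery-public-data.google_trends.international_top_terms) and the column names in the SELECT are assumptions; verify both in the BigQuery console before relying on them. Authenticated Google Cloud credentials are required.

```python
# Minimal sketch: pull the latest top terms from the (assumed) public
# Google Trends table. Table and column names should be verified in BigQuery.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT term, country_name, week, score
    FROM `bigquery-public-data.google_trends.international_top_terms`
    WHERE refresh_date = (SELECT MAX(refresh_date)
                          FROM `bigquery-public-data.google_trends.international_top_terms`)
    LIMIT 25
"""
for row in client.query(query).result():
    print(row.term, row.country_name, row.week, row.score)
```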
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
**SUBF Dataset v1.0: Bearing Fault Diagnosis using Vibration Signals**
Description
The SUBF dataset v1.0 has been designed for the analysis and diagnosis of mechanical bearing faults. The mechanical setup consists of a motor, a frame/base, bearings, and a shaft, simulating different machine conditions such as a healthy state, an inner race fault, and an outer race fault. This dataset aims to facilitate reproducibility and support research in mechanical fault diagnosis and machine condition monitoring.
The dataset is part of the research paper "Aziz, S., Khan, M. U., Faraz, M., & Montes, G. A. (2023). Intelligent bearing faults diagnosis featuring automated relative energy-based empirical mode decomposition and novel cepstral autoregressive features. Measurement, 216, 112871." DOI: https://doi.org/10.1016/j.measurement.2023.112871
The dataset can be used with MATLAB and Python.
Experimental Setup
Motor: A 3-phase AC motor, 0.25 HP, operating at 1440 RPM, 50 Hz frequency, and 440 Volts.
Target Bearings: The left-side bearing was replaced to represent three categories:
- Normal Bearings
- Inner Race Fault Bearings
- Outer Race Fault Bearings
Instrumentation
- Sensor: BeanDevice 2.4 GHz AX-3D, a wireless vibration sensor, was used to record vibration data.
- Recording: Data collected via BeanGateway and stored on a PC.
- Sampling: 1000 Hz.
Data Acquisition
- Duration: 18 hours of data collection (6 hours per class).
- Segmenting: Signals were divided into 10-second segments, resulting in 2160 signals for each fault category.
- Classes: Healthy state, inner race fault, and outer race fault.
Dataset Organization
The dataset is structured as follows:
Main Folder: Contains two subfolders for .mat and .csv file formats to accommodate different user preferences.
Subfolder 1: .mat Files
- Healthy: Contains .mat files representing vibration signals for the healthy state.
- Inner Race Fault: Contains .mat files representing vibration signals for bearings with an inner race fault.
- Outer Race Fault: Contains .mat files representing vibration signals for bearings with an outer race fault.
Subfolder 2: .csv Files
- Healthy: Contains .csv files representing vibration signals for the healthy state.
- Inner Race Fault: Contains .csv files representing vibration signals for bearings with an inner race fault.
- Outer Race Fault: Contains .csv files representing vibration signals for bearings with an outer race fault.
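To get started with the .csv files, the sketch below loads one 10-second vibration segment and computes a few basic statistics often used in bearing fault analysis. The file path and the assumption of a single-column CSV are hypothetical; adjust them to the actual folder and file layout described above.

```python
# Minimal sketch: load one 10-second vibration segment (sampled at 1000 Hz,
# so roughly 10,000 samples) and compute simple descriptive features.
import numpy as np

fs = 1000  # Hz, sampling rate reported above
signal = np.loadtxt("csv_files/Inner Race Fault/signal_0001.csv", delimiter=",")  # hypothetical path

rms = np.sqrt(np.mean(signal ** 2))
kurtosis = np.mean((signal - signal.mean()) ** 4) / (signal.var() ** 2)
print(f"duration: {len(signal) / fs:.1f} s, RMS: {rms:.4f}, kurtosis: {kurtosis:.2f}")
```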
Applications
This dataset is suitable for tasks such as:
- Fault detection and diagnosis
- Signal processing and feature extraction research
- Development and benchmarking of machine learning and deep learning models
Usage
This dataset can be used for academic research, industrial fault diagnosis applications, and algorithm development. Please cite the following reference when using this dataset: Aziz, S., Khan, M. U., Faraz, M., & Montes, G. A. (2023). Intelligent bearing faults diagnosis featuring automated relative energy-based empirical mode decomposition and novel cepstral autoregressive features. Measurement, 216, 112871. DOI: https://doi.org/10.1016/j.measurement.2023.112871
Licence
This dataset is made publicly available for research purposes. Ensure appropriate citation and credit when using the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Instructions
Acquisition Protocol
The 8th Ninapro database is described in the paper: "Agamemnon Krasoulis, Sethu Vijayakumar & Kianoush Nazarpour. Effect of user adaptation on prosthetic finger control with an intuitive myoelectric decoder. Frontiers in Neuroscience." Please cite this paper for any work related to this database.
More information about the protocol can be found in the original paper: "Manfredo Atzori, Arjan Gijsberts, Claudio Castellini, Barbara Caputo, Anne-Gabrielle Mittaz Hager, Simone Elsig, Giorgio Giatsidis, Franco Bassetto & Henning Müller. Electromyography data for non-invasive naturally-controlled robotic hand prostheses. Scientific Data, 2014" (http://www.nature.com/articles/sdata201453).
The experiment comprised nine movements, including single-finger as well as functional movements. The subjects had to repeat the instructed movements following visual cues (i.e., movies) shown on the screen of a computer monitor.
Muscular activity was recorded using 16 active double-differential wireless sensors from a Delsys Trigno IM Wireless EMG system. The sensors comprise EMG electrodes and 9-axis inertial measurement units (IMUs). The sensors were positioned in two rows of eight units around the participants' right forearm in correspondence to the radiohumeral joint (see pictures below). No specific muscles were targeted. The sensors were fixed on the forearm using the standard manufacturer-provided adhesive bands. Moreover, a hypoallergenic elastic latex-free band was placed around the sensors to keep them fixed during the acquisition. The sEMG signals were sampled at a rate of 1111 Hz, accelerometer and gyroscope data were sampled at 148 Hz, and magnetometer data were sampled at 74 Hz. All signals were upsampled to 2 kHz and post-synchronized.
Hand kinematic data were recorded with a dataglove (Cyberglove 2, 18-DOF model). For all participants (i.e., both able-bodied and amputee), the data glove was worn on the left hand (i.e., contralateral to the arm where the EMG sensors were located). The Cyberglove signals correspond to data from the associated Cyberglove sensors located as shown in the picture below ("n/a" corresponds to sensors that were not available, since an 18-DOF model was used). Prior to each experimental session, the data glove was calibrated for the specific participant using the "quick calibration" procedure provided by the manufacturer. The Cyberglove signals were sampled at 100 Hz and subsequently upsampled to 2 kHz and synchronized to the EMG and IMU data.
Ten able-bodied (Subjects 1-10) and two right-hand transradial amputee participants (Subjects 11-12) are included in the dataset. During the acquisition, the subjects were asked to repeat 9 movements using both hands (bilateral mirrored movements). The duration of each of the nine movements varied between 6 and 9 seconds, and consecutive trials were interleaved with 3 seconds of rest. Each repetition started with the participant holding their fingers at the rest state and involved slowly reaching the target posture, as shown on the screen, and returning to the rest state before the end of the trial. The following movements were included:
0. rest
1. thumb flexion/extension
2. thumb abduction/adduction
3. index finger flexion/extension
4. middle finger flexion/extension
5. combined ring and little fingers flexion/extension
6. index pointer
7. cylindrical grip
8. lateral grip
9. tripod grip
Datasets
For each participant, three datasets were collected: the first two datasets (acquisitions 1 & 2) comprised 10 repetitions of each movement, and the third dataset (acquisition 3) comprised only two repetitions. For each subject, the associated .zip file contains three MATLAB files in .mat format, that is, one for each dataset, with synchronized variables.
The variables included in the .mat files are the following:
· subject: subject number
· exercise: exercise number (value set to 1 in all data files)
· emg (16 columns): sEMG signals from the 16 sensors
· acc (48 columns): three-axis accelerometer data from the 16 sensors
· gyro (48 columns): three-axis gyroscope data from the 16 sensors
· mag (48 columns): three-axis magnetometer data from the 16 sensors
· glove (18 columns): calibrated signals from the 18 sensors of the Cyberglove
· stimulus (1 column): the movement repeated by the subject
· restimulus (1 column): again the movement repeated by the subject; in this case, the duration of the movement label is refined a posteriori in order to correspond to the real movement
· repetition (1 column): repetition number of the stimulus
· rerepetition (1 column): repetition number of restimulus
Important notes
Given the nature of the data collection procedure (slow finger movement and lack of an extended hold period), this database is intended to be used for estimation/reconstruction of finger movement rather than motion/grip classification. In other words, the purpose of this database is to provide a benchmark for decoding finger position from (contralateral) EMG measurements using regression algorithms as opposed to classification. Therefore, the use of the stimulus/restimulus vectors as target variables should be avoided; these are only provided for the user to have access to the exact timings of each movement repetition.
Three datasets/acquisitions are provided for each subject. It is recommended that dataset 3, which comprises only two repetitions of each movement, is used only to report performance results and that no training or hyper-parameter tuning is performed using this data (i.e., test dataset). The three datasets, which were recorded sequentially, can offer an out-of-the-box three-way split for model training (dataset 1), hyper-parameter tuning/validation (dataset 2), and performance testing (dataset 3). Another possibility is to merge datasets 1 & 2, perform training and validation/hyper-parameter tuning using K-fold cross-validation, and then report performance results on dataset 3.
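The sketch below illustrates the recommended three-way split in Python (the files can equally be loaded in MATLAB). The .mat file names are hypothetical placeholders; the variable names (emg, glove) follow the description above.

```python
# Minimal sketch of the recommended split: train on acquisition 1, tune on
# acquisition 2, report results on acquisition 3. File names are hypothetical.
from scipy.io import loadmat

train = loadmat("S1_E1_A1.mat")  # acquisition 1: model training
valid = loadmat("S1_E1_A2.mat")  # acquisition 2: hyper-parameter tuning
test  = loadmat("S1_E1_A3.mat")  # acquisition 3: performance reporting only

# Regress finger kinematics (glove) from sEMG, per the intended use of the data.
X_train, y_train = train["emg"], train["glove"]
X_valid, y_valid = valid["emg"], valid["glove"]
X_test,  y_test  = test["emg"],  test["glove"]

print(X_train.shape, y_train.shape)  # (samples, 16) and (samples, 18)
```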
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Controlled Anomalies Time Series (CATS) Dataset consists of commands, external stimuli, and telemetry readings of a simulated complex dynamical system with 200 injected anomalies.
The CATS Dataset exhibits a set of desirable properties that make it very suitable for benchmarking Anomaly Detection Algorithms in Multivariate Time Series [1]:
[1] Example Benchmark of Anomaly Detection in Time Series: “Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. Anomaly Detection in Time Series: A Comprehensive Evaluation. PVLDB, 15(9): 1779 - 1797, 2022. doi:10.14778/3538598.3538602”
About Solenix
Solenix is an international company providing software engineering, consulting services and software products for the space market. Solenix is a dynamic company that brings innovative technologies and concepts to the aerospace market, keeping up to date with technical advancements and actively promoting spin-in and spin-out technology activities. We combine modern solutions which complement conventional practices. We aspire to achieve maximum customer satisfaction by fostering collaboration, constructivism, and flexibility.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was collected as part of the U.S. Department of Transportation (U.S. DOT) Intersection Safety Challenge (hereafter, “the Challenge”) for Stage 1B: System Assessment and Virtual Testing. Multi-sensor data were collected at a controlled test roadway intersection at the Federal Highway Administration (FHWA) Turner-Fairbank Highway Research Center (TFHRC) Smart Intersection facility in McLean, VA, from October 2023 through March 2024. The data include potential conflict-based and non-conflict-based experimental scenarios between vulnerable road users (e.g., pedestrians, bicyclists) and vehicles during both daytime and nighttime conditions. Note that no actual human vulnerable road users were put at risk of being involved in a collision during the data collection efforts. The provided data (hereafter, “the Challenge Dataset”) are unlabeled training data (without ground truth) that were collected to be used for intersection safety system algorithm training, refinement, tuning, and/or validation, but may have additional uses. For a summary of the Stage 1B data collection effort, please see this video: https://youtu.be/csirVHFa2Cc. The Challenge Dataset includes data at a single, signalized four-way intersection from 20 roadside sensors and traffic control devices, including eight closed-circuit television (CCTV) visual cameras, five thermal cameras, two light detection and ranging (LiDAR) sensors, and four radar sensors. Intrinsic calibration was performed for all visual and thermal cameras. Extrinsic calibration was performed for specific pairs of roadside sensors. Additionally, the traffic signal phase and timing data and vehicle and/or pedestrian calls to the traffic signal controller (if any) are also provided. The total number of unique runs in the Challenge Dataset is 1,104, bringing the total size of the dataset to approximately 1 TB. A sample of 20 unique runs from the Challenge Dataset is provided here for download, inspection, and use. If, after inspecting this sample, a potential data user would like access to download the full Challenge Dataset, a request can be made via the form here: https://its.dot.gov/data/data-request For more details about the data collection, supplemental files, organization and dictionary, and sensor calibration, see the attached “U.S. DOT ISC Stage 1B ITS DataHub Metadata_v1.0.pdf” document. For more information on the background of the Intersection Safety Challenge Stage 1B, please visit: its.dot.gov/isc.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Audio-Polygraphy Dataset for Sleep Apnea Analysis (APSAA) provides synchronized, full-night audio and polygraph recordings from 32 subjects, along with manual annotations for labeled events in the polygraph studies. All subjects were provided with detailed information about the study, and written informed consent was obtained from those who agreed to take part. The recordings were collected between September 2021 and April 2022 at the Sleep Unit of Dr. Sagaz Hospital in Jaén (Spain). The study was approved by the Provincial Research Ethics Committee of Jaén (Spain).
Each subject's data is organized in a designated folder named according to the subject's unique identification code. Inside each folder, users will find: (1) the audio recording in WAV format, (2) separate CSV files for each polygraph signal, and (3) a CSV file containing manual annotations of polygraph events. The polygraph signals included are as follows:
Additionally, an automated algorithm for synchronizing audio and polygraph signals in sleep studies is provided, accessible via the following GitHub repository.
Funding: This work was supported in part under grant 1257914 funded by Programa Operativo FEDER Andalucia 2014–2020, grant P18-RT-1994 funded by the Ministry of Economy, Knowledge and University (Junta de Andalucía, Spain), by MCIN/AEI/10.13039/501100011033 under the project grants PID2020-119082RB-{C21,C22} and by the Ministerio de Ciencia, Innovación y Universidades (Gobierno de España) under the grants PID2023-146520OB-{C21,C22}.
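The sketch below shows one way to load a subject's full-night audio and a polygraph channel in Python for joint analysis. The folder, file, and column names are hypothetical; adapt them to the per-subject folder layout described above (WAV audio, one CSV per polygraph signal, one CSV of annotations).

```python
# Minimal sketch: load a subject's audio recording, one polygraph channel,
# and the manual event annotations. All paths below are hypothetical.
from scipy.io import wavfile
import pandas as pd

fs_audio, audio = wavfile.read("SUBJ001/audio.wav")
spo2 = pd.read_csv("SUBJ001/spo2.csv")           # one CSV per polygraph signal
events = pd.read_csv("SUBJ001/annotations.csv")  # manual event annotations

print(f"audio: {len(audio) / fs_audio / 3600:.1f} h at {fs_audio} Hz, "
      f"{len(events)} annotated events")
```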
Description:
The Behavior-based WiFi User Dataset for user authentication. This dataset contains the physiological characteristics captured via WiFi from 10 participants performing 10 different activities. Each participant performed 20 rounds of each activity. The experiments were conducted in two different environments: a campus office and a home apartment. The dataset can be used by fellow researchers to reproduce the original work or to further explore other machine-learning problems in the domain of WiFi sensing.
Format: .dat format
Section 1: Device Configuration
Section 2: Data Format
We provide raw data received by the CSI tool. The data files are saved in the dat format. The details are shown in the following:
Section 3: Experimental Setups
There are two experimental setups for our data collection. An image of the experimental setup and an illustration of the activities in the two different environments are included in the dataset. Each activity was performed in a designated location. In each activity location, the specific activity was conducted in 4 different proximate positions, at least one foot away from each other.
Code | Activity |
--- | --- |
A→B | Walking (trajectory 1) |
B→C | Walking (trajectory 2) |
B | Picking up a remote control |
C | Sitting in a chair |
D | Exercising |
E | Operating the oven |
F | Using the stove |
Code | Activity |
--- | --- |
G | Sitting in a seat |
H | Stretching the body |
I | Typing on a keyboard |
Section 4: Data Description
We separate our raw data into different folders based on environment type. Within each environment type, data are further organized by date. Each file includes all data from the three internal antennas. All data files are in .dat format. We also provide Matlab scripts for CSI analysis and visualization. The following variables can be revealed from the code:
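For users who prefer Python after the .dat files have been parsed (e.g. with the provided Matlab scripts), the sketch below illustrates basic CSI amplitude and phase handling. It assumes the parsed CSI has been exported as a complex array of shape (packets, antennas, subcarriers); that shape and the .npy file are assumptions for illustration, and the raw .dat format itself is not parsed here.

```python
# Minimal sketch, assuming pre-parsed CSI stored as a complex numpy array of
# shape (packets, antennas, subcarriers). The file name is hypothetical.
import numpy as np

csi = np.load("parsed_csi_example.npy")
amplitude = np.abs(csi)                          # per-subcarrier amplitude
phase = np.unwrap(np.angle(csi), axis=-1)        # unwrapped phase across subcarriers

# A simple behaviour feature: amplitude variance over time per antenna/subcarrier.
activity_profile = amplitude.var(axis=0)
print(csi.shape, activity_profile.shape)
```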
Section 5: Codes
Section 6: Citations
If your paper is related to our work, please cite our papers as follows.
https://ieeexplore.ieee.org/document/9356038
C. Shi, J. Liu, N. Borodinov, B. Leao and Y. Chen, "Towards Environment-independent Behavior-based User Authentication Using WiFi," 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), Delhi, India, 2020, pp. 666-674, doi: 10.1109/MASS50613.2020.00086
Bibtex:
@INPROCEEDINGS{9356038,
author={Shi, Cong and Liu, Jian and Borodinov, Nick and Leao, Bruno and Chen, Yingying},
booktitle={2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS)},
title={Towards Environment-independent Behavior-based User Authentication Using WiFi},
year={2020},
volume={},
number={},
pages={666-674},
doi={10.1109/MASS50613.2020.00086}}
The currently available version of the dataset has been reduced in size. If you wish to acquire the full version or have any questions regarding the dataset, contact us by email: cl1361@scarletmail.rutgers.edu.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This platform is a multi-functional music data sharing platform for Computational Musicology research. It contains a wide range of music data, such as sound information for Chinese traditional musical instruments and annotation information for Chinese pop music, which is available for free use by computational musicology researchers.
This platform is also a large-scale music data sharing platform specially built for Computational Musicology research in China, including 3 music databases: the Chinese Traditional Instrument Sound Database (CTIS), the Midi-wav Bi-directional Database of Pop Music, and the Multi-functional Music Database for MIR Research (CCMusic). All 3 databases are available for free use by computational musicology researchers. For the contents of the databases, we provide audio files recorded by a professional team from a conservatory of music, as well as corresponding annotation files, which are free of commercial copyright problems and thus easy to distribute at scale. We hope that this music data sharing platform can meet the one-stop data needs of users and contribute to research in the field of Computational Musicology.
If you want to know more information or obtain complete files, please go to the official website of this platform:
Music Data Sharing Platform for Academic Research
Chinese Traditional Instrument Sound Database (CTIS)
This database has been developed by Prof. Han Baoqiang's team over many years and collects sound information about Chinese traditional musical instruments. The database includes 287 Chinese national musical instruments, including traditional instruments, improved instruments, and ethnic minority instruments.
Multi-functional Music Database for MIR Research
This database collects sound materials of pop music, folk music, and hundreds of national musical instruments, with comprehensive annotations, forming a multi-purpose music database for MIR researchers.
This database contains hundreds of Chinese pop songs, and each song comes with the corresponding MIDI, audio, and lyric information. Recording the vocal part and the accompaniment part of the audio independently is helpful for studying MIR tasks under ideal conditions. In addition, singing-technique information aligned with the vocal part (such as breath sound, falsetto, breathing, vibrato, mute, and slide) is marked in MuseScore, which constitutes a Midi-Wav bi-directionally corresponding pop music database.
The Google Trends dataset will provide critical signals that individual users and businesses alike can leverage to make better data-driven decisions. This dataset simplifies the manual interaction with the existing Google Trends UI by automating and exposing anonymized, aggregated, and indexed search data in BigQuery. This dataset includes the Top 25 stories and Top 25 Rising queries from Google Trends. It will be made available as two separate BigQuery tables, with a set of new top terms appended daily. Each set of Top 25 and Top 25 rising expires after 30 days, and will be accompanied by a rolling five-year window of historical data in 210 distinct locations in the United States. This Google dataset is hosted in Google BigQuery as part of Google Cloud's Datasets solution and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery
https://spdx.org/licenses/CC0-1.0.html
Multiplexed imaging technologies provide insights into complex tissue architectures. However, challenges arise due to software fragmentation with cumbersome data handoffs, inefficiencies in processing large images (8 to 40 gigabytes per image), and limited spatial analysis capabilities. To efficiently analyze multiplexed imaging data, we developed SPACEc, a scalable end-to-end Python solution that handles image extraction, cell segmentation, and data preprocessing and incorporates machine-learning-enabled, multi-scaled spatial analysis, operated through a user-friendly and interactive interface. The demonstration dataset was derived from a previous analysis and contains TMA cores from a human tonsil and a tonsillitis sample that were acquired with the Akoya PhenoCycler-Fusion platform. The dataset can be used to test the workflow and establish it on a user's system or to familiarize oneself with the pipeline.
Methods
Tissue samples: Tonsil cores were extracted from a larger multi-tumor tissue microarray (TMA), which included a total of 66 unique tissues (51 malignant and semi-malignant tissues, as well as 15 non-malignant tissues). Representative tissue regions were annotated on corresponding hematoxylin and eosin (H&E)-stained sections by a board-certified surgical pathologist (S.Z.). Annotations were used to generate the 66 cores, each 1 mm in diameter. FFPE tissue blocks were retrieved from the tissue archives of the Institute of Pathology, University Medical Center Mainz, Germany, and the Department of Dermatology, University Medical Center Mainz, Germany. The multi-tumor TMA block was sectioned at 3 µm thickness onto SuperFrost Plus microscopy slides before being processed for CODEX multiplex imaging as previously described.
CODEX multiplexed imaging and processing: To run the CODEX machine, the slide was taken from the storage buffer and placed in PBS for 10 minutes to equilibrate. After drying the PBS with a tissue, a flow cell was sealed onto the tissue slide. The assembled slide and flow cell were then placed in a PhenoCycler Buffer made from 10X PhenoCycler Buffer & Additive for at least 10 minutes before starting the experiment. A 96-well reporter plate was prepared with each reporter corresponding to the correct barcoded antibody for each cycle, with up to 3 reporters per cycle per well. The fluorescence reporters were mixed with 1X PhenoCycler Buffer, Additive, nuclear-staining reagent, and assay reagent according to the manufacturer's instructions. With the reporter plate and the assembled slide and flow cell placed into the CODEX machine, the automated multiplexed imaging experiment was initiated. Each imaging cycle included steps for reporter binding, imaging of three fluorescent channels, and reporter stripping to prepare for the next cycle and set of markers. This was repeated until all markers were imaged. After the experiment, a .qptiff image file containing the individual antibody channels and the DAPI channel was obtained. Image stitching, drift compensation, deconvolution, and cycle concatenation are performed within the Akoya PhenoCycler software. The raw imaging data output (tiff, 377.442 nm per pixel for 20x CODEX) is first examined with QuPath software (https://qupath.github.io/) for inspection of staining quality. Any markers that produce unexpected patterns or low signal-to-noise ratios should be excluded from the ensuing analysis. The qptiff files must be converted into tiff files for input into SPACEc.
Data preprocessing includes image stitching, drift compensation, deconvolution, and cycle concatenation performed using the Akoya Phenocycler software. The raw imaging data (qptiff, 377.442 nm/pixel for 20x CODEX) files from the Akoya PhenoCycler technology were first examined with QuPath software (https://qupath.github.io/) to inspect staining qualities. Markers with untenable patterns or low signal-to-noise ratios were excluded from further analysis. A custom CODEX analysis pipeline was used to process all acquired CODEX data (scripts available upon request). The qptiff files were converted into tiff files for tissue detection (watershed algorithm) and cell segmentation.
This layer presents detectable thermal activity from VIIRS satellites for the last 7 days. VIIRS Thermal Hotspots and Fire Activity is a product of NASA's Land, Atmosphere Near real-time Capability for EOS (LANCE) Earth Observation Data, part of NASA's Earth Science Data.
Consumption Best Practices: As a service that is subject to very high usage, avoid adding filters that use a Date/Time type field. These queries are not cacheable and WILL be subject to rate limiting by ArcGIS Online. To accommodate filtering events by date/time, we encourage using the included "Age" fields, which maintain the number of days or hours since a record was created or last modified, relative to the last service update. These queries fully support the ability to cache a response, allowing common query results to be supplied to many users without adding load on the service. When ingesting this service in your applications, avoid using POST requests; these requests are not cacheable and will also be subject to rate limiting measures.
Source: NASA LANCE - VNP14IMG_NRT active fire detection - World
Scale/Resolution: 375 meters
Update Frequency: Hourly, using the aggregated live feed methodology
Area Covered: World
What can I do with this layer?
This layer represents the most frequently updated and most detailed global remotely sensed wildfire information. Detection attributes include time, location, and intensity. It can be used to track the location of fires from the recent past, from a few hours up to seven days behind real time. This layer also shows the location of wildfires over the past 7 days as a time-enabled service so that the progress of fires over that timeframe can be reproduced as an animation.
The VIIRS thermal activity layer can be used to visualize and assess wildfires worldwide. However, it should be noted that this dataset contains many “false positives” (e.g., oil/natural gas wells or volcanoes) since the satellite will detect any large thermal signal.
Fire points in this service are generally available within 3 1/4 hours after detection by a VIIRS device. LANCE estimates availability at around 3 hours after detection, and Esri Live Feeds updates this feature layer every 15 minutes from LANCE.
Even though these data display as point features, each point in fact represents a pixel that is >= 375 m high and wide. A point feature means that somewhere in this pixel at least one "hot" spot was detected, which may be a fire.
VIIRS is a scanning radiometer device aboard the Suomi NPP and NOAA-20 satellites that collects imagery and radiometric measurements of the land, atmosphere, cryosphere, and oceans in several visible and infrared bands. The VIIRS Thermal Hotspots and Fire Activity layer is a live feed from a subset of the overall VIIRS imagery, in particular from NASA's VNP14IMG_NRT active fire detection product.
The data are automatically downloaded from LANCE, NASA's near real-time data and imagery site, every 15 minutes. The 375 m data complement the 1 km Moderate Resolution Imaging Spectroradiometer (MODIS) Thermal Hotspots and Fire Activity layer; they both show good agreement in hotspot detection, but the improved spatial resolution of the 375 m data provides a greater response over fires of relatively small areas and provides improved mapping of large fire perimeters.
Attribute information
Latitude and Longitude: The center point location of the 375 m (approximately) pixel flagged as containing one or more fires/hotspots.
Satellite: Whether the detection was picked up by the Suomi NPP satellite (N) or the NOAA-20 satellite (1). For best results, use the virtual field WhichSatellite, defined by an Arcade expression, which gives the complete satellite name.
Confidence: The detection confidence is a quality flag of the individual hotspot/active fire pixel. This value is based on a collection of intermediate algorithm quantities used in the detection process. It is intended to help users gauge the quality of individual hotspot/fire pixels. Confidence values are set to low, nominal, and high. Low-confidence daytime fire pixels are typically associated with areas of sun glint and lower relative temperature anomaly (<15 K) in the mid-infrared channel I4. Nominal-confidence pixels are those free of potential sun glint contamination during the day and marked by a strong (>15 K) temperature anomaly in either day or nighttime data. High-confidence fire pixels are associated with day or nighttime saturated pixels.
Please note: Low-confidence nighttime pixels occur only over the geographic area extending from 11 deg E to 110 deg W and 7 deg N to 55 deg S. This area describes the region of influence of the South Atlantic Magnetic Anomaly, which can cause spurious brightness temperatures in the mid-infrared channel I4, leading to potential false positive alarms. These have been removed from the NRT data distributed by FIRMS.
FRP: Fire Radiative Power. Depicts the pixel-integrated fire radiative power in MW (megawatts). FRP provides information on the measured radiant heat output of detected fires. The amount of radiant heat energy liberated per unit time (the Fire Radiative Power) is thought to be related to the rate at which fuel is being consumed (Wooster et al. (2005)).
DayNight: D = Daytime fire, N = Nighttime fire
Hours Old: Derived field that provides the age of a record in hours, between the acquisition date/time and the latest update date/time. 0 = less than 1 hour ago, 1 = less than 2 hours ago, 2 = less than 3 hours ago, and so on.
Additional information can be found on the NASA FIRMS site FAQ.
Note about near real-time data: Near real-time data is not checked thoroughly before it is posted on LANCE or downloaded and posted to the Living Atlas. NASA's goal is to get vital fire information to its customers within three hours of observation time. However, the data is screened by a confidence algorithm which seeks to help users gauge the quality of individual hotspot/fire points. Low-confidence daytime fire pixels are typically associated with areas of sun glint and lower relative temperature anomaly (<15 K) in the mid-infrared channel I4. Medium-confidence pixels are those free of potential sun glint contamination during the day and marked by a strong (>15 K) temperature anomaly in either day or nighttime data.
High-confidence fire pixels are associated with day or nighttime saturated pixels.
Revisions:
September 15, 2022: Updated to include the 'Hours_Old' field. Time series has been disabled by default, but is still available.
July 5, 2022: Terms of Use updated to the Esri Master License Agreement, no longer stating that a subscription is required.
This layer is provided for informational purposes and is not monitored 24/7 for accuracy and currency. If you would like to be alerted to potential issues or simply see when this service will update next, please visit our Live Feed Status Page!
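The sketch below illustrates the consumption best practice described above: query the feature layer with a GET request and filter on the derived age field rather than a Date/Time field, so the response remains cacheable. The service URL is a placeholder rather than the real endpoint, and the exact field-name casing ("HOURS_OLD", etc.) may differ in the published layer.

```python
# Minimal sketch: query recent hotspots via the ArcGIS REST API using the
# age field instead of a Date/Time filter. The URL below is a placeholder.
import requests

layer_url = "https://example.com/arcgis/rest/services/VIIRS_Thermal_Hotspots/FeatureServer/0/query"
params = {
    "where": "HOURS_OLD <= 3",                  # detections from the last ~3 hours
    "outFields": "LATITUDE,LONGITUDE,FRP,CONFIDENCE,DAYNIGHT",
    "f": "json",
}
response = requests.get(layer_url, params=params, timeout=30)
features = response.json().get("features", [])
print(f"{len(features)} recent hotspots")
```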
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository showcases real-world operational data gathered from the power systems of the Spallation Neutron Source facility, renowned for delivering the world's most intense neutron beam. This dataset serves as a valuable resource for crafting techniques and algorithms aimed at preemptively identifying system faults, enabling timely operator intervention, and effective maintenance oversight. The authors utilized a radio-frequency test facility (RFTF) to conduct controlled laboratory experiments simulating system failures, all without triggering a catastrophic system breakdown. The dataset comprises waveform signals obtained during both regular system operations and deliberate fault induction efforts, offering a substantial amount of data for training statistical or machine learning models. Afterward, the authors carried out 21 test experiments wherein they gradually introduced faults into the RFTF system to evaluate the models' effectiveness in detecting and preempting impending faults. These experiments involved combinations of magnetic flux compensation and adjustments to start pulse width, leading to a gradual deterioration in various waveform aspects such as system output voltage and current. These alterations effectively mimicked real fault scenarios. All experiments took place at the Oak Ridge National Laboratory's Spallation Neutron Source facility in Oak Ridge, Tennessee, United States, during July 2022. The users of this dataset may include researchers in control, predictive maintenance, machine learning, and signal processing.
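As one possible starting point for the fault-detection use case described above, the sketch below trains an unsupervised detector on windows of normal waveforms and scores the fault-injection runs. The file names, array layout, and windowing are hypothetical, and IsolationForest is just one baseline choice among many statistical or machine learning models.

```python
# Minimal sketch: fit an anomaly detector on normal-operation waveforms and
# score the degraded runs. File names and array shapes are assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

normal = np.load("rftf_normal_waveforms.npy")   # hypothetical (runs, samples) array
faulty = np.load("rftf_fault_waveforms.npy")

def window_features(waveforms, width=256):
    """Split each waveform into windows and keep mean/std/peak per window."""
    n = waveforms.shape[1] // width
    w = waveforms[:, :n * width].reshape(len(waveforms), n, width)
    return np.stack([w.mean(-1), w.std(-1), np.abs(w).max(-1)], axis=-1).reshape(-1, 3)

detector = IsolationForest(random_state=0).fit(window_features(normal))
scores = detector.score_samples(window_features(faulty))  # lower = more anomalous
print(scores[:10])
```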
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Traffic Monitoring Systems: This model could be utilized to automatically monitor live traffic footage and alert authorities to any irregularities such as drivers not respecting signs. For example, it can be used to identify motorists who ignore stop signs or pedestrian-crossing rules, thereby enforcing traffic compliance and ensuring public safety.
Autonomous Vehicle Navigation: Self-driving cars could leverage this model to more accurately interpret road signs and pedestrian duty signals. This would enhance their situational awareness, enabling them to behave accordingly (e.g., stopping at stop signs, or allowing pedestrians to cross at crosswalks).
Traffic Education Software: An application could use this model to educate users about various traffic signals and signs. This could come in handy for driving schools or learning programs aimed at student and new drivers.
Urban Planning Analytics: City planners could use this model to analyze footage from around the city to understand and study the placement and effectiveness of different signage, for example, pinpointing areas where additional crossing signs may be necessary for public safety.
Augmented Reality Apps: For visually impaired individuals, an AR application could incorporate this model to identify and interpret road signs and signals, providing audio instructions to assist with navigation and safety in the real world.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Brain-computer interfaces (BCIs) provide humans with a new communication channel by encoding and decoding brain activity. The steady-state visual evoked potential (SSVEP)-based BCI stands out among many BCI paradigms because of its non-invasiveness, minimal user training, and high information transfer rate (ITR). However, the use of conductive gel and bulky hardware in the traditional electroencephalogram (EEG) method hinders the application of SSVEP-based BCIs. Besides, continuous visual stimulation during long-term use leads to visual fatigue and poses a new challenge for practical application. This study presents an open dataset collected with a wearable SSVEP-based BCI system that compares wet and dry electrodes comprehensively, with continuous recording of multiple sessions. The dataset consists of 8-channel SSVEP data from 102 healthy subjects performing a cue-guided target-selection task with a 12-target SSVEP-based BCI. For each subject, wet and dry electrodes were used to record 10 consecutive blocks each, with an overall duration of around two hours. The dataset can be used to evaluate the performance of wet and dry electrodes in SSVEP-based BCIs. The dataset also provides sufficient data for developing new target identification algorithms to improve the performance of wearable SSVEP-based BCIs.
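For readers developing target identification algorithms, the sketch below shows the classic CCA-based approach: correlate an 8-channel EEG epoch with sinusoidal reference signals for each stimulation frequency and pick the best match. The sampling rate, epoch length, and list of 12 frequencies are hypothetical placeholders, not values taken from this dataset.

```python
# Minimal sketch of CCA-based SSVEP target identification.
# Sampling rate, epoch length, and stimulation frequencies are placeholders.
import numpy as np
from sklearn.cross_decomposition import CCA

fs, n_samples = 250, 500                     # hypothetical: 2 s epoch at 250 Hz
freqs = [9.25 + 0.5 * k for k in range(12)]  # hypothetical 12-target frequencies
t = np.arange(n_samples) / fs

def references(f, n_harmonics=2):
    """Sine/cosine reference signals at the stimulation frequency and harmonics."""
    return np.column_stack([fn(2 * np.pi * h * f * t)
                            for h in range(1, n_harmonics + 1)
                            for fn in (np.sin, np.cos)])

def identify_target(epoch):                  # epoch: (n_samples, 8) EEG array
    corrs = []
    for f in freqs:
        ref = references(f)
        cca = CCA(n_components=1).fit(epoch, ref)
        u, v = cca.transform(epoch, ref)
        corrs.append(np.corrcoef(u[:, 0], v[:, 0])[0, 1])
    return int(np.argmax(corrs))             # index of the most likely target

print(identify_target(np.random.randn(n_samples, 8)))  # demo on random data
```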
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
It consists of 85,432 ads videos from Kwai, a popular short-video app in China. The videos were made and uploaded by commercial advertisers rather than personal users. The reason for using ads videos is two-fold: 1) the source guarantees that the videos are controlled to some level, such as high-resolution pictures and intentionally designed scenes; 2) the ads videos mimic the style of those uploaded by personal users, as they are played in between the personal videos in the Kwai app. It can therefore be seen as a quality-controlled UGV (user-generated video) dataset.
The dataset was collected in two batches (Batch-1 is our preliminary work), each coming with tags for the ads industry cluster. The videos were randomly picked from a pool. The pool was formed by selecting the ads from several contiguous days. Half of the selected ads had a click-through rate (CTR) in the top 30,000 within that day and the other half had a CTR in the bottom 30,000. It should be noted that the released dataset is a subset of the pool. The audio track had 2 channels (we mixed down to a mono channel in the study) and was sampled at 44.1 kHz, while the visual track had a resolution of 1280×720 and was sampled at 25 frames per second (FPS).
This dataset is an extension of the KWAI-AD corpus [3]. It is suitable not only for tasks in the multimodal learning area, but also for ads recommendation. The ads videos have three main characteristics: 1) The videos may have very inconsistent information in the visual or audio streams. For example, a video may play a drama-like story at first and then present the product introduction, whose scenes are very different. 2) The correspondence between the audio and visual streams is not clear. For instance, similar visual objects (e.g., a talking salesman) come with very different audio streams. 3) The relationship between audio and video varies across industries. For example, game and E-commerce ads have very different styles. These characteristics make the dataset suitable yet challenging for our study of AVC learning.
In the folder, you will see: audio_features.tar.gz, meta, README, samples, ad_label.npy, video_fetaures.tar.gz. The details are included in the README.
If you use our dataset, please cite our paper: "Themes Inferred Audio-visual Correspondence Learning" (https://arxiv.org/pdf/2009.06573.pdf)
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset was obtained from the OpenfMRI project (http://www.openfmri.org).
Accession #: ds007
Description: Stop-signal task with spoken & manual responses
Please cite the following references if you use these data:
Xue, G., Aron, A.R., Poldrack, R.A. (2008). Common neural substrates for inhibition of spoken and manual responses. Cereb Cortex, 18(8):1923-32. doi: 10.1093/cercor/bhm220
This dataset is made available under the Public Domain Dedication and License v1.0, whose full text can be found at http://www.opendatacommons.org/licenses/pddl/1.0/. We hope that all users will follow the ODC Attribution/Share-Alike Community Norms (http://www.opendatacommons.org/norms/odc-by-sa/); in particular, while not legally required, we hope that all users of the data will acknowledge the OpenfMRI project and NSF Grant OCI-1131441 (R. Poldrack, PI) in any publications.