Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
where the finger outline was extracted from the video.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the first attached data file, 'Subject' is the person identifier (a number assigned to distinguish the eighteen patients), 'Measurement_Number' is the sample number over the whole measurement, 'Measurement_Number_Subject' is the sample number per person, 'HR_Fingertip_Pulse_Oximeter' is the heart rate measured by the fingertip pulse oximeter, and 'HR_PPGI_Kalman' is the heart rate measured by the PPGI Kalman method. Each sample is a heart rate calculated over a 6 s sliding window. In the second attached data file, 'Number' is the sample number over the whole measurement, 'ICA_CPU_Usage' is the CPU usage measured while the PPGI ICA method estimated heart rate, and 'Kalman_CPU_Usage' is the CPU usage measured while the PPGI Kalman method estimated heart rate.
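As a rough illustration of how the first file could be used, the sketch below compares the two heart rate columns. The column names come from the description above; the rows are invented for the example, and mean absolute error is only one possible agreement measure, not necessarily the one used by the authors.

```python
import io
import pandas as pd

# Hypothetical rows in the layout described above; the values are invented.
csv_text = """Subject,Measurement_Number,Measurement_Number_Subject,HR_Fingertip_Pulse_Oximeter,HR_PPGI_Kalman
1,1,1,72,70
1,2,2,74,75
2,3,1,80,83
2,4,2,78,78
"""
df = pd.read_csv(io.StringIO(csv_text))

# Signed error of the PPGI Kalman estimate, using the oximeter as reference.
df["error"] = df["HR_PPGI_Kalman"] - df["HR_Fingertip_Pulse_Oximeter"]

# Mean absolute error over all samples and per subject.
mae_overall = df["error"].abs().mean()
mae_per_subject = df.groupby("Subject")["error"].apply(lambda e: e.abs().mean())
```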
This data was exported from Samsung's Health app. It was collected from a Samsung Fit2 (C609), model number SM-R220, paired over Bluetooth with an iPhone 7 running iOS 14.6.
The mobile phone was set to the US region, including date/time.
The heart rate data sample was recorded from June 2021 until August 2021. The sample data was simplified to remove the user's personal data, as shown below:
Original columns are: pkg_name heart_beat_count time_offset binning_data max heart_rate comment start_time deviceuuid custom end_time datauuid create_time update_time min
Sample columns are: heart_rate max start_time end_time create_time update_time min
Samsung documentation: https://www.samsung.com/ca/support/mobile-devices/gear-fit2-pro-how-to-back-up-data-stored-on-my-gear-fit2-pro/
Understanding how the device collects and stores data for psycho-physiological estimates and inference.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
forcing the analysis to be done off-line. Thus
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Processed CSV files containing about 9 months of my personal Apple Health data, including Sleep Stages and Heart Rate since 2022/09/13, the date the Sleep Tracking algorithm was first released; it soon became the market leader. All of the data comes from Apple Watch devices.
The data was exported from the Apple Health app, then processed into the CSV files.
Personal information is left out.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The heart attack datasets were collected at Zheen hospital in Erbil, Iraq, from January 2019 to May 2019. The attributes of this dataset are: age, gender, heart rate, systolic blood pressure, diastolic blood pressure, blood sugar, ck-mb and troponin with negative or positive output. According to the provided information, the medical dataset classifies either heart attack or none. The gender column in the data is normalized: the male is set to 1 and the female to 0. The glucose column is set to 1 if it is > 120; otherwise, 0. As for the output, positive is set to 1 and negative to 0.
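The encoding described above can be sketched as follows. The raw column names and values are hypothetical and serve only to show the three rules (male = 1, glucose > 120 = 1, positive = 1); the published file already contains the encoded values.

```python
import pandas as pd

# Hypothetical raw records; names and values are invented for illustration.
raw = pd.DataFrame({
    "gender": ["male", "female", "male"],
    "blood_sugar": [160.0, 98.0, 120.0],
    "output": ["positive", "negative", "negative"],
})

encoded = pd.DataFrame({
    # Male is set to 1 and female to 0.
    "gender": (raw["gender"] == "male").astype(int),
    # Glucose is set to 1 only if it is strictly greater than 120.
    "glucose": (raw["blood_sugar"] > 120).astype(int),
    # Positive output is set to 1 and negative to 0.
    "output": (raw["output"] == "positive").astype(int),
})
```

Note that a blood sugar of exactly 120 maps to 0 under the "> 120" rule.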
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LifeSnaps Dataset Documentation
Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.
The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.
Data Import: Reading CSV
For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
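A minimal sketch of the CSV route is shown below. The in-memory text stands in for one of the provided files, and the column names (id, date, steps) are hypothetical; check the actual header of the file you load.

```python
import io
import pandas as pd

# Stand-in for a daily-granularity CSV file; columns and values are hypothetical.
csv_text = "id,date,steps\nu1,2021-05-24,8123\nu1,2021-05-25,10450\n"

# With a real file, pass its path instead of the StringIO object.
daily = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Example aggregation: mean daily steps per participant.
steps_per_user = daily.groupby("id")["steps"].mean()
```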
Data Import: Setting up a MongoDB (Recommended)
To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.
To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.
For the Fitbit data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c fitbit
For the SEMA data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c sema
For surveys data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c surveys
If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
Data Availability
The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:
{
_id:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The heart is a vital and complex organ in the human body that forms with most organs between the second week of pregnancy, and fetal heart rate is an important indicator or biological index to know the condition of fetal well-being. In general, long-term measurement of fetal heart rate is the most widely used method of providing information about fetal health. In addition to fetal life, growth, and maturity, information such as congenital heart disease, often due to structural or functional defects in heart structure that often occur during the first trimester of pregnancy during fetal development, can be detected by continuous monitoring of fetal heart rate. The gold standard for monitoring the fetus’s health is the use of non-invasive methods and portable devices so that while maintaining the health of the mother and fetus, it provides the possibility of continuous monitoring, especially for mothers who have a high-risk pregnancy. Therefore, the present study aimed to propose a low-cost, compact, and portable device for recording the heart rate of 18-day-old fetal mouse heart cells. Introduced device allows non-invasive heart rate monitoring instantly and without side effects for mouse fetal heart cells. One-dimensional gold-plated plasmonic specimens as a physiological signal recorder are mainly chips with nanoarray of resonant nanowire patterns perform in an integrated platform. Here the surface plasmon waves generated in a one-dimensional plasmonic sample are paired with an electrical wave from the heart pulse, and this two-wave pairing is used to record and detect the heart rate of fetal heart cells with high accuracy and good sensitivity. This measurement was performed in normal mode and two different stimulation modes. Stimulation of cells was performed once using adrenaline and again with electrical stimulation. 
Our results show that our sensor is sensitive enough to detect heart rate in both standard and excitatory states and is also well able to detect and distinguish between changes in heart rate caused by different excitatory conditions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Section 1: Introduction
Brief overview of dataset contents:
Current database contains anonymised data collected during exercise testing services performed on male and female participants (cycling, rowing, kayaking and running) provided by the Human Performance Laboratory, School of Medicine, Trinity College Dublin, Dublin 2, Ireland.
835 graded incremental exercise test files (285 cycling, 266 rowing / kayaking, 284 running)
Description file with each row representing a test file - COLUMNS: file name (AXXX), sport (cycling, running, rowing or kayaking)
Anthropometric data of participants by sport (age, gender, height, body mass, BMI, skinfold thickness, % body fat, lean body mass and haematological data; namely, haemoglobin concentration (Hb), haematocrit (Hct), red blood cell (RBC) count and white blood cell (WBC) count)
Test data (HR, VO2 and lactate data) at rest and across a range of exercise intensities
Derived physiological indices quantifying each individual’s endurance profile
Following a request from athletes seeking assessment by phone or e-mail the test protocol, risks, benefits and test and medical requirements, were explained verbally or by return e-mail. Subsequently, an appointment for an exercise assessment was arranged following the regulatory reflection period (7 days). Following this regulatory period each participant’s verbal consent was obtained pre-test, for participants under 18 years of age parent / guardian consent was obtained in writing. Ethics approval was obtained from the Faculty of Health Sciences ethics committee and all testing procedures were performed in compliance with Declaration of Helsinki guidelines.
All consenting participants were required to attend the laboratory on one occasion in a rested, carbohydrate loaded and well-hydrated state, and for male participants’ clean shaven in the facial region. All participants underwent a pre-test medical examination, including assessment of resting blood pressure, pulmonary function testing and haematological (Coulter Counter Act Diff, Beckmann Coulter, CA,US) review performed by a qualified medical doctor prior to exercise testing. Any person presenting with any cardiac abnormalities, respiratory difficulties, symptoms of cold or influenza, musculoskeletal injury that could impair performance, diabetes, hypertension, metabolic disorders, or any other contra-indicatory symptoms were excluded. In addition, participants completed a medical questionnaire detailing training history, previous personal and family health abnormalities, recent illness or injury, menstrual status for female participants, as well as details of recent travel and current vaccination status, and current medications, supplements and allergies. Barefoot height in metre (Holtain, Crymych, UK), body mass (counter balanced scales) in kilogram (Seca, Hamburg, Germany) and skinfold thickness in millimetre using a Harpenden skinfold caliper (Bath International, West Sussex, UK) were recorded pre-exercise.
Section 2: Testing protocols
2.1: Cycling
A continuous graded incremental exercise test (GxT) to volitional exhaustion was performed on an electromagnetically braked cycle ergometer (Lode Excalibur Sport, Groningen, The Netherlands). Participants initially identified a cycling position in which they were most comfortable by adjusting saddle height, saddle fore-aft position relative to the crank axis, saddle to handlebar distance and handlebar height. Participant’s feet were secured to the ergometer using their own cycling shoes with cleats and accompanying pedals. The protocol commenced with a 15-min warm-up at a workload of 120 Watt (W), followed by a 10-min rest. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a workload of 100 or 120 W for female and male participants, respectively, and subsequently increasing by a 20, 30 or 40 W incremental increase every 3-min depending on gender and current competition category. During assessment participants maintained a constant self-selected cadence chosen during their warm-up (permitted window was 5 rev.min−1 within a permitted absolute range of 75 to 95 rev.min−1) and the test was terminated when a participant was no longer able to maintain a constant cadence.
Heart rate (HR) data were recorded continuously by radio-telemetry using a Cosmed HR monitor (Cosmed, Rome, Italy). During the test, blood samples were collected from the middle finger of the right hand at the end of the second minute of each 3-min interval. The fingertip was cleaned to remove any sweat or blood and lanced using a long point sterile lancet (Braun, Melsungen, Germany). The blood sample was collected into a heparinised capillary tube (Brand, Wertheim, Germany) by holding the tube horizontal to the droplet and allowing transfer by capillary action. Subsequently, a 25μL aliquot of whole blood was drawn from the capillary tube using a YSI syringepet (YSI, OH, USA) and added into the chamber of a YSI 1500 Sport lactate analyser (YSI, OH, USA) for determination of non-lysed [Lac] in mmol.L−1. The lactate analyser was calibrated to the manufacturer’s requirements (± 0.05 mmol.L−1) before each test using a standard solution (YSI, OH, USA) of known concentration (5 mmol.L−1) and analyser linearity was confirmed using either a 15 or 30 mmol.L-1 standard solution (YSI, OH, USA).
Gas exchange variables including respiration rate (Rf in breaths.min-1), minute ventilation (VE in L.min-1), oxygen consumption (VO2 in L.min-1 and in mL.kg-1.min-1) and carbon dioxide production (VCO2 in L.min-1), were measured on a breath-by-breath basis throughout the test, using a cardiopulmonary exercise testing unit (CPET) and an associated software package (Cosmed, Rome, Italy). Participants wore a face mask (Hans Rudolf, KA, USA) which was connected to the CPET unit. The metabolic unit was calibrated prior to each test using ambient air and an alpha certified gas mixture containing 16% O2, 5% CO2 and 79% N2 (Cosmed, Rome, Italy). Volume calibration was performed using a 3L gas calibration syringe (Cosmed, Rome, Italy). Barometric pressure recorded by the CPET was confirmed by recording barometric pressure using a laboratory grade barometer.
Following testing mean HR and mean VO2 data at rest and during each exercise increment were computed and tabulated over the final minute of each 3-min interval. A graphical plot of [Lac], mean VO2 and mean HR versus cycling workload was constructed and analysed to quantify physiological endurance indices, see Data Analysis section. Data for VO2 peak in L.min-1 (absolute) and in mL.kg-1.min-1 (relative) and VE peak in L.min-1 were reported as the peak data recorded over any 10 consecutive breaths recorded during the last minute of the final exercise increment.
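The per-increment averaging described above can be sketched as follows. This assumes a trace resampled to one value per second (the gas data in the source are breath-by-breath, so real data would need a time column per breath) and uses synthetic HR values; the names trace, STAGE_S and stage_means are illustrative only.

```python
import numpy as np
import pandas as pd

# Synthetic HR trace, one sample per second, covering two 3-min increments.
t = np.arange(0, 360)  # seconds since the start of the active phase
hr = np.where(t < 180, 120.0, 140.0) + (t % 180) / 60.0  # slow drift within each increment
trace = pd.DataFrame({"t_s": t, "hr": hr})

STAGE_S = 180  # increment length: 3 min
trace["stage"] = trace["t_s"] // STAGE_S

# Keep only the final minute of each increment (seconds 120 to 179 within the increment).
final_minute = trace[(trace["t_s"] % STAGE_S) >= STAGE_S - 60]

# Mean HR over the final minute of each 3-min increment.
stage_means = final_minute.groupby("stage")["hr"].mean()
```

The same selection applies to VO2; only the column being averaged changes.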
2.2: Running protocol
A continuous graded incremental exercise test (GxT) to volitional exhaustion was performed on a motorised treadmill (Powerjog, Birmingham, UK). The running protocol, performed at a gradient of 0%, commenced with a 15-min warm-up at a velocity (km.h-1) which was lower than the participant’s reported typical weekly long run (>60 min) on-road training velocity. Subsequently, the warm-up was followed by a 10-min rest / dynamic stretching phase. From a safety perspective, during all running GxT participants wore a suspended lightweight safety harness to minimise any potential falls risk. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a sub-maximal running velocity which was lower than the participant’s reported typical weekly long run (>60 min) on-road training velocity, and subsequently increased by ≥ 1 km.h-1 every 3-min depending on gender and current competition category. The test was terminated when a participant was no longer able to maintain the imposed treadmill velocity.
Measurement variables, equipment and pre-test calibration procedures, timing and procedure for measurement of selected variables and subsequent data analysis were as outlined in Section 2.1.
2.3: Rowing / kayaking protocol
A discontinuous graded incremental exercise test (GxT) to volitional exhaustion was performed on a Concept 2C rowing ergometer (Concept, VA, US) in rowers or a Dansprint kayak ergometer (Dansprint, Hvidovre, Denmark) in flat-water kayakers. The protocol commenced with a 15-min low-intensity warm-up at a workload (W) dependent on gender, sport and competition category, followed by a 10-min rest. For rowing the flywheel damping (120, 125 or 130W) was set dependent on gender and competition category. For kayaking the bungee cord tension was adjusted by individual participants to suit their requirements. A discontinuous protocol of 3-min exercise at a targeted load followed by a 1-min rest phase to facilitate stationary earlobe capillary blood sample collection and resetting of ergometer display (Dansprint ergometer) was used. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a sub-maximal load 80 to 120 W for rowing, 50 to 90 W for kayaking and subsequently increased by 20,30 or 40 W every 3-min depending on gender, sport and current competition category. The test was terminated when a participant was no longer able to maintain the targeted workload.
Measurement variables, equipment and pre-test calibration procedures, timing and procedure for measurement of selected variables and subsequent data analysis were as outlined in Section 2.1.
3.1: Data analysis
Constructed graphical plots (HR, VO2 and [Lac] versus load / velocity) were analysed to quantify the following: load / velocity at TLac, HR at TLac, [Lac] at TLac, % of VO2 peak at TLac, % of HRmax at TLac, load / velocity and HR at a nominal [Lac] of 2 mmol.L-1, load / velocity, VO2 and [Lac] at a nominal HR of
This dataset comprises heart rate variability (HRV) indices computed from the multimodal SWELL knowledge work (SWELL-KW) dataset for research on stress and user modeling (see http://cs.ru.nl/~skoldijk/SWELL-KW/Dataset.html). The SWELL dataset was collected by researchers at the Institute for Computing and Information Sciences at Radboud University. It is the result of experiments conducted on 25 subjects doing typical office work (for example writing reports, making presentations, reading e-mail and searching for information). The subjects were exposed to typical working stressors such as unexpected email interruptions and pressure to complete their work on time. The experiment recorded various data including computer logging, facial expression, body postures, ECG signal, and skin conductance. The researchers also recorded the subjects’ subjective experience of task load, mental effort, emotion, and perceived stress. Each participant went through three different working conditions:
The original dataset contains the raw ECG signal and a feature dataset annotated with the conditions under which the data was collected. It also contains a single heart rate variability (HRV) feature (RMSSD only), computed every minute. This was limiting and led to the conclusion that HRV was not a good predictor of office stressors. This dataset is more comprehensive and allows the stressor to be predicted with 99.25% accuracy. For more details, refer to our published paper included in this dataset.
Nkurikiyeyezu, K., Yokokubo, A., & Lopez, G. (2020). The Effect of Person-Specific Biometrics in Improving Generic Stress Predictive Models. Journal of Sensors & Material, 1–12. http://arxiv.org/abs/1910.01770
K. Nkurikiyeyezu, K. Shoji, A. Yokokubo, and G. Lopez, “Thermal Comfort and Stress Recognition in Office Environment.” Prague, Czech: SCITEPRESS - Science and Technology Publications, 2019.
All the credit goes to Koldijk et al. for collecting and freely sharing the SWELL dataset. If you find this research helpful, please consider citing their papers.
Drug Cardiotoxicity dataset [1-2] is a molecule classification task to detect cardiotoxicity caused by binding hERG target, a protein associated with heart beat rhythm. The data covers over 9000 molecules with hERG activity.
Note:
The data is split into four splits: train, test-iid, test-ood1, test-ood2.
Each molecule in the dataset has 2D graph annotations which is designed to facilitate graph neural network modeling. Nodes are the atoms of the molecule and edges are the bonds. Each atom is represented as a vector encoding basic atom information such as atom type. Similar logic applies to bonds.
We include Tanimoto fingerprint distance (to training data) for each molecule in the test sets to facilitate research on distributional shift in graph domain.
For each example, the features include:
atoms: a 2D tensor with shape (60, 27) storing node features. Molecules with fewer than 60 atoms are padded with zeros. Each atom has 27 atom features.
pairs: a 3D tensor with shape (60, 60, 12) storing edge features. Each edge has 12 edge features.
atom_mask: a 1D tensor with shape (60,) storing node masks. 1 indicates the corresponding atom is real, otherwise a padded one.
pair_mask: a 2D tensor with shape (60, 60) storing edge masks. 1 indicates the corresponding edge is real, otherwise a padded one.
active: a one-hot vector indicating if the molecule is toxic or not. [0, 1] indicates it is toxic, otherwise [1, 0] (non-toxic).
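To make the padding and mask convention concrete, here is a small NumPy sketch. It only mimics the shapes listed above with random numbers; pad_atoms and is_toxic are hypothetical helpers, not part of the dataset or of tensorflow_datasets.

```python
import numpy as np

MAX_ATOMS, N_ATOM_FEATS = 60, 27  # shapes taken from the feature description

def pad_atoms(atom_feats):
    """Zero-pad an (n, 27) feature matrix to (60, 27) and build the matching mask."""
    n = atom_feats.shape[0]
    atoms = np.zeros((MAX_ATOMS, N_ATOM_FEATS), dtype=np.float32)
    atoms[:n] = atom_feats
    atom_mask = np.zeros((MAX_ATOMS,), dtype=np.float32)
    atom_mask[:n] = 1.0  # 1 marks a real atom, 0 a padded one
    return atoms, atom_mask

def is_toxic(active):
    """Decode the one-hot label: [0, 1] is toxic, [1, 0] is non-toxic."""
    return int(np.argmax(active)) == 1

# A made-up molecule with 5 atoms and random features.
rng = np.random.default_rng(0)
atoms, atom_mask = pad_atoms(rng.random((5, N_ATOM_FEATS), dtype=np.float32))

# Masked mean over real atoms only, so padded rows do not dilute the average.
pooled = (atoms * atom_mask[:, None]).sum(axis=0) / atom_mask.sum()
```

Dividing by atom_mask.sum() rather than by 60 is what keeps molecules of different sizes comparable.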
[1]: V. B. Siramshetty et al. Critical Assessment of Artificial Intelligence Methods for Prediction of hERG Channel Inhibition in the Big Data Era. JCIM, 2020. https://pubs.acs.org/doi/10.1021/acs.jcim.0c00884
[2]: K. Han et al. Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift. NeurIPS DistShift Workshop 2021. https://arxiv.org/abs/2111.12951
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cardiotox', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Replication data was collected from twenty (n=20) horses diagnosed with sEA and twenty (n=20) asymptomatic (non-sEA) horses. In the sEA group, data regarding the history of the disease was collected, and, after physical and endoscopic examination, tracheal wash and bronchoalveolar lavage samples were taken. The sEA horses showed clinical symptoms and received no treatment at the time of the measurement. The RR intervals of the ECG were recorded for 1 hour at rest between 9 a.m. and 11 a.m. using a heart rate (HR) monitor. The file includes HRV data analyzed with time-domain, frequency-domain, and geometric methods.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the data associated with our research project titled Impact of delayed response on Wearable Cognitive Assistance. A preprint of the associated paper can be found at https://arxiv.org/abs/2011.02555.
Title of Dataset: Impact of delayed response on Wearable Cognitive Assistance
Author Information
First Author Contact Information Name: Manuel Olguín Muñoz Institution: School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology Address: Malvinas väg 10, Stockholm 11428, Sweden Email: molguin@kth.se Phone Number: +46 73 652 7628
Author Contact Information Name: Roberta L. Klatzky Institution: Department of Psychology, Carnegie Mellon University Address: 5000 Forbes Ave, Pittsburgh, PA 15213 Email: klatzky@cmu.edu Phone Number: +1 412 268 8026
Author Contact Information Name: Mahadev Satyanarayanan Institution: School of Computer Science, Carnegie Mellon University Address: 5000 Forbes Ave, Pittsburgh, PA 15213 Email: satya@cs.cmu.edu Phone Number: +1 412 268 3743
Author Contact Information Name: James R. Gross Institution: School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology Address: Malvinas väg 10, Stockholm 11428, Sweden Email: jamesgr@kth.se Phone Number: +46 8 790 8819
Directory of Files: A. Filename: accelerometer_data.csv Short description: Time-series accelerometer data. Each row corresponds to a sample.
B. Filename: block_aggregate.csv
Short description: Contains the block- and slice-level aggregates for each of the metrics and statistics present in this dataset. Each row corresponds to either a full block or a slice of a block, see below for details.
C. Filename: block_metadata.csv
Short description: Contains the metadata for each block in the task for each participant. Each row corresponds to a block.
D. Filename: bvp_data.csv
Short description: Time-series blood-volume-pulse data. Each row corresponds to a sample.
E. Filename: eeg_data.csv
Short description: Time-series electroencephalogram data, represented as power per band. Each row corresponds to a sample; power was calculated in 0.5 second intervals.
F. Filename: frame_metadata.csv
Short description: Contains the metadata for each video frame processed by the cognitive assistant. Each row corresponds to a processed frame.
G. Filename: gsr_data.csv
Short description: Time-series galvanic skin response data. Each row corresponds to a sample.
H. Filename: task_step_metadata.csv
Short description: Contains the metadata for each step in the task for each participant. Each row corresponds to a step in the task.
I. Filename: temperature_data.csv
Short description: Time-series thermometer data. Each row corresponds to a sample.
Additional Notes on File Relationships, Context, or Content (for example, if a user wants to reuse and/or cite your data, what information would you want them to know?):
The data contained in these CSVs was obtained from 40 participants in a study performed with approval from the Carnegie Mellon University Institutional Research Board. In this study, participants were asked to interact with a Cognitive Assistant while wearing an array of physiological sensors. The data contained in this dataset corresponds to the actual collected data, after some preliminary preprocessing to convert from sensors readings into meaningful values.
Participants have been anonymized using random integer identifiers.
block_aggregate.csv can be replicated by cross-referencing the start and end timestamps of each block in block_metadata.csv and the timestamps for each desired metric.
The actual video frames mentioned in frame_metadata.csv are not included in the dataset since their contents were not relevant to the research.
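The cross-referencing mentioned for block_aggregate.csv could look roughly like the sketch below, shown for the heart rate columns. The frames are tiny invented stand-ins for block_metadata.csv and bvp_data.csv, the half-open interval [start, end) is an assumption, and aggregate_bpm is a hypothetical helper rather than the authors' code.

```python
import pandas as pd

# Invented stand-in for block_metadata.csv (subset of columns).
blocks = pd.DataFrame({
    "participant": [1, 1],
    "seq": [1, 2],
    "start": pd.to_datetime(["2020-01-01 10:00:00", "2020-01-01 10:01:00"]),
    "end": pd.to_datetime(["2020-01-01 10:01:00", "2020-01-01 10:02:00"]),
})

# Invented stand-in for bvp_data.csv (subset of columns).
bvp = pd.DataFrame({
    "participant": [1, 1, 1, 1],
    "timestamp": pd.to_datetime([
        "2020-01-01 10:00:10", "2020-01-01 10:00:40",
        "2020-01-01 10:01:15", "2020-01-01 10:01:45",
    ]),
    "bpm": [70.0, 74.0, 80.0, 90.0],
})

def aggregate_bpm(blocks, samples):
    """Average bpm over the samples that fall inside each block's time window."""
    rows = []
    for block in blocks.itertuples(index=False):
        in_block = samples[
            (samples["participant"] == block.participant)
            & (samples["timestamp"] >= block.start)
            & (samples["timestamp"] < block.end)  # assumption: end is exclusive
        ]
        rows.append({
            "participant": block.participant,
            "block_seq": block.seq,
            "bpm_mean": in_block["bpm"].mean(),  # rows without a bpm value are skipped
            "bpm_std": in_block["bpm"].std(),    # pandas default: sample std (ddof=1)
        })
    return pd.DataFrame(rows)

aggregates = aggregate_bpm(blocks, bvp)
```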
File Naming Convention: N/A
Number of variables: 7
Number of cases/rows: 1844688
Missing data codes: N/A
Variable list:
A. Name: timestamp Description: Timestamp of the sample.
B. Name: x Description: Acceleration reading from the x-axis of the accelerometer in g-forces [g].
C. Name: y Description: Acceleration reading from the y-axis of the accelerometer in g-forces [g].
D. Name: z Description: Acceleration reading from the z-axis of the accelerometer in g-forces [g].
E. Name: ts Description: Time difference with respect to first sample.
F. Name: participant Description: Denotes the numeric ID representing each individual participant.
G. Name: delay Description: Delay that was being applied on the task when this reading was obtained in time delta format.
Number of variables: 16
Number of cases/rows: 2520
Missing data codes:
Variable List:
A. Name: participant Description: Denotes the numeric ID representing each individual participant.
B. Name: block_seq Description: Denotes the position of the block in the task. Ranges from 1 to 21.
C. Name: slice Description: Index of the 4-step slice of the block over which the data was aggregated. Ranges from 0 to 2, however higher values are only applicable for blocks of appropriate length (i.e. blocks of length 4 only have a 0-slice, length 8 have 0 and 1, and length 12 have slices from 0 to 2). A missing value indicates that this row instead contains aggregate values for the whole block.
D. Name: block_length Description: Length of the block. Valid values are 4, 8 and 12.
E. Name: block_delay Description: Delay applied to the block, in seconds.
F. Name: start Description: Timestamp marking the start of the block or slice.
G. Name: end Description: Timestamp marking the end of the block or slice.
H. Name: duration Description: Duration of the block or slice, in seconds.
I. Name: exec_time_per_step_mean Description: Mean execution time for each step in the block or slice.
J. Name: bpm_mean Description: Mean heart rate, in beats-per-minute, for the block or slice.
K. Name: bpm_std Description: Standard deviation of the heart rate, in beats-per-minute, for the block or slice.
L. Name: gsr_per_second Description: Galvanic skin response in microsiemens, summed and then normalized by block or slice duration.
M. Name: movement_score Description: Movement score for the block or slice. The movement score is calculated as the sum of the magnitude of all the acceleration vectors in the block or slice, divided by duration in seconds.
N. Name: eeg_alpha_log_mean Description: Log of the average EEG power for the alpha band, for the block or slice.
O. Name: eeg_beta_log_mean Description: Log of the average EEG power for the beta band, for the block or slice.
P. Name: eeg_total_log_mean Description: Log of the average EEG power for the complete EEG signal, for the block or slice.
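As an illustration of the movement_score definition in the list above (sum of the magnitudes of all acceleration vectors, divided by the duration in seconds), a minimal sketch follows. The function name and the sample values are made up for the example.

```python
import numpy as np

def movement_score(x, y, z, duration_s):
    """Sum of acceleration vector magnitudes, divided by duration in seconds."""
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    magnitudes = np.sqrt(x**2 + y**2 + z**2)
    return magnitudes.sum() / duration_s

# Two invented samples with magnitudes 5 g and 1 g over a 2-second window.
score = movement_score([3.0, 0.0], [4.0, 0.0], [0.0, 1.0], duration_s=2.0)
```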
Number of variables: 8
Number of cases/rows: 880
Missing data codes: N/A
Variable list:
A. Name: participant Description: Denotes the numeric ID representing each individual participant.
B. Name: seq Description: Index of the block in the task, ranging from 0 to 21. Note that block 0 is not to be included in aggregate calculations.
C. Name: length Description: Length of the block in number of steps.
D. Name: delay Description: Delay applied to the block.
E. Name: start Description: Timestamp marking the start of the block.
F. Name: end Description: Timestamp marking the end of the block.
G. Name: duration Description: Duration of the block as a timedelta.
H. Name: exec_time Description: Execution time of the block as a timedelta.
Number of variables: 8
Number of cases/rows: 3683504
Missing data codes: Columns bpm and ibi only contain values for rows corresponding to a sample taken at a heartbeat.
Variable list:
A. Name: ts Description: Time difference with respect to first sample.
B. Name: timestamp Description: Timestamp of the sample.
C. Name: bvp Description: Blood-volume-pulse reading, in millivolts.
D. Name: onset Description: Boolean indicating if this sample corresponds to the onset of a pulse.
E. Name: bpm Description: Instantaneous beat-per-minute value.
F. Name: ibi Description: Instantaneous inter-beat-interval value.
G. Name: delay Description: Delay that was being applied on the task when this reading was obtained in time delta format.
H. Name: participant Description: Denotes the numeric ID representing each individual
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
A beginner-friendly version of the MIT-BIH Arrhythmia Database, which contains 48 electrocardiograms (EKGs) from 47 patients that were at Beth Israel Deaconess Medical Center in Boston, MA in 1975-1979.
There are 48 CSVs, each of which is a 30-minute electrocardiogram (EKG) from a single patient (records 201 and 202 are from the same patient). Data was collected at 360 Hz, meaning that 360 data points correspond to 1 second of time.
Banner photo by Joshua Chehov on Unsplash.
EKGs, or electrocardiograms, measure the heart's function by looking at its electrical activity. The electrical activity in each part of the heart is supposed to happen in a particular order and intensity, creating that classic "heartbeat" line (or "QRS complex") you see on monitors in medical TV shows.
There are a few types of EKGs (4-lead, 5-lead, 12-lead, etc.), which give us varying detail about the heart. A 12-lead is one of the most detailed types of EKGs, as it allows us to get 12 different outputs or graphs, all looking at different, specific parts of the heart muscles.
This dataset only publishes two leads from each patient's 12-lead EKG, since that is all that the original MIT-BIH database provided.
Check out Ninja Nerd's EKG Basics tutorial on YouTube to understand what each part of the QRS complex (or heartbeat) means from an electrical standpoint.
Each file's name is the ID of the patient (except for 201 and 202, which are the same person).
The two leads are often lead MLII and another lead such as V1, V2, or V5, though some datasets do not use MLII at all. MLII is the lead most often associated with the classic QRS complex (the medical name for a single heartbeat).
Milliseconds were calculated and added as a secondary index to each dataset. Calculations were made by dividing the index by 360 Hz, then multiplying by 1000 (index / 360 * 1000). The original index was preserved, since recalculating milliseconds during digital signal processing (e.g. filtering) may cause issues with the correlation and merging of data. You are encouraged to try whichever index is most suitable for your analysis and/or recalculate a time index with Pandas' to_timedelta().
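The index-to-time conversion described above can be sketched in a few lines. This example uses only the Python standard library; the function names are illustrative and not part of the dataset. With pandas, passing the millisecond values to pd.to_timedelta with unit="ms" would play the same role as the timedelta helper here.

```python
from datetime import timedelta

SAMPLING_RATE_HZ = 360  # sampling rate stated in the dataset description


def index_to_ms(index, rate_hz=SAMPLING_RATE_HZ):
    """Convert a sample index to elapsed milliseconds: index / 360 * 1000."""
    return index / rate_hz * 1000


def index_to_timedelta(index, rate_hz=SAMPLING_RATE_HZ):
    """Convert a sample index to a timedelta measured from the first sample."""
    return timedelta(seconds=index / rate_hz)


# A 30-minute record sampled at 360 Hz holds this many data points.
samples_per_record = 30 * 60 * SAMPLING_RATE_HZ

print(index_to_ms(360))         # one second of samples, in milliseconds
print(index_to_timedelta(720))  # two seconds of samples
print(samples_per_record)
```

Keeping the integer sample index alongside any derived time column avoids floating-point mismatches when records are merged after filtering or resampling.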
Info about each of the 47 patients is available here, including age, gender, medications, diagnoses, etc.
Physionet has some online tutorials and tips for analyzing EKGs and other time series / digital signals.
Check out our notebook for opening and visualizing the data.
A write-up on how the data was converted from .dat to .csv files is available on Medium.com. Data was downloaded from the MIT-BIH Arrhythmia Database then converted to CSV.
Moody GB, Mark RG. The impact of the MIT-BIH Arrhythmia Database. IEEE Eng in Med and Biol 20(3):45-50 (May-June 2001). (PMID: 11446209)
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Open Data Commons Attribution License (ODC-By) v1.0https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample.
https://spdx.org/licenses/CC0-1.0.html
Detrended Fluctuation Analysis (DFA) is an algorithm widely used to determine fractal long-range correlations in physiological signals. Its application to heart rate variability (HRV) has proven useful in distinguishing healthy subjects from patients with cardiovascular disease. In this study we examined the effect of respiratory sinus arrhythmia (RSA) on the performance of DFA applied to HRV. Predictions based on a mathematical model were compared with those obtained from a sample of 14 normal subjects at three breathing frequencies: 0.1 Hz, 0.2 Hz and 0.25 Hz. Results revealed that: (1) the periodical properties of RSA produce a change of the correlation exponent in HRV at a scale corresponding to the respiratory period, (2) the short-term DFA exponent is significantly reduced when breathing frequency rises from 0.1 Hz to 0.2 Hz. These findings raise important methodological questions regarding the application of fractal measures to short-term HRV.
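For readers unfamiliar with the DFA exponent mentioned in this abstract, the following is a minimal sketch of standard first-order DFA in plain Python. It is not the study's implementation: the choice of scales, the use of non-overlapping windows, and the function names are assumptions made for illustration, and the abstract does not specify those details.

```python
import math
import random


def _detrended_sum_squares(window):
    """Sum of squared residuals of a window around its least-squares line."""
    n = len(window)
    x_mean = (n - 1) / 2
    y_mean = sum(window) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(window))
    slope = sxy / sxx
    return sum((v - y_mean - slope * (i - x_mean)) ** 2
               for i, v in enumerate(window))


def dfa_exponent(series, scales):
    """Estimate the DFA scaling exponent of a series (each scale must be >= 3)."""
    # Step 1: integrate the mean-centred series to obtain the profile.
    mean = sum(series) / len(series)
    profile, running = [], 0.0
    for value in series:
        running += value - mean
        profile.append(running)

    # Step 2: for each scale n, detrend non-overlapping windows and
    # compute the root-mean-square fluctuation F(n).
    log_n, log_f = [], []
    for n in scales:
        windows = len(profile) // n
        total = sum(_detrended_sum_squares(profile[w * n:(w + 1) * n])
                    for w in range(windows))
        log_n.append(math.log(n))
        log_f.append(math.log(math.sqrt(total / (windows * n))))

    # Step 3: the exponent is the slope of log F(n) against log n.
    mx = sum(log_n) / len(log_n)
    my = sum(log_f) / len(log_f)
    return (sum((a - mx) * (b - my) for a, b in zip(log_n, log_f))
            / sum((a - mx) ** 2 for a in log_n))


rng = random.Random(0)  # fixed seed so the demo is repeatable
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha = dfa_exponent(white_noise, scales=[8, 16, 32, 64, 128])
print(f"alpha = {alpha:.2f}")  # uncorrelated noise is expected to give a value near 0.5
```

A periodic component such as RSA would show up as a bend in the log F(n) curve near the window size matching its period, which is why the choice of scales matters for short-term exponents.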
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Demographics of patients in the sample (n = 788).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains vital sign measurements (heart rate and respiratory rate) from mm-wave Frequency-Modulated Continuous Wave (FMCW) radar. It includes data from ten participants across scenarios like resting, elevated heart rate activities, and accounts for extreme physiological conditions such as asthma and meditation. Each record is validated against the Polar H10 sensor. The dataset is structured to facilitate research in non-invasive health monitoring, providing ADC samples, range maps, and chest displacement signals.
Note: The detailed guide for this dataset can be found at https://doi.org/10.48550/arXiv.2405.12659. If you use the dataset, please cite this article.
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
The Pulse Wave Database
The Pulse Wave Database (PWDB) is a database of simulated arterial pulse waves designed to be representative of a sample of pulse waves measured from healthy adults. It contains pulse waves for 4,374 virtual subjects, aged 25 to 75 years (in 10-year increments). The database contains a baseline set of pulse waves for each of the six age groups, created using cardiovascular properties (such as heart rate and arterial stiffness) which are representative of healthy subjects in each age group. It also contains 728 further virtual subjects in each age group, in which each of the cardiovascular properties is varied within normal ranges. The entire database is available at DOI: 10.5281/zenodo.2633174.
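As a quick consistency check on the counts above, the total of 4,374 virtual subjects follows from six age groups that each hold one baseline subject plus 728 variations. The sketch assumes exactly one baseline subject per age group; the variable names are illustrative.

```python
# Age groups from 25 to 75 in 10-year increments: 25, 35, 45, 55, 65, 75.
age_groups = list(range(25, 76, 10))

baseline_per_group = 1    # assumption: one baseline subject per age group
varied_per_group = 728    # further virtual subjects stated per age group

total_subjects = len(age_groups) * (baseline_per_group + varied_per_group)
print(len(age_groups), total_subjects)
```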
This dataset: baseline subjects aged 25 to 75
This dataset is a subset of the PWDB. It contains the pulse waves for the six baseline subjects aged 25 to 75 (in 10-year increments). It contains the following waves:
arterial flow velocity (U),
luminal area (A),
pressure (P), and
photoplethysmogram (PPG).
These pulse waves are provided at a range of measurement sites, including:
aorta (ascending and descending)
carotid artery
brachial artery
radial artery
finger
femoral artery
The data are available in three formats: Matlab, CSV and WaveForm Database (WFDB) format. Further details of the formatting and contents of each file are available at: https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database
Accompanying Publication
This is a subset of the PWDB database, which is described in the following publication:
Charlton P.H., Mariscal Harana, J., Vennin, S., Li, Y., Chowienczyk, P. & Alastruey, J., “Modelling arterial pulse waves in healthy ageing: a database for in silico evaluation of haemodynamics and pulse wave indices,” [under review]
Please cite this publication when using the database.
Further Information
Further information on the Pulse Wave Database project can be found at: https://peterhcharlton.github.io/pwdb/
Version History
Version 1.0 : provided for peer review of "Modelling arterial pulse waves in healthy ageing: a database for in silico evaluation of haemodynamics and pulse wave indices"
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Heart Rate Variability (HRV) analysis aims to characterize the physiological state affecting heart rate, and identify potential markers of underlying pathologies. This typically involves calculating various HRV indices for each recording of two or more populations. Then, statistical tests are used to find differences. The normality of the indices, the number of groups being compared, and the correction of the significance level should be considered in this step. Especially for large studies, this process is tedious and error-prone. This paper presents RHRVEasy, an R open-source package that automates all the steps of HRV analysis. RHRVEasy takes as input a list of folders, each containing all the recordings of the same population. The package loads and preprocesses heart rate data, and computes up to 31 HRV time, frequency, and non-linear indices. Notably, it automates the computation of non-linear indices, which typically demands manual intervention. It then conducts hypothesis tests to find differences between the populations, adjusting significance levels if necessary. It also performs a post-hoc analysis to identify the differing groups if there are more than two populations. RHRVEasy was validated using a database of healthy subjects, and another of congestive heart failure patients. Significant differences in many HRV indices are expected between these groups. Two additional groups were constructed by random sampling of the original databases. Each of these groups should present no statistically significant differences with the group from which it was sampled, and it should present differences with the other two groups. All tests produced the expected results, demonstrating the software’s capability in simplifying HRV analysis. Code is available on https://github.com/constantino-garcia/RHRVEasy.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
where the finger outline was extracted from the video.