Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Overview:
This dataset contains simulated (hypothetical) but almost realistic (based on AI) data related to sleep, heart rate, and exercise habits of 500 individuals. It includes both pre-exercise and post-exercise resting heart rates, allowing for analyses such as a dependent t-test (Paired Sample t-test) to observe changes in heart rate after an exercise program. The dataset also includes additional health-related variables, such as age, hours of sleep per night, and exercise frequency.
The data is designed for tasks involving hypothesis testing, health analytics, or even machine learning applications that predict changes in heart rate based on personal attributes and exercise behavior. It can be used to understand the relationships between exercise frequency, sleep, and changes in heart rate.
File: Filename: heart_rate_data.csv File Format: CSV
- Features (Columns):
Age: Description: The age of the individual. Type: Integer Range: 18-60 years Relevance: Age is an important factor in determining heart rate and the effects of exercise.
Sleep Hours: Description: The average number of hours the individual sleeps per night. Type: Float Range: 3.0 - 10.0 hours Relevance: Sleep is a crucial health metric that can impact heart rate and exercise recovery.
Exercise Frequency (Days/Week): Description: The number of days per week the individual engages in physical exercise. Type: Integer Range: 1-7 days/week Relevance: More frequent exercise may lead to greater heart rate improvements and better cardiovascular health.
Resting Heart Rate Before: Description: The individual’s resting heart rate measured before beginning a 6-week exercise program. Type: Integer Range: 50 - 100 bpm (beats per minute) Relevance: This is a key health indicator, providing a baseline measurement for the individual’s heart rate.
Resting Heart Rate After: Description: The individual’s resting heart rate measured after completing the 6-week exercise program. Type: Integer Range: 45 - 95 bpm (lower than the "Resting Heart Rate Before" due to the effects of exercise). Relevance: This variable is essential for understanding how exercise affects heart rate over time, and it can be used to perform a dependent t-test analysis.
Max Heart Rate During Exercise: Description: The maximum heart rate the individual reached during exercise sessions. Type: Integer Range: 120 - 190 bpm Relevance: This metric helps in understanding cardiovascular strain during exercise and can be linked to exercise frequency or fitness levels.
Potential Uses: Dependent T-Test Analysis: The dataset is particularly suited for a dependent (paired) t-test where you compare the resting heart rate before and after the exercise program for each individual.
Exploratory Data Analysis (EDA):Investigate relationships between sleep, exercise frequency, and changes in heart rate. Potential analyses include correlations between sleep hours and resting heart rate improvement, or regression analyses to predict heart rate after exercise.
Machine Learning: Use the dataset for predictive modeling, and build a beginner regression model to predict post-exercise heart rate using age, sleep, and exercise frequency as features.
Health and Fitness Insights: This dataset can be useful for studying how different factors like sleep and age influence heart rate changes and overall cardiovascular health.
License: Choose an appropriate open license, such as:
CC BY 4.0 (Attribution 4.0 International).
Inspiration for Kaggle Users: How does exercise frequency influence the reduction in resting heart rate? Is there a relationship between sleep and heart rate improvements post-exercise? Can we predict the post-exercise heart rate using other health variables? How do age and exercise frequency interact to affect heart rate?
Acknowledgments: This is a simulated dataset for educational purposes, generated to demonstrate statistical and machine learning applications in the field of health analytics.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a detailed overview of gym members' exercise routines, physical attributes, and fitness metrics. It contains 973 samples of gym data, including key performance indicators such as heart rate, calories burned, and workout duration. Each entry also includes demographic data and experience levels, allowing for comprehensive analysis of fitness patterns, athlete progression, and health trends.
Key Features:
This dataset is ideal for data scientists, health researchers, and fitness enthusiasts interested in studying exercise habits, modeling fitness progression, or analyzing the relationship between demographic and physiological data. With a wide range of variables, it offers insights into how different factors affect workout intensity, endurance, and overall health.
Facebook
TwitterThe EPA Control Measure Dataset is a collection of documents describing air pollution control available to regulated facilities for the control and abatement of air pollution emissions from a range of regulated source types, whether directly through the use of technical measures, or indirectly through economic or other measures.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Study information The sample included in this dataset represents five children who participated in a number line intervention study. Originally six children were included in the study, but one of them fulfilled the criterion for exclusion after missing several consecutive sessions. Thus, their data is not included in the dataset. All participants were currently attending Year 1 of primary school at an independent school in New South Wales, Australia. For children to be able to eligible to participate they had to present with low mathematics achievement by performing at or below the 25th percentile in the Maths Problem Solving and/or Numerical Operations subtests from the Wechsler Individual Achievement Test III (WIAT III A & NZ, Wechsler, 2016). Participants were excluded from participating if, as reported by their parents, they have any other diagnosed disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, intellectual disability, developmental language disorder, cerebral palsy or uncorrected sensory disorders. The study followed a multiple baseline case series design, with a baseline phase, a treatment phase, and a post-treatment phase. The baseline phase varied between two and three measurement points, the treatment phase varied between four and seven measurement points, and all participants had 1 post-treatment measurement point. The number of measurement points were distributed across participants as follows: Participant 1 – 3 baseline, 6 treatment, 1 post-treatment Participant 3 – 2 baseline, 7 treatment, 1 post-treatment Participant 5 – 2 baseline, 5 treatment, 1 post-treatment Participant 6 – 3 baseline, 4 treatment, 1 post-treatment Participant 7 – 2 baseline, 5 treatment, 1 post-treatment In each session across all three phases children were assessed in their performance on a number line estimation task, a single-digit computation task, a multi-digit computation task, a dot comparison task and a number comparison task. Furthermore, during the treatment phase, all children completed the intervention task after these assessments. The order of the assessment tasks varied randomly between sessions.
Measures Number Line Estimation. Children completed a computerised bounded number line task (0-100). The number line is presented in the middle of the screen, and the target number is presented above the start point of the number line to avoid signalling the midpoint (Dackermann et al., 2018). Target numbers included two non-overlapping sets (trained and untrained) of 30 items each. Untrained items were assessed on all phases of the study. Trained items were assessed independent of the intervention during baseline and post-treatment phases, and performance on the intervention is used to index performance on the trained set during the treatment phase. Within each set, numbers were equally distributed throughout the number range, with three items within each ten (0-10, 11-20, 21-30, etc.). Target numbers were presented in random order. Participants did not receive performance-based feedback. Accuracy is indexed by percent absolute error (PAE) [(number estimated - target number)/ scale of number line] x100.
Single-Digit Computation. The task included ten additions with single-digit addends (1-9) and single-digit results (2-9). The order was counterbalanced so that half of the additions present the lowest addend first (e.g., 3 + 5) and half of the additions present the highest addend first (e.g., 6 + 3). This task also included ten subtractions with single-digit minuends (3-9), subtrahends (1-6) and differences (1-6). The items were presented horizontally on the screen accompanied by a sound and participants were required to give a verbal response. Participants did not receive performance-based feedback. Performance on this task was indexed by item-based accuracy.
Multi-digit computational estimation. The task included eight additions and eight subtractions presented with double-digit numbers and three response options. None of the response options represent the correct result. Participants were asked to select the option that was closest to the correct result. In half of the items the calculation involved two double-digit numbers, and in the other half one double and one single digit number. The distance between the correct response option and the exact result of the calculation was two for half of the trials and three for the other half. The calculation was presented vertically on the screen with the three options shown below. The calculations remained on the screen until participants responded by clicking on one of the options on the screen. Participants did not receive performance-based feedback. Performance on this task is measured by item-based accuracy.
Dot Comparison and Number Comparison. Both tasks included the same 20 items, which were presented twice, counterbalancing left and right presentation. Magnitudes to be compared were between 5 and 99, with four items for each of the following ratios: .91, .83, .77, .71, .67. Both quantities were presented horizontally side by side, and participants were instructed to press one of two keys (F or J), as quickly as possible, to indicate the largest one. Items were presented in random order and participants did not receive performance-based feedback. In the non-symbolic comparison task (dot comparison) the two sets of dots remained on the screen for a maximum of two seconds (to prevent counting). Overall area and convex hull for both sets of dots is kept constant following Guillaume et al. (2020). In the symbolic comparison task (Arabic numbers), the numbers remained on the screen until a response was given. Performance on both tasks was indexed by accuracy.
The Number Line Intervention During the intervention sessions, participants estimated the position of 30 Arabic numbers in a 0-100 bounded number line. As a form of feedback, within each item, the participants’ estimate remained visible, and the correct position of the target number appeared on the number line. When the estimate’s PAE was lower than 2.5, a message appeared on the screen that read “Excellent job”, when PAE was between 2.5 and 5 the message read “Well done, so close! and when PAE was higher than 5 the message read “Good try!” Numbers were presented in random order.
Variables in the dataset Age = age in ‘years, months’ at the start of the study Sex = female/male/non-binary or third gender/prefer not to say (as reported by parents) Math_Problem_Solving_raw = Raw score on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016). Math_Problem_Solving_Percentile = Percentile equivalent on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016). Num_Ops_Raw = Raw score on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016). Math_Problem_Solving_Percentile = Percentile equivalent on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).
The remaining variables refer to participants’ performance on the study tasks. Each variable name is composed by three sections. The first one refers to the phase and session. For example, Base1 refers to the first measurement point of the baseline phase, Treat1 to the first measurement point on the treatment phase, and post1 to the first measurement point on the post-treatment phase.
The second part of the variable name refers to the task, as follows: DC = dot comparison SDC = single-digit computation NLE_UT = number line estimation (untrained set) NLE_T= number line estimation (trained set) CE = multidigit computational estimation NC = number comparison The final part of the variable name refers to the type of measure being used (i.e., acc = total correct responses and pae = percent absolute error).
Thus, variable Base2_NC_acc corresponds to accuracy on the number comparison task during the second measurement point of the baseline phase and Treat3_NLE_UT_pae refers to the percent absolute error on the untrained set of the number line task during the third session of the Treatment phase.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.
The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.
If you share or use this dataset, please cite [4] and [5] in any relevant documentation.
In addition, an image dataset for crack classification has also been published at [6].
References:
[1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873
[2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605
[3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434
[4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678
[5] (This dataset) Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044
[6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset and codes for "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023 "
The MATLAB codes and related datasets are used for generating the figures for the paper "Observation of Acceleration and Deceleration Periods at Pine Island Ice Shelf from 1997–2023".
Files and variables
File 1: Data_and_Code.zip
Directory: Main_function
**Description:****Include MATLAB scripts and functions. Each script include discriptions that guide the user how to used it and how to find the dataset that used for processing.
MATLAB Main Scripts: Include the whole steps to process the data, output figures, and output videos.
Script_1_Ice_velocity_process_flow.m
Script_2_strain_rate_process_flow.m
Script_3_DROT_grounding_line_extraction.m
Script_4_Read_ICESat2_h5_files.m
Script_5_Extraction_results.m
MATLAB functions: Five Files that includes MATLAB functions that support the main script:
1_Ice_velocity_code: Include MATLAB functions related to ice velocity post-processing, includes remove outliers, filter, correct for atmospheric and tidal effect, inverse weited averaged, and error estimate.
2_strain_rate: Include MATLAB functions related to strain rate calculation.
3_DROT_extract_grounding_line_code: Include MATLAB functions related to convert range offset results output from GAMMA to differential vertical displacement and used the result extract grounding line.
4_Extract_data_from_2D_result: Include MATLAB functions that used for extract profiles from 2D data.
5_NeRD_Damage_detection: Modified code fom Izeboud et al. 2023. When apply this code please also cite Izeboud et al. 2023 (https://www.sciencedirect.com/science/article/pii/S0034425722004655).
6_Figure_plotting_code:Include MATLAB functions related to Figures in the paper and support information.
Director: data_and_result
Description:**Include directories that store the results output from MATLAB. user only neeed to modify the path in MATLAB script to their own path.
1_origin : Sample data ("PS-20180323-20180329", “PS-20180329-20180404”, “PS-20180404-20180410”) output from GAMMA software in Geotiff format that can be used to calculate DROT and velocity. Includes displacment, theta, phi, and ccp.
2_maskccpN: Remove outliers by ccp < 0.05 and change displacement to velocity (m/day).
3_rockpoint: Extract velocities at non-moving region
4_constant_detrend: removed orbit error
5_Tidal_correction: remove atmospheric and tidal induced error
6_rockpoint: Extract non-aggregated velocities at non-moving region
6_vx_vy_v: trasform velocities from va/vr to vx/vy
7_rockpoint: Extract aggregated velocities at non-moving region
7_vx_vy_v_aggregate_and_error_estimate: inverse weighted average of three ice velocity maps and calculate the error maps
8_strain_rate: calculated strain rate from aggregate ice velocity
9_compare: store the results before and after tidal correction and aggregation.
10_Block_result: times series results that extrac from 2D data.
11_MALAB_output_png_result: Store .png files and time serties result
12_DROT: Differential Range Offset Tracking results
13_ICESat_2: ICESat_2 .h5 files and .mat files can put here (in this file only include the samples from tracks 0965 and 1094)
14_MODIS_images: you can store MODIS images here
shp: grounding line, rock region, ice front, and other shape files.
File 2 : PIG_front_1947_2023.zip
Includes Ice front positions shape files from 1947 to 2023, which used for plotting figure.1 in the paper.
File 3 : PIG_DROT_GL_2016_2021.zip
Includes grounding line positions shape files from 1947 to 2023, which used for plotting figure.1 in the paper.
Data was derived from the following sources:
Those links can be found in MATLAB scripts or in the paper "**Open Research" **section.
Facebook
Twitterhttps://spdx.org/licenses/etalab-2.0.htmlhttps://spdx.org/licenses/etalab-2.0.html
A key characteristic of free-range chicken farming is to enable chickens to spend time outdoors. However, each chicken may use the available areas for roaming in variable ways. To check if, and how, broilers use their outdoor range at an individual level, we need to reliably characterise range use behaviour. Traditional methods relying on visual scans require significant time investment and only provide discontinuous information. Passive RFID (Radio Frequency Identification) systems enable tracking individually tagged chickens’ when they go through pop-holes; hence they only provide partial information on the movements of individual chickens. Here, we describe a new method to measure chickens’ range use and test its reliability on three ranges each containing a different breed. We used an active RFID system to localise chickens in their barn, or in one of nine zones of their range, every 30 seconds and assessed range-use behaviour in 600 chickens belonging to three breeds of slow- or medium-growing broilers used for outdoor production (all < 40g daily weight gain). From those real-time locations, we determined five measures to describe daily range use: time spent in the barn, number of outdoor accesses, number of zones visited in a day, gregariousness (an index that increases when birds spend time in zones where other birds are), and numbers of zone changes. Principal Component Analyses (PCAs) were performed on those measures, in each production system, to create two synthetic indicators of chickens’ range use behaviour. Our dataset includes the files needed to calibrate the system (supplementary materials), the data files used in the publication and the associated codes.
Facebook
TwitterThe Address Coordinator in the Planning Department assigns new addresses during the application review process per the address manual. The status field can be used to filter out valid addresses. The True, Multi, Corner (status will be changed to true when the building configuration is identified), and Model values are all valid addresses.Other Statuses:Preliminary - subdivision addresses assigned during the review process and are not official until the plat is recorded; Temporary - addresses assigned to power poles and construction trailers during the building process; Land - addresses for undeveloped properties; Range - used to aid the address coordinator in assigning new addresses when calculating the address range on a new street segment; Retired - former addresses that are no longer in use like shopping center reconfigurations.Use field: This field helps clarify what the address is representing such as power meters for subdivision fountains or pump houses in multi-family developments.The address field is a concatenation of the individual address fields except for the unit number (field name = Units).Address with suffix field concatenates the address number and any suffix for use in geocoders.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Section 1: Introduction
Brief overview of dataset contents:
Current database contains anonymised data collected during exercise testing services performed on male and female participants (cycling, rowing, kayaking and running) provided by the Human Performance Laboratory, School of Medicine, Trinity College Dublin, Dublin 2, Ireland.
835 graded incremental exercise test files (285 cycling, 266 rowing / kayaking, 284 running)
Description file with each row representing a test file - COLUMNS: file name (AXXX), sport (cycling, running, rowing or kayaking)
Anthropometric data of participants by sport (age, gender, height, body mass, BMI, skinfold thickness,% body fat, lean body mass and haematological data; namely, haemoglobin concentration (Hb), haematocrit (Hct), red blood cell (RBC) count and white blood cell (WBC) count )
Test data (HR, VO2 and lactate data) at rest and across a range of exercise intensities
Derived physiological indices quantifying each individual’s endurance profile
Following a request from athletes seeking assessment by phone or e-mail the test protocol, risks, benefits and test and medical requirements, were explained verbally or by return e-mail. Subsequently, an appointment for an exercise assessment was arranged following the regulatory reflection period (7 days). Following this regulatory period each participant’s verbal consent was obtained pre-test, for participants under 18 years of age parent / guardian consent was obtained in writing. Ethics approval was obtained from the Faculty of Health Sciences ethics committee and all testing procedures were performed in compliance with Declaration of Helsinki guidelines.
All consenting participants were required to attend the laboratory on one occasion in a rested, carbohydrate loaded and well-hydrated state, and for male participants’ clean shaven in the facial region. All participants underwent a pre-test medical examination, including assessment of resting blood pressure, pulmonary function testing and haematological (Coulter Counter Act Diff, Beckmann Coulter, CA,US) review performed by a qualified medical doctor prior to exercise testing. Any person presenting with any cardiac abnormalities, respiratory difficulties, symptoms of cold or influenza, musculoskeletal injury that could impair performance, diabetes, hypertension, metabolic disorders, or any other contra-indicatory symptoms were excluded. In addition, participants completed a medical questionnaire detailing training history, previous personal and family health abnormalities, recent illness or injury, menstrual status for female participants, as well as details of recent travel and current vaccination status, and current medications, supplements and allergies. Barefoot height in metre (Holtain, Crymych, UK), body mass (counter balanced scales) in kilogram (Seca, Hamburg, Germany) and skinfold thickness in millimetre using a Harpenden skinfold caliper (Bath International, West Sussex, UK) were recorded pre-exercise.
Section 2: Testing protocols
2.1: Cycling
A continuous graded incremental exercise test (GxT) to volitional exhaustion was performed on an electromagnetically braked cycle ergometer (Lode Excalibur Sport, Groningen, The Netherlands). Participants initially identified a cycling position in which they were most comfortable by adjusting saddle height, saddle fore-aft position relative to the crank axis, saddle to handlebar distance and handlebar height. Participant’s feet were secured to the ergometer using their own cycling shoes with cleats and accompanying pedals. The protocol commenced with a 15-min warm-up at a workload of 120 Watt (W), followed by a 10-min rest. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a workload of 100 or 120 W for female and male participants, respectively, and subsequently increasing by a 20, 30 or 40 W incremental increase every 3-min depending on gender and current competition category. During assessment participants maintained a constant self-selected cadence chosen during their warm-up (permitted window was 5 rev.min−1 within a permitted absolute range of 75 to 95 rev.min−1) and the test was terminated when a participant was no longer able to maintain a constant cadence.
Heart rate (HR) data were recorded continuously by radio-telemetry using a Cosmed HR monitor (Cosmed, Rome, Italy). During the test, blood samples were collected from the middle finger of the right hand at the end of the second minute of each 3-min interval. The fingertip was cleaned to remove any sweat or blood and lanced using a long point sterile lancet (Braun, Melsungen, Germany). The blood sample was collected into a heparinised capillary tube (Brand, Wertheim, Germany) by holding the tube horizontal to the droplet and allowing transfer by capillary action. Subsequently, a 25μL aliquot of whole blood was drawn from the capillary tube using a YSI syringepet (YSI, OH, USA) and added into the chamber of a YSI 1500 Sport lactate analyser (YSI, OH, USA) for determination of non-lysed [Lac] in mmol.L−1. The lactate analyser was calibrated to the manufacturer’s requirements (± 0.05 mmol.L−1) before each test using a standard solution (YSI, OH, USA) of known concentration (5 mmol.L−1) and analyser linearity was confirmed using either a 15 or 30 mmol.L-1 standard solution (YSI, OH, USA).
Gas exchange variables including respiration rate (Rf in breaths.min-1), minute ventilation (VE in L.min-1), oxygen consumption (VO2 in L.min-1 and in mL.kg-1.min-1) and carbon dioxide production (VCO2 in L.min-1), were measured on a breath-by-breath basis throughout the test, using a cardiopulmonary exercise testing unit (CPET) and an associated software package (Cosmed, Rome, Italy). Participants wore a face mask (Hans Rudolf, KA, USA) which was connected to the CPET unit. The metabolic unit was calibrated prior to each test using ambient air and an alpha certified gas mixture containing 16% O2, 5% CO2 and 79% N2 (Cosmed, Rome, Italy). Volume calibration was performed using a 3L gas calibration syringe (Cosmed, Rome, Italy). Barometric pressure recorded by the CPET was confirmed by recording barometric pressure using a laboratory grade barometer.
Following testing mean HR and mean VO2 data at rest and during each exercise increment were computed and tabulated over the final minute of each 3-min interval. A graphical plot of [Lac], mean VO2 and mean HR versus cycling workload was constructed and analysed to quantify physiological endurance indices, see Data Analysis section. Data for VO2 peak in L.min-1 (absolute) and in mL.kg-1.min-1 (relative) and VE peak in L.min-1 were reported as the peak data recorded over any 10 consecutive breaths recorded during the last minute of the final exercise increment.
2.2: Running protocol
A continuous graded incremental exercise test (GxT) to volitional exhaustion was performed on a motorised treadmill (Powerjog, Birmingham, UK). The running protocol, performed at a gradient of 0%, commenced with a 15-min warm-up at a velocity (km.h-1) which was lower than the participant’s reported typical weekly long run (>60 min) on-road training velocity. Subsequently, the warm-up was followed by a 10 minute rest / dynamic stretching phase. From a safety perspective during all running GxT participants wore a suspended lightweight safety harness to minimise any potential falls risk. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a sub-maximal running velocity which was lower than the participant’s reported typical weekly long run (>60 min) on-road training velocity, and subsequently increased by ≥ 1 km.h-1 every 3-min depending on gender and current competition category. The test was terminated when a participant was no longer able to maintain the imposed treadmill.
Measurement variables, equipment and pre-test calibration procedures, timing and procedure for measurement of selected variables and subsequent data analysis were as outlined in Section 2.1.
2.3: Rowing / kayaking protocol
A discontinuous graded incremental exercise test (GxT) to volitional exhaustion was performed on a Concept 2C rowing ergometer (Concept, VA, US) in rowers or a Dansprint kayak ergometer (Dansprint, Hvidovre, Denmark) in flat-water kayakers. The protocol commenced with a 15-min low-intensity warm-up at a workload (W) dependent on gender, sport and competition category, followed by a 10-min rest. For rowing the flywheel damping (120, 125 or 130W) was set dependent on gender and competition category. For kayaking the bungee cord tension was adjusted by individual participants to suit their requirements. A discontinuous protocol of 3-min exercise at a targeted load followed by a 1-min rest phase to facilitate stationary earlobe capillary blood sample collection and resetting of ergometer display (Dansprint ergometer) was used. The GxT began with a 3-min stationary phase for resting data collection, followed by an active phase commencing at a sub-maximal load 80 to 120 W for rowing, 50 to 90 W for kayaking and subsequently increased by 20,30 or 40 W every 3-min depending on gender, sport and current competition category. The test was terminated when a participant was no longer able to maintain the targeted workload.
Measurement variables, equipment and pre-test calibration procedures, timing and procedure for measurement of selected variables and subsequent data analysis were as outlined in Section 2.1.
3.1: Data analysis
Constructed graphical plots (HR, VO2 and [Lac] versus load / velocity) were analysed to quantify the following; load / velocity at TLac, HR at TLac, [Lac] at TLac, % of VO2 peak at TLac, % of HRmax at TLac, load / velocity and HR at a nominal [Lac] of 2 mmol.L-1, load / velocity, VO2 and [Lac} at a nominal HR of
Facebook
TwitterSummary and methods used to calculate the physical characteristics used to compare the home range estimators.
Facebook
TwitterThe U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is an open-source software and object-oriented programming language with a development environment (IDE) called RStudio for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provides a wide range of functions for programming and analyzing of data. Unlike many of the existing statistical softwares, R has the added benefit of allowing the users to write more efficient codes by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible and allows the users to define their own (customized) functions on how they expect the program to behave while handling the data, which can also be stored in the simple object system.For all intents and purposes, this book serves as both textbook and manual for R statistics particularly in academic research, data analytics, and computer programming targeted to help inform and guide the work of the R users or statisticians. It provides information about different types of statistical data analysis and methods, and the best scenarios for use of each case in R. It gives a hands-on step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the different conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand the results of the methods. The book also covers the different data formats and sources, and how to test for reliability and validity of the available datasets. Different research experiments, case scenarios and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R particularly for research purposes with examples. Ranging from how to import and store datasets in R as Objects, how to code and call the methods or functions for manipulating the datasets or objects, factorization, and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. Thus, congruence of Statistics and Computer programming for Research.
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
By ddrg (From Huggingface) [source]
With a total of six columns, including formula1, formula2, label (binary format), formula1, formula2, and label, the dataset provides all the necessary information for conducting comprehensive analysis and evaluation.
The train.csv file contains a subset of the dataset specifically curated for training purposes. It includes an extensive range of math formula pairs along with their corresponding labels and unique ID names. This allows researchers and data scientists to construct models that can predict whether two given formulas fall within the same category or not.
On the other hand, test.csv serves as an evaluation set. It consists of additional pairs of math formulas accompanied by their respective labels and unique IDs. By evaluating model performance on this test set after training it on train.csv data, researchers can assess how well their models generalize to unseen instances.
By leveraging this informative dataset, researchers can unlock new possibilities in mathematics-related fields such as pattern recognition algorithms development or enhancing educational tools that involve automatic identification and categorization tasks based on mathematical formulas
Introduction
Dataset Description
train.csv
The
train.csvfile contains a set of labeled math formula pairs along with their corresponding labels and formula name IDs. It consists of the following columns: - formula1: The first mathematical formula in the pair (text). - formula2: The second mathematical formula in the pair (text). - label: The classification label indicating whether the pair of formulas belong to the same category or not (binary). A label value of 1 indicates that both formulas belong to the same category, while a label value of 0 indicates different categories.test.csv
The purpose of the
test.csvfile is to provide a set of formula pairs along with their labels and formula name IDs for testing and evaluation purposes. It has an identical structure totrain.csv, containing columns like formula1, formula2, label, etc.Task
The main task using this dataset is binary classification, where your objective is to predict whether two mathematical formulas belong to the same category or not based on their textual representation. You can use various machine learning algorithms such as logistic regression, decision trees, random forests, or neural networks for training models on this dataset.
Exploring & Analyzing Data
Before building your model, it's crucial to explore and analyze your data. Here are some steps you can take:
- Load both CSV files (
train.csvandtest.csv) into your preferred data analysis framework or programming language (e.g., Python with libraries like pandas).- Examine the dataset's structure, including the number of rows, columns, and data types.
- Check for missing values in the dataset and handle them accordingly.
- Visualize the distribution of labels to understand whether it is balanced or imbalanced.
Model Building
Once you have analyzed and preprocessed your dataset, you can start building your classification model using various machine learning algorithms:
- Split your
train.csvdata into training and validation sets for model evaluation during training.- Choose a suitable
- Math Formula Similarity: This dataset can be used to develop a model that classifies whether two mathematical formulas are similar or not. This can be useful in various applications such as plagiarism detection, identifying duplicate formulas in databases, or suggesting similar formulas based on user input.
- Formula Categorization: The dataset can be used to train a model that categorizes mathematical formulas into different classes or categories. For example, the model can classify formulas into algebraic expressions, trigonometric equations, calculus problems, or geometric theorems. This categorization can help organize and search through large collections of mathematical formulas.
- Formula Recommendation: Using this dataset, one could build a recommendation system that suggests related math formulas based on user input. By analyzing the similarities between different formula pairs and their corresponding labels, the system could provide recommendations for relevant mathematical concepts that users may need while solving problems or studying specific topics in mathematics
Facebook
TwitterTransient killers whales inhabit the West Coast of the United States. Their range and movement patterns are difficult to ascertain, but are vital to understanding killer whale population dynamics and abundance trends. Satellite tagging of West Coast transient killer whales to determine range and movement patterns will provide data to assist in understanding transient killer whale populations. L...
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
When instructing exercises to improve Range of Motion (ROM), clinicians often create an internal focus of attention, while motor performance may improve more when using an external focus of attention. Using Virtual Reality (VR), we investigated the effect of tasks with an internal and external focus on maximal ROM in people with neck pain and explored whether this effect was associated with fear of movement. Participants’ cervical ROM was measured while performing a target-seeking exercise in a VR-environment (external focus task (EFT)) and during three maximal rotation and flexion-extension movements with the VR-headset on, without signal (internal focus task (IFT)). Questionnaires were used to assess fear of movement. The main statistical analysis included two dependent T-tests. Pearson correlation coefficients were calculated to investigate whether the differences in ROM in both conditions were correlated to fear of movement. Maximum neck rotation was larger in the EFT condition than in the IFT condition (mean (SD) difference: 26.4 (21.4) degrees; p<0.001, r=0.78), but there was a difference in favour of the IFT condition for flexion-extension (mean (SD) difference: 8.2 (24.5) degrees; p=0.018, r=0.32). The variability in ROM was not explained by variability in fear of movement (for all correlations p≥0.197). An external focus resulted in a larger range of rotation, but our flexion-extension findings suggest that the task has to be specific to elicit such an effect. Further research, using a task that sufficiently elicits movement in all directions, is needed to determine the value of an external focus during exercise. The data set includes raw experimental data of fifty-four people with non-specific neck pain who were recruited from four primary care physiotherapy clinics in the region of Amsterdam and Rotterdam, and at Vrije Universiteit Amsterdam. Participants completed a digital questionnaire to collect personal and neck pain related information regarding age, sex, the duration and onset of their neck pain (gradual or sudden and if sudden, history of trauma), pain intensity, disability, kinesiophobia, fear of physical activity and fear avoidance beliefs. The following questionnaires were used: the Numeric Pain Rating Scale, Neck Disability Index, Tampa Scale of Kinesiophobia and the Fear Avoidance Beliefs Questionnaire. Within a week after completion of the questionnaire the participants performed the VR-experiment to evaluate the effect of an EFT and IFT in a VR environment on maximal cervical range of motion. In addition, we explored whether the size of the effect was associated with the level of kinesiophobia and fear avoidance beliefs. Participants’ cervical ROM was measured while performing a target-seeking exercise in a VR-environment (EFT) and during three maximal rotation and flexion-extension movements with the VR-headset on, without signal (IFT). The furthest maximum and minimum headset position in each direction, respectively around the horizontal (x-axis, flexion-extension) and vertical (y-axis, rotation) was measured. This resulted in four measurements per participant (i.e., rotation EFT, flexion-extension EFT, rotation IFT, flexion-extension IFT). After the VR-experiment, motion sickness was evaluated, using the short version of the Misery Scale (sMISC).
Facebook
TwitterCollection of 7 scientific formulas for calculating ideal body weight
Facebook
Twitterhttps://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Aim: Despite the wide distribution of many parasites around the globe, the range of individual species varies significantly even among phylogenetically related taxa. Since parasites need suitable hosts to complete their development, parasite geographical and environmental ranges should be limited to communities where their hosts are found. Parasites may also suffer from a trade-off between being locally abundant or widely dispersed. We hypothesize that the geographical and environmental ranges of parasites are negatively associated to their host specificity and their local abundance. Location: Worldwide Time period: 2009 to 2021 Major taxa studied: Avian haemosporidian parasites Methods: We tested these hypotheses using a global database which comprises data on avian haemosporidian parasites from across the world. For each parasite lineage, we computed five metrics: phylogenetic host-range, environmental range, geographical range, and their mean local and total number of observations in the database. Phylogenetic generalized least squares models were ran to evaluate the influence of phylogenetic host-range and total and local abundances on geographical and environmental range. In addition, we analysed separately the two regions with the largest amount of available data: Europe and South America. Results: We evaluated 401 lineages from 757 localities and observed that generalism (i.e. phylogenetic host range) associates positively to both the parasites’ geographical and environmental ranges at global and Europe scales. For South America, generalism only associates with geographical range. Finally, mean local abundance (mean local number of parasite occurrences) was negatively related to geographical and environmental range. This pattern was detected worldwide and in South America, but not in Europe. Main Conclusions: We demonstrate that parasite specificity is linked to both their geographical and environmental ranges. The fact that locally abundant parasites present restricted ranges, indicates a trade-off between these two traits. This trade-off, however, only becomes evident when sufficient heterogeneous host communities are considered. Methods We compiled data on haemosporidian lineages from the MalAvi database (http://130.235.244.92/Malavi/ , Bensch et al. 2009) including all the data available from the “Grand Lineage Summary” representing Plasmodium and Haemoproteus genera from wild birds and that contained information regarding location. After checking for duplicated sequences, this dataset comprised a total of ~6200 sequenced parasites representing 1602 distinct lineages (775 Plasmodium and 827 Haemoproteus) collected from 1139 different host species and 757 localities from all continents except Antarctica (Supplementary figure 1, Supplementary Table 1). The parasite lineages deposited in MalAvi are based on a cyt b fragment of 478 bp. This dataset was used to calculate the parasites’ geographical, environmental and phylogenetic ranges. Geographical range All analyses in this study were performed using R version 4.02. In order to estimate the geographical range of each parasite lineage, we applied the R package “GeoRange” (Boyle, 2017) and chose the variable minimum spanning tree distance (i.e., shortest total distance of all lines connecting each locality where a particular lineage has been found). Using the function “create.matrix” from the “fossil” package, we created a matrix of lineages and coordinates and employed the function “GeoRange_MultiTaxa” to calculate the minimum spanning tree distance for each parasite lineage distance (i.e. shortest total distance in kilometers of all lines connecting each locality). Therefore, as at least two distinct sites are necessary to calculate this distance, parasites observed in a single locality could not have their geographical range estimated. For this reason, only parasites observed in two or more localities were considered in our phylogenetically controlled least squares (PGLS) models. Host and Environmental diversity Traditionally, ecologists use Shannon entropy to measure diversity in ecological assemblages (Pielou, 1966). The Shannon entropy of a set of elements is related to the degree of uncertainty someone would have about the identity of a random selected element of that set (Jost, 2006). Thus, Shannon entropy matches our intuitive notion of biodiversity, as the more diverse an assemblage is, the more uncertainty regarding to which species a randomly selected individual belongs. Shannon diversity increases with both the assemblage richness (e.g., the number of species) and evenness (e.g., uniformity in abundance among species). To compare the diversity of assemblages that vary in richness and evenness in a more intuitive manner, we can normalize diversities by Hill numbers (Chao et al., 2014b). The Hill number of an assemblage represents the effective number of species in the assemblage, i.e., the number of equally abundant species that are needed to give the same value of the diversity metric in that assemblage. Hill numbers can be extended to incorporate phylogenetic information. In such case, instead of species, we are measuring the effective number of phylogenetic entities in the assemblage. Here, we computed phylogenetic host-range as the phylogenetic Hill number associated with the assemblage of hosts found infected by a given parasite. Analyses were performed using the function “hill_phylo” from the “hillr” package (Chao et al., 2014a). Hill numbers are parameterized by a parameter “q” that determines the sensitivity of the metric to relative species abundance. Different “q” values produce Hill numbers associated with different diversity metrics. We set q = 1 to compute the Hill number associated with Shannon diversity. Here, low Hill numbers indicate specialization on a narrow phylogenetic range of hosts, whereas a higher Hill number indicates generalism across a broader phylogenetic spectrum of hosts. We also used Hill numbers to compute the environmental range of sites occupied by each parasite lineage. Firstly, we collected the 19 bioclimatic variables from WorldClim version 2 (http://www.worldclim.com/version2) for all sites used in this study (N = 713). Then, we standardized the 19 variables by centering and scaling them by their respective mean and standard deviation. Thereafter, we computed the pairwise Euclidian environmental distance among all sites and used this distance to compute a dissimilarity cluster. Finally, as for the phylogenetic Hill number, we used this dissimilarity cluster to compute the environmental Hill number of the assemblage of sites occupied by each parasite lineage. The environmental Hill number for each parasite can be interpreted as the effective number of environmental conditions in which a parasite lineage occurs. Thus, the higher the environmental Hill number, the more generalist the parasite is regarding the environmental conditions in which it can occur. Parasite phylogenetic tree A Bayesian phylogenetic reconstruction was performed. We built a tree for all parasite sequences for which we were able to estimate the parasite’s geographical, environmental and phylogenetic ranges (see above); this represented 401 distinct parasite lineages. This inference was produced using MrBayes 3.2.2 (Ronquist & Huelsenbeck, 2003) with the GTR + I + G model of nucleotide evolution, as recommended by ModelTest (Posada & Crandall, 1998), which selects the best-fit nucleotide substitution model for a set of genetic sequences. We ran four Markov chains simultaneously for a total of 7.5 million generations that were sampled every 1000 generations. The first 1250 million trees (25%) were discarded as a burn-in step and the remaining trees were used to calculate the posterior probabilities of each estimated node in the final consensus tree. Our final tree obtained a cumulative posterior probability of 0.999. Leucocytozoon caulleryi was used as the outgroup to root the phylogenetic tree as Leucocytozoon spp. represents a basal group within avian haemosporidians (Pacheco et al., 2020).
Facebook
TwitterABSTRACT Objective: To measure nursing Workload (WL) of nurses who work in the Inpatient Unit, as recommended by the Nursing Interventions Classification (NIC), comparing observational and online methods to propose supervision strategies for academic professionals. Method: Quantitative, descriptive, observational study performed in a Clinical/Surgical Hospital Unit. 30 direct and indirect activities. Data collected in observational and online records. Statistical analysis: SPSS 18.0 software, percentage frequencies and associated times between groups by Fisher's Exact test, 95% confidence interval, significance level 5%. Results: Comparing the activities performed with the NIC time: from the direct 16, five observational and five online, were out of range, with no significant difference between frequencies (P=0.427). Of the 14 indirect, only in the observational, two were out of the range, without significant difference (P=0.486). Conclusion: Both methods measure WL; the online method developed accompanies activities performed in real time.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The files with simulation results for ECOC 20223 submission "Analysis of the Scalar and Vector Random Coupling Models For a Four Coupled-Core Fiber". "4CCF_eigenvectorsPol" file is the Mathematica code which enables to calculate supermodes (eigenvectors of M(w)) and their propagation constants of 4-coupled-core fiber (4CCF). These results are uploaded to the python notebook "4CCF_modelingECOC" in order to plot them to get Fig. 2 in the paper. "TransferMatrix" is the python file with functions used for modeling, simulation and plotting. It is also uploaded in the python notebook "4CCF_modelingECOC", where all the calculations for figures in the paper are presented.
! UPD 25.09.2023: There is an error in the formula of birefringence calculation. It is in the function "CouplingCoefficients" in "TransferMatrix" file. There the variable "birefringence" has to be calculated according to the formula (19) [A. Ankiewicz, A. Snyder, and X.-H. Zheng, "Coupling between parallel optical fiber cores–critical examination", Journal of Lightwave Technology, vol. 4, no. 9,pp. 1317–1323, 1986]: (4*U**2*W*spec.k0(W)*spec.kn(2, W_)/(spec.k1(W)*V**4))*((spec.iv(1, W)/spec.k1(W))-(spec.iv(2, W)/spec.k0(W))) The correct formula gives almost the same result (the difference is 10^-5), but one has to use a correct formula anyway. ! UPD 9.12.2023: I have noticed that in the published version of the code I forgot to change the wavelength range for impulse response calculation. So instead of seeing the nice shape as in the paper you will see resolution limited shape. To solve that just change the range of wavelengths, you can add "wl = [1545e-9, 1548e-9]" in the first cell after "Total power impulse response". P.s. In case of any questions or suggestions you are welcome to write me an email ekader@chalmers.se
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project provides a national unified database of residential building retrofit measures and associated retail prices and end-user might experience. These data are accessible to software programs that evaluate most cost-effective retrofit measures to improve the energy efficiency of residential buildings and are used in the consumer-facing website https://remdb.nrel.gov/
This publicly accessible, centralized database of retrofit measures offers the following benefits:
This database provides full price estimates for many different retrofit measures. For each measure, the database provides a range of prices, as the data for a measure can vary widely across regions, houses, and contractors. Climate, construction, home features, local economy, maturity of a market, and geographic location are some of the factors that may affect the actual price of these measures.
This database is not intended to provide specific cost estimates for a specific project. The cost estimates do not include any rebates or tax incentives that may be available for the measures. Rather, it is meant to help determine which measures may be more cost-effective. The National Renewable Energy Laboratory (NREL) makes every effort to ensure accuracy of the data; however, NREL does not assume any legal liability or responsibility for the accuracy or completeness of the information.
Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Overview:
This dataset contains simulated (hypothetical) but almost realistic (based on AI) data related to sleep, heart rate, and exercise habits of 500 individuals. It includes both pre-exercise and post-exercise resting heart rates, allowing for analyses such as a dependent t-test (Paired Sample t-test) to observe changes in heart rate after an exercise program. The dataset also includes additional health-related variables, such as age, hours of sleep per night, and exercise frequency.
The data is designed for tasks involving hypothesis testing, health analytics, or even machine learning applications that predict changes in heart rate based on personal attributes and exercise behavior. It can be used to understand the relationships between exercise frequency, sleep, and changes in heart rate.
File: Filename: heart_rate_data.csv File Format: CSV
- Features (Columns):
Age: Description: The age of the individual. Type: Integer Range: 18-60 years Relevance: Age is an important factor in determining heart rate and the effects of exercise.
Sleep Hours: Description: The average number of hours the individual sleeps per night. Type: Float Range: 3.0 - 10.0 hours Relevance: Sleep is a crucial health metric that can impact heart rate and exercise recovery.
Exercise Frequency (Days/Week): Description: The number of days per week the individual engages in physical exercise. Type: Integer Range: 1-7 days/week Relevance: More frequent exercise may lead to greater heart rate improvements and better cardiovascular health.
Resting Heart Rate Before: Description: The individual’s resting heart rate measured before beginning a 6-week exercise program. Type: Integer Range: 50 - 100 bpm (beats per minute) Relevance: This is a key health indicator, providing a baseline measurement for the individual’s heart rate.
Resting Heart Rate After: Description: The individual’s resting heart rate measured after completing the 6-week exercise program. Type: Integer Range: 45 - 95 bpm (lower than the "Resting Heart Rate Before" due to the effects of exercise). Relevance: This variable is essential for understanding how exercise affects heart rate over time, and it can be used to perform a dependent t-test analysis.
Max Heart Rate During Exercise: Description: The maximum heart rate the individual reached during exercise sessions. Type: Integer Range: 120 - 190 bpm Relevance: This metric helps in understanding cardiovascular strain during exercise and can be linked to exercise frequency or fitness levels.
Potential Uses: Dependent T-Test Analysis: The dataset is particularly suited for a dependent (paired) t-test where you compare the resting heart rate before and after the exercise program for each individual.
Exploratory Data Analysis (EDA):Investigate relationships between sleep, exercise frequency, and changes in heart rate. Potential analyses include correlations between sleep hours and resting heart rate improvement, or regression analyses to predict heart rate after exercise.
Machine Learning: Use the dataset for predictive modeling, and build a beginner regression model to predict post-exercise heart rate using age, sleep, and exercise frequency as features.
Health and Fitness Insights: This dataset can be useful for studying how different factors like sleep and age influence heart rate changes and overall cardiovascular health.
License: Choose an appropriate open license, such as:
CC BY 4.0 (Attribution 4.0 International).
Inspiration for Kaggle Users: How does exercise frequency influence the reduction in resting heart rate? Is there a relationship between sleep and heart rate improvements post-exercise? Can we predict the post-exercise heart rate using other health variables? How do age and exercise frequency interact to affect heart rate?
Acknowledgments: This is a simulated dataset for educational purposes, generated to demonstrate statistical and machine learning applications in the field of health analytics.