100+ datasets found
  1. INTERVAL

    • healthdatagateway.org
    unknown
    Cite
    INTERVAL [Dataset]. https://healthdatagateway.org/dataset/201 (note: INTERVAL must be acknowledged in all publications using these data; further details will be issued through the Data Access Committee)
    Explore at:
    Available download formats: unknown
    Dataset authored and provided by
    INTERVAL
    License

    http://www.donorhealth-btru.nihr.ac.uk/wp-content/uploads/2020/04/Data-Access-Policy-v1.0-14Apr2020.pdf

    Description

    In over 100 years of blood donation practice, INTERVAL is the first randomised controlled trial to assess the impact of varying the frequency of blood donation on donor health and the blood supply. It provided policy-makers with evidence that collecting blood more frequently than current intervals can be implemented over two years without impacting on donor health, allowing better management of the supply to the NHS of units of blood with in-demand blood groups. INTERVAL was designed to deliver a multi-purpose strategy: an initial purpose related to blood donation research aiming to improve NHS Blood and Transplant’s core services and a longer-term purpose related to the creation of a comprehensive resource that will enable detailed studies of health-related questions.

    Approximately 50,000 generally healthy blood donors were recruited between June 2012 and June 2014 from 25 NHS Blood Donation centres across England, with approximately equal numbers of men and women, aged 18-80, and ~93% of white ancestry. All participants completed brief online questionnaires at baseline and gave blood samples for research purposes. Participants were randomised to giving blood every 8/10/12 weeks (for men) or 12/14/16 weeks (for women) over a 2-year period. ~30,000 participants returned after 2 years, completed a brief online questionnaire, and gave further blood samples for research purposes.

    The baseline questionnaire includes brief lifestyle information (smoking, alcohol consumption, etc), iron-related questions (e.g., red meat consumption), self-reported height and weight, etc. The SF-36 questionnaire was completed online at baseline and 2-years, with a 6-monthly SF-12 questionnaire between baseline and 2-years.

    All participants have had the Affymetrix Axiom UK Biobank genotyping array assayed and then imputed to 1000G+UK10K combined reference panel (80M variants in total). 4,000 participants have 50X whole-exome sequencing and 12,000 participants have 15X whole-genome sequencing. Whole-blood RNA sequencing has commenced in ~5,000 participants.

    The dataset also contains data on clinical chemistry biomarkers, blood cell traits, >200 lipoproteins, metabolomics (Metabolon HD4), lipidomics, and proteomics (SomaLogic, Olink), either cohort-wide or in large sub-sets of the cohort.

  2. The banksia plot: a method for visually comparing point estimates and...

    • bridges.monash.edu
    • researchdata.edu.au
    txt
    Updated Oct 15, 2024
    Cite
    Simon Turner; Amalia Karahalios; Elizabeth Korevaar; Joanne E. McKenzie (2024). The banksia plot: a method for visually comparing point estimates and confidence intervals across datasets [Dataset]. http://doi.org/10.26180/25286407.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Oct 15, 2024
    Dataset provided by
    Monash University
    Authors
    Simon Turner; Amalia Karahalios; Elizabeth Korevaar; Joanne E. McKenzie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Companion data for the creation of a banksia plot.

    Background: In research evaluating statistical analysis methods, a common aim is to compare point estimates and confidence intervals (CIs) calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a plot to facilitate pairwise comparisons of point estimates and confidence intervals from different statistical analyses both within and across datasets.

    Methods: The plot was developed and refined over the course of an empirical study. To compare results from a variety of different studies, a system of centring and scaling is used. Firstly, the point estimates from reference analyses are centred to zero, followed by scaling confidence intervals to span a range of one. The point estimates and confidence intervals from matching comparator analyses are then adjusted by the same amounts. This enables the relative positions of the point estimates and CI widths to be quickly assessed while maintaining the relative magnitudes of the differences in point estimates and confidence interval widths between the two analyses. Banksia plots can be graphed in a matrix, showing all pairwise comparisons of multiple analyses. In this paper, we show how to create a banksia plot and present two examples: the first relates to an empirical evaluation assessing the difference between various statistical methods across 190 interrupted time series (ITS) datasets with widely varying characteristics, while the second assesses data extraction accuracy by comparing results obtained from analysing original study data (43 ITS studies) with those obtained by four researchers from datasets digitally extracted from graphs in the accompanying manuscripts.

    Results: In the banksia plot of the statistical method comparison, it was clear that there was no difference, on average, in point estimates, and it was straightforward to ascertain which methods resulted in smaller, similar or larger confidence intervals than others. In the banksia plot comparing analyses from digitally extracted data to those from the original data, it was clear that both the point estimates and confidence intervals were all very similar among data extractors and the original data.

    Conclusions: The banksia plot, a graphical representation of centred and scaled confidence intervals, provides a concise summary of comparisons between multiple point estimates and associated CIs in a single graph. Through this visualisation, patterns and trends in the point estimates and confidence intervals can be easily identified.

    This collection of files allows the user to create the images used in the companion paper and to amend the code to create their own banksia plots using either Stata version 17 or R version 4.3.1.
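
    The centring and scaling step described under Methods can be sketched in a few lines of Python (a minimal illustration only; the example estimates and CI limits are hypothetical, and the companion files contain the authors' actual Stata and R code):

      def centre_and_scale(ref_est, ref_lo, ref_hi, comp_est, comp_lo, comp_hi):
          # Centre the reference point estimate at zero and scale its CI to span one,
          # then apply the same shift and scale to the comparator analysis.
          shift = ref_est
          scale = ref_hi - ref_lo
          t = lambda x: (x - shift) / scale
          return (t(ref_est), t(ref_lo), t(ref_hi)), (t(comp_est), t(comp_lo), t(comp_hi))

      # Hypothetical values: reference estimate 2.0 with CI (1.0, 3.0),
      # comparator estimate 2.5 with CI (1.0, 4.0)
      ref, comp = centre_and_scale(2.0, 1.0, 3.0, 2.5, 1.0, 4.0)
      print(ref)    # (0.0, -0.5, 0.5)  -> centred at zero, CI spans one
      print(comp)   # (0.25, -0.5, 1.0) -> same shift and scale applied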

  3. Data from: A randomized controlled trial of positive outcome expectancies...

    • agdatacommons.nal.usda.gov
    • datasets.ai
    application/csv
    Updated Feb 5, 2025
    Cite
    Kelsey Ufholz (2025). Data from: A randomized controlled trial of positive outcome expectancies during high-intensity interval training in inactive adults [Dataset]. http://doi.org/10.15482/USDA.ADC/1523121
    Explore at:
    Available download formats: application/csv
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    Ag Data Commons
    Authors
    Kelsey Ufholz
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Includes accelerometer data using an ActiGraph to assess usual sedentary, moderate, vigorous, and very vigorous activity at baseline, 6 weeks, and 10 weeks. Includes relative reinforcing value (RRV) data showing how participants rated how much they would want to perform both physical and sedentary activities on a scale of 1-10 at baseline, week 6, and week 10. Includes data on the breakpoint, or Pmax, of the RRV, which was the last schedule of reinforcement (i.e. 4, 8, 16, …) completed for the behavior (exercise or sedentary). For both Pmax and the RRV score, greater scores indicated a greater reinforcing value, with scores exceeding 1.0 indicating increased exercise reinforcement. Includes questionnaire data regarding preference and tolerance for exercise intensity using the Preference for and Tolerance of the Intensity of Exercise Questionnaire (PRETIE-Q) and positive and negative outcome expectancy of exercise using the Outcome Expectancy Scale (OES). Includes data on height, weight, and BMI. Includes demographic data such as gender and race/ethnicity.

    Resources in this dataset:

    Resource Title: Actigraph activity data. File Name: AGData.csv. Resource Description: Includes data from the Actigraph accelerometer for each participant at baseline, 6 weeks, and 10 weeks.

    Resource Title: RRV Data. File Name: RRVData.csv. Resource Description: Includes data from the RRV at baseline, 6 weeks, and 10 weeks, OES survey data, PRETIE-Q survey data, and demographic data (gender, weight, height, race, ethnicity, and age).
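
    As an illustration only (the ratio below is an assumption made for this sketch, not a definition taken from the dataset documentation), a relative reinforcing value consistent with the description above can be computed as the ratio of the exercise breakpoint to the sedentary breakpoint, so that values above 1.0 indicate increased exercise reinforcement:

      def rrv_ratio(pmax_exercise, pmax_sedentary):
          # Hypothetical RRV score: exercise breakpoint relative to sedentary breakpoint.
          # Values above 1.0 indicate greater reinforcement from exercise.
          return pmax_exercise / pmax_sedentary

      # Assumed breakpoints on the 4, 8, 16, ... schedule of reinforcement
      print(rrv_ratio(16, 8))   # 2.0  -> exercise more reinforcing
      print(rrv_ratio(4, 16))   # 0.25 -> sedentary more reinforcing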

  4. Data from: Additive Hazards Regression Analysis of Massive Interval-Censored...

    • tandf.figshare.com
    pdf
    Updated May 12, 2025
    Cite
    Peiyao Huang; Shuwei Li; Xinyuan Song (2025). Additive Hazards Regression Analysis of Massive Interval-Censored Data via Data Splitting [Dataset]. http://doi.org/10.6084/m9.figshare.27103243.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 12, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Peiyao Huang; Shuwei Li; Xinyuan Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    With the rapid development of data acquisition and storage, massive datasets with large sample sizes are emerging at an increasing rate and make more advanced statistical tools urgently needed. To accommodate such big volumes in the analysis, a variety of methods have been proposed for complete or right-censored survival data. However, existing development of big data methodology has not attended to interval-censored outcomes, which are ubiquitous in cross-sectional or periodical follow-up studies. In this work, we propose an easily implemented divide-and-combine approach for analyzing massive interval-censored survival data under the additive hazards model. We establish the asymptotic properties of the proposed estimator, including consistency and asymptotic normality. In addition, the divide-and-combine estimator is shown to be asymptotically equivalent to the full-data-based estimator obtained from analyzing all data together. Simulation studies suggest that, relative to the full-data-based approach, the proposed divide-and-combine approach has a clear advantage in computation time, making it more applicable to large-scale data analysis. An application to a set of interval-censored data also demonstrates the practical utility of the proposed method.
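
    The divide-and-combine idea can be sketched as follows (a minimal Python illustration; ordinary least squares is used as a placeholder estimator, since the paper's additive hazards estimator for interval-censored data is not reproduced here):

      import numpy as np

      def divide_and_combine(X, y, n_blocks=10, seed=0):
          # Split the rows into blocks, estimate on each block, then average
          # the block-level estimates into a single combined estimator.
          rng = np.random.default_rng(seed)
          blocks = np.array_split(rng.permutation(len(y)), n_blocks)
          estimates = []
          for idx in blocks:
              beta_b, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
              estimates.append(beta_b)
          return np.mean(estimates, axis=0)

      # Simulated example with a large sample size
      rng = np.random.default_rng(1)
      X = rng.normal(size=(1_000_000, 3))
      y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=1_000_000)
      print(divide_and_combine(X, y))   # close to the full-data estimate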

  5. Data from: Data File Including 48 Datasets with Values Used in Figs. 2 and 3...

    • figshare.com
    zip
    Updated May 5, 2016
    Cite
    Hirotaka Uchitomi (2016). Data File Including 48 Datasets with Values Used in Figs. 2 and 3 in PLOS ONE Article [Dataset]. http://doi.org/10.6084/m9.figshare.3219655.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    May 5, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Hirotaka Uchitomi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data File Including 48 Datasets with Values Used in Figs. 2 and 3 in PLOS ONE Article. Thirty of these datasets are the measured raw time series data for stride interval in the PD subjects; the remaining 18 datasets are the measured raw time series data for stride interval in the control subjects. Each dataset is composed of a single comma-separated values (CSV) file, and each CSV file stores a data array of two columns: the first column is a time stamp in seconds, and the second column is the time series data of stride interval.
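
    A minimal loading sketch, assuming pandas and the two-column layout described above (the file name is hypothetical, the files are assumed to have no header row, and the stride interval is assumed to be in seconds):

      import pandas as pd

      # Hypothetical file name; columns follow the description above
      df = pd.read_csv("PD_subject_01.csv", header=None,
                       names=["time_s", "stride_interval_s"])
      mean_si = df["stride_interval_s"].mean()
      cv_si = df["stride_interval_s"].std() / mean_si   # coefficient of variation
      print(f"mean stride interval: {mean_si:.3f} s, CV: {cv_si:.3f}")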

  6. HRV-ACC: a dataset with R-R intervals and accelerometer data for the...

    • zenodo.org
    • data.niaid.nih.gov
    csv, txt, zip
    Updated Aug 9, 2023
    Cite
    Kamil Książek; Wilhelm Masarczyk; Przemysław Głomb; Michał Romaszewski; Iga Stokłosa; Piotr Ścisło; Paweł Dębski; Robert Pudlo; Piotr Gorczyca; Magdalena Piegza (2023). HRV-ACC: a dataset with R-R intervals and accelerometer data for the diagnosis of psychotic disorders using a Polar H10 wearable sensor [Dataset]. http://doi.org/10.5281/zenodo.8171266
    Explore at:
    Available download formats: txt, zip, csv
    Dataset updated
    Aug 9, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kamil Książek; Wilhelm Masarczyk; Przemysław Głomb; Michał Romaszewski; Iga Stokłosa; Piotr Ścisło; Paweł Dębski; Robert Pudlo; Piotr Gorczyca; Magdalena Piegza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT

    The issue of diagnosing psychotic diseases, including schizophrenia and bipolar disorder, in particular, the objectification of symptom severity assessment, is still a problem requiring the attention of researchers. Two measures that can be helpful in patient diagnosis are heart rate variability calculated based on electrocardiographic signal and accelerometer mobility data. The following dataset contains data from 30 psychiatric ward patients having schizophrenia or bipolar disorder and 30 healthy persons. The duration of the measurements for individuals was usually between 1.5 and 2 hours. R-R intervals necessary for heart rate variability calculation were collected simultaneously with accelerometer data using a wearable Polar H10 device. The Positive and Negative Syndrome Scale (PANSS) test was performed for each patient participating in the experiment, and its results were attached to the dataset. Furthermore, the code for loading and preprocessing data, as well as for statistical analysis, was included on the corresponding GitHub repository.

    BACKGROUND

    Heart rate variability (HRV), calculated based on electrocardiographic (ECG) recordings of R-R intervals stemming from the heart's electrical activity, may be used as a biomarker of mental illnesses, including schizophrenia and bipolar disorder (BD) [Benjamin et al]. The variations of R-R interval values correspond to the heart's autonomic regulation changes [Berntson et al, Stogios et al]. Moreover, the HRV measure reflects the activity of the sympathetic and parasympathetic parts of the autonomous nervous system (ANS) [Task Force of the European Society of Cardiology the North American Society of Pacing Electrophysiology, Matusik et al]. Patients with psychotic mental disorders show a tendency for a change in the centrally regulated ANS balance in the direction of less dynamic changes in the ANS activity in response to different environmental conditions [Stogios et al]. Larger sympathetic activity relative to the parasympathetic one leads to lower HRV, while, on the other hand, higher parasympathetic activity translates to higher HRV. This loss of dynamic response may be an indicator of mental health. Additional benefits may come from measuring the daily activity of patients using accelerometry. This may be used to register periods of physical activity and inactivity or withdrawal for further correlation with HRV values recorded at the same time.

    EXPERIMENTS

    In our experiment, the participants were 30 psychiatric ward patients with schizophrenia or BD and 30 healthy people. All measurements were performed using a Polar H10 wearable device. The sensor collects ECG recordings and accelerometer data and, additionally, performs detection of R wave peaks. Participants of the experiment had to wear the sensor for a given time. Typically this was between 1.5 and 2 hours, but the shortest recording was 70 minutes. During this time, evaluated persons could perform any activity a few minutes after starting the measurement. Participants were encouraged to undertake physical activity and, more specifically, to take a walk. As the patients were in the medical ward, they were instructed to take a walk in the corridors at the beginning of the experiment. They were to repeat the walk 30 minutes and 1 hour after the first walk, with the subsequent walks slightly longer (about 3, 5 and 7 minutes, respectively). We did not remind participants of, or supervise, this instruction during the experiment, in either the treatment or the control group. Seven persons from the control group did not receive this instruction; their measurements correspond to freely selected activities with rest periods, although at least three of them performed physical activities during this time. Nevertheless, at the start of the experiment, all participants were requested to rest in a sitting position for 5 minutes. Moreover, for each patient, the disease severity was assessed using the PANSS test, and its scores are attached to the dataset.

    The data from the sensors were collected using the Polar Sensor Logger application [Happonen]. These extracted measurements were then preprocessed and analyzed using code prepared by the authors of the experiment. It is publicly available in the GitHub repository [Książek et al].

    Firstly, we performed manual artifact detection to remove abnormal heartbeats due to non-sinus beats and technical issues of the device (e.g. temporary disconnections and inappropriate electrode readings). We also performed anomaly detection using the Daubechies wavelet transform. Nevertheless, the dataset includes raw data, while the full code necessary to reproduce our anomaly detection approach is available in the repository. Optionally, it is also possible to perform cubic spline data interpolation. After that step, rolling windows of a particular size and time intervals between them are created. Then, a statistical analysis is prepared, e.g. mean HRV calculation using the RMSSD (Root Mean Square of Successive Differences) approach, measuring the relationship between mean HRV and PANSS scores, mobility coefficient calculation based on accelerometer data, and verification of dependencies between HRV and mobility scores.

    DATA DESCRIPTION

    The structure of the dataset is as follows. One folder, called HRV_anonymized_data, contains values of R-R intervals together with timestamps for each experiment participant. The data were properly anonymized, i.e. the day of the measurement was removed to prevent person identification. Files concerning patients have the name treatment_X.csv, where X is the number of the person, while files related to the healthy controls are named control_Y.csv, where Y is the identification number of the person. Furthermore, for visualization purposes, an image of the raw R-R intervals for each participant is provided. Its name is raw_RR_{control,treatment}_N.png, where N is the number of the person from the control/treatment group. The collected data are raw, i.e. before the anomaly removal. The code enabling reproduction of the anomaly detection stage and removal of suspicious heartbeats is publicly available in the repository [Książek et al]. The structure of the consecutive files collecting R-R intervals is as follows:

    Phone timestamp     RR-interval [ms]
    12:43:26.538000     651
    12:43:27.189000     632
    12:43:27.821000     618
    12:43:28.439000     621
    12:43:29.060000     661
    ...                 ...

    The first column contains the timestamp for which the distance between two consecutive R peaks was registered. The corresponding R-R interval is presented in the second column of the file and is expressed in milliseconds.
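
    A minimal sketch, assuming pandas and the column layout shown above (the file name follows the treatment_X.csv convention; adjust the column name if the raw header differs), for loading one file and computing mean HRV as RMSSD, the measure mentioned in the EXPERIMENTS section:

      import numpy as np
      import pandas as pd

      rr = pd.read_csv("treatment_1.csv")["RR-interval [ms]"].to_numpy()
      diffs = np.diff(rr)                      # successive differences between R-R intervals, in ms
      rmssd = np.sqrt(np.mean(diffs ** 2))     # Root Mean Square of Successive Differences
      print(f"RMSSD: {rmssd:.1f} ms")
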
    The second folder, called accelerometer_anonymized_data, contains values of accelerometer data collected at the same time as the R-R intervals. The naming convention is similar to that of the R-R interval data: treatment_X.csv and control_X.csv represent the data coming from the persons from the treatment and control group, respectively, while X is the identification number of the selected participant. The numbers are exactly the same as for the R-R intervals. The structure of the files with accelerometer recordings is as follows:

    Phone timestamp     X [mg]    Y [mg]    Z [mg]
    13:00:17.196000     -961      -23       182
    13:00:17.205000     -965      -21       181
    13:00:17.215000     -966      -22       187
    13:00:17.225000     -967      -26       193
    13:00:17.235000     -965      -27       191
    ...                 ...       ...       ...

    The first column contains a timestamp, while the next three columns correspond to the currently registered acceleration in three axes: X, Y and Z, in milli-g unit.

    We also attached a file with the PANSS test scores (PANSS.csv) for all patients participating in the measurement. The structure of this file is as follows:

    no_of_person    PANSS_P    PANSS_N    PANSS_G    PANSS_total
    1               8          13         22         43
    2               11         7          18         36
    3               14         30         44         88
    4               18         13         27         58
    ...             ...        ...        ...        ...


    The first column contains the identification number of the patient, while the three following columns refer to the PANSS scores related to positive, negative and general symptoms, respectively.

    USAGE NOTES

    All the files necessary to run the HRV and/or accelerometer data analysis are available on the GitHub repository [Książek et al]. HRV data loading, preprocessing (i.e. anomaly detection and removal), and the subsequent statistical analysis can be reproduced with the scripts provided there.

  7. 10,000 RR Interval Data (9500NAF & 500PAF) from 24 h Holter recordings used...

    • figshare.com
    zip
    Updated Dec 13, 2024
    Cite
    Fan Lin; Xiaoyun Yang; Peng Zhang (2024). 10,000 RR Interval Data (9500NAF & 500PAF) from 24 h Holter recordings used for atrial fibrillation detection [Dataset]. http://doi.org/10.6084/m9.figshare.28000112.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 13, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Fan Lin; Xiaoyun Yang; Peng Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This RR interval dataset is derived from 10,000 cases of 24-hour Holter monitoring data sampled at 128 Hz. Among the cases, 9,500 are labeled as non-atrial fibrillation (NAF) and 500 as paroxysmal atrial fibrillation (PAF). These data have been used in the article "Clinician-AI Collaboration: A Win-Win Solution for Efficiency and Reliability in Atrial Fibrillation Diagnosis". Each case is a CSV file consisting of two columns: rr_interval, the interval between consecutive R-peaks, measured in milliseconds; and label, the categorical label for the beat, where 1 indicates AF, 0 indicates NAF, and -1 indicates noise or artifacts. Each case is named based on its category: NAF cases are labeled NAF0001.csv through NAF9500.csv, while PAF cases are labeled PAF0001.csv through PAF0500.csv. For any questions, please contact hustzp@hust.edu.cn.
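
    A minimal sketch, assuming pandas and the two columns described above (a header row is assumed to be present in each CSV), for loading one case and summarising its beat labels:

      import pandas as pd

      df = pd.read_csv("NAF0001.csv")              # columns: rr_interval, label
      print(df["label"].value_counts())            # 1 = AF, 0 = NAF, -1 = noise/artifacts
      af_burden = (df["label"] == 1).mean()        # fraction of beats labelled as AF
      print(f"AF burden: {af_burden:.3%}")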

  8. Performance Measure Definition: Stroke Alert Call-to-Door Interval

    • catalog.data.gov
    • s.cnmilf.com
    Updated Jun 25, 2024
    Cite
    data.austintexas.gov (2024). Performance Measure Definition: Stroke Alert Call-to-Door Interval [Dataset]. https://catalog.data.gov/dataset/performance-measure-definition-stroke-alert-call-to-door-interval
    Explore at:
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    data.austintexas.gov
    Description

    Performance Measure Definition: Stroke Alert Call-to-Door Interval

  9. Data from: Confidence and Prediction in Linear Mixed Models: Do Not...

    • tandf.figshare.com
    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Bernard G. Francq; Dan Lin; Walter Hoyer (2023). Confidence and Prediction in Linear Mixed Models: Do Not Concatenate the Random Effects. Application in an Assay Qualification Study [Dataset]. http://doi.org/10.6084/m9.figshare.12410729.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Bernard G. Francq; Dan Lin; Walter Hoyer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: In the pharmaceutical industry, all analytical methods must be shown to deliver unbiased and precise results. In an assay qualification or validation study, the trueness, accuracy, and intermediate precision are usually assessed by comparing the measured concentrations to their nominal levels. Trueness is assessed by using confidence intervals (CIs) of the mean measured concentration, accuracy by prediction intervals (PIs) for a future measured concentration, and intermediate precision by the total variance. ICH and USP guidelines alike request that all relevant sources of variability be studied, for example, the effect of different technicians, the day-to-day variability, or the use of multiple reagent lots. Those different random effects must be modeled as crossed, nested, or a combination of both, yet concatenating them to simplify the model is common practice. This article compares this simplified approach to a mixed model with the actual design. Our simulation study shows an under-estimation of the intermediate precision and, therefore, a substantial narrowing of the CI and PI. The power for accuracy or trueness is consequently over-estimated when designing a new study. Two real datasets from an assay validation study during vaccine development are used to illustrate the impact of such concatenation of random variables.
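
    To make the distinction between a CI for the mean (trueness) and a PI for a future measurement (accuracy) concrete, here is a short sketch using the standard single-sample normal-theory formulas (a simplification; the article's intervals come from mixed models with crossed and nested random effects, and the example values are hypothetical):

      import numpy as np
      from scipy import stats

      x = np.array([98.5, 101.2, 99.8, 100.4, 97.9, 102.1])   # measured concentrations, % of nominal
      n, mean, s = len(x), x.mean(), x.std(ddof=1)
      t = stats.t.ppf(0.975, df=n - 1)

      ci = (mean - t * s / np.sqrt(n), mean + t * s / np.sqrt(n))                    # 95% CI for the mean
      pi = (mean - t * s * np.sqrt(1 + 1 / n), mean + t * s * np.sqrt(1 + 1 / n))    # 95% PI for a future value
      print(f"95% CI: {ci}")
      print(f"95% PI: {pi}")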

  10. League of Legends Match Data at Various Time Intervals

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Aug 31, 2023
    Cite
    Claudio Campelo (2023). League of Legends Match Data at Various Time Intervals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8303396
    Explore at:
    Dataset updated
    Aug 31, 2023
    Dataset provided by
    Jailson Barros da Silva Junior
    Claudio Campelo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises comprehensive information from ranked matches played in the game League of Legends, spanning the time frame between January 12, 2023, and May 18, 2023. The matches cover a wide range of skill levels, specifically from the Iron tier to the Diamond tier.

    The dataset is structured based on time intervals, presenting game data at various percentages of elapsed game time, including 20%, 40%, 60%, 80%, and 100%. For each interval, detailed match statistics, player performance metrics, objective control, gold distribution, and other vital in-game information are provided.

    This collection of data not only offers insights into how matches evolve and strategies change over different phases of the game but also enables the exploration of player behavior and decision-making as matches progress. Researchers and analysts in the field of esports and game analytics will find this dataset valuable for studying trends, developing predictive models, and gaining a deeper understanding of the dynamics within ranked League of Legends matches across different skill tiers.

  11. Performance Measure Definition: STEMI Alert Call-to-Door Interval

    • catalog.data.gov
    • s.cnmilf.com
    Updated Jun 25, 2024
    Cite
    data.austintexas.gov (2024). Performance Measure Definition: STEMI Alert Call-to-Door Interval [Dataset]. https://catalog.data.gov/dataset/performance-measure-definition-stemi-alert-call-to-door-interval
    Explore at:
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    data.austintexas.gov
    Description

    Performance Measure Definition: STEMI Alert Call-to-Door Interval

  12. TABLE 3.9: Perinatal Statistics Report 2014: Interval in Years Since Last...

    • data.gov.ie
    Cite
    data.gov.ie, TABLE 3.9: Perinatal Statistics Report 2014: Interval in Years Since Last Birth: Total Births, Live Births, Mortality Rates, and Maternities, 2014 - Dataset - data.gov.ie [Dataset]. https://data.gov.ie/dataset/014-interval-in-years-since-last-birth-total-births-live-births-mortality-rates-and-matern-2014
    Explore at:
    Dataset provided by
    data.gov.ie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Presents the distribution of TOTAL, SINGLETON AND MULTIPLE births for 2014 by Interval in Years Since Last Birth. This table only includes women having second and subsequent births; primiparous women (i.e. women who have had no previous pregnancy resulting in a live birth or stillbirth) are not included. The table outlines data for total births, live births, stillbirths, early neonatal deaths and perinatal mortality rates, as well as presenting the number of maternities. The Perinatal Statistics Report 2014 is a report on national data on perinatal events in 2014. Information on every birth in the Republic of Ireland is submitted to the National Perinatal Reporting System (NPRS). All births are notified and registered on a standard four-part birth notification form (BNF01) which is completed where the birth takes place. Part 3 of this form is sent to the HPO for data entry and validation. The information collected includes data on pregnancy outcomes (with particular reference to perinatal mortality and important aspects of perinatal care), as well as descriptive social and biological characteristics of mothers giving birth. See the complete Perinatal Statistics Report 2014 at http://www.hpo.ie/latest_hipe_nprs_reports/NPRS_2014/Perinatal_Statistics_Report_2014.pdf

  13. Companion datasets for "A random forest approach for interval selection in...

    • entrepot.recherche.data.gouv.fr
    Updated Jun 20, 2024
    Cite
    Nathalie Vialaneix; Rémi Servien (2024). Companion datasets for "A random forest approach for interval selection in functional regression" [Dataset]. http://doi.org/10.57745/KMH2GP
    Explore at:
    Available download formats: html(1493540), tsv(22844), application/x-r-data(1138), bin(35060), application/x-r-data(1334), zip(20094150), html(993342), application/x-r-data(2917), application/x-r-data(1793830), application/x-r-data(2190), html(1162896), zip(86691205)
    Dataset updated
    Jun 20, 2024
    Dataset provided by
    Recherche Data Gouv
    Authors
    Nathalie Vialaneix; Rémi Servien
    License

    https://spdx.org/licenses/etalab-2.0.html

    Dataset funded by
    INRAE, DIGIT-BIO
    Description

    Companion datasets for the article "A random forest approach for interval selection in functional regression". This dataverse contains: simulated data, obtained from simulated meteorological data (WACSGen) and simulated agronomic data (STICS); real data, on truffle production. The dataverse contains raw and processed data as well as all the scripts that were used to process the original raw data.

  14. Data from: Real data example.

    • plos.figshare.com
    xlsx
    Updated Dec 13, 2024
    Cite
    Jia Wang; Lili Tian; Li Yan (2024). Real data example. [Dataset]. http://doi.org/10.1371/journal.pone.0314705.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Dec 13, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jia Wang; Lili Tian; Li Yan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In genomic studies, log transformation is a common preprocessing step to adjust for skewness in data. This standard approach often assumes that log-transformed data are normally distributed, and the two-sample t-test (or its modifications) is used for detecting differences between two experimental conditions. However, it was recently shown that the two-sample t-test can lead to exaggerated false positives, and the Wilcoxon-Mann-Whitney (WMW) test was proposed as an alternative for studies with larger sample sizes. In addition, studies have demonstrated that the specific distribution used in modeling genomic data has a profound impact on the interpretation and validity of results. The aim of this paper is three-fold: 1) to present the Exp-gamma distribution (exponential-gamma distribution, i.e. the log-transformed gamma distribution) as a proper biological and statistical model for the analysis of log-transformed protein abundance data from single-cell experiments; 2) to demonstrate the inappropriateness of the two-sample t-test and the WMW test in analyzing log-transformed protein abundance data; 3) to propose and evaluate statistical inference methods for hypothesis testing and confidence interval estimation when comparing two independent samples under Exp-gamma distributions. The proposed methods are applied to analyze protein abundance data from a single-cell dataset.
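
    A quick illustration of the distribution named above (a sketch only; the shape and scale parameters are arbitrary, and the paper's inference procedures are not reproduced here): an Exp-gamma sample is obtained by log-transforming gamma draws.

      import numpy as np

      rng = np.random.default_rng(42)
      gamma_draws = rng.gamma(shape=2.0, scale=1.5, size=10_000)   # arbitrary parameters for illustration
      expgamma_draws = np.log(gamma_draws)                          # Exp-gamma: log-transformed gamma
      print(f"mean: {expgamma_draws.mean():.3f}, SD: {expgamma_draws.std(ddof=1):.3f}")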

  15. Wind Generation Time Interval Exploration Data

    • data.ca.gov
    • data.cnra.ca.gov
    Updated Jan 19, 2024
    Cite
    California Energy Commission (2024). Wind Generation Time Interval Exploration Data [Dataset]. https://data.ca.gov/dataset/wind-generation-time-interval-exploration-data
    Explore at:
    Available download formats: zip, gpkg, gdb, arcgis geoservices rest api, kml, geojson, csv, html, xlsx, txt
    Dataset updated
    Jan 19, 2024
    Dataset authored and provided by
    California Energy Commission (http://www.energy.ca.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the data set behind the Wind Generation Interactive Query Tool created by the CEC. The visualization tool interactively displays wind generation over different time intervals in three-dimensional space. The viewer can look across the state to understand generation patterns of regions with concentrations of wind power plants. The tool aids in understanding high and low periods of generation. Operation of the electric grid requires that generation and demand are balanced in each period.



    The height and color of columns at wind generation areas are scaled and shaded to represent capacity factors (CFs) of the areas in a specific time interval. Capacity factor is the ratio of the energy produced to the amount of energy that could ideally have been produced in the same period using the rated nameplate capacity. Due to natural variations in wind speeds, higher factors tend to be seen over short time periods, with lower factors over longer periods. The capacity used is the reported nameplate capacity from the Quarterly Fuel and Energy Report, CEC-1304A. CFs are based on wind plants in service in the wind generation areas.
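
    The capacity factor definition above reduces to a one-line calculation (the values below are assumptions for illustration, not CEC data):

      def capacity_factor(energy_mwh, nameplate_mw, hours):
          # Ratio of energy produced to the energy the nameplate capacity could ideally produce in the same period
          return energy_mwh / (nameplate_mw * hours)

      # Hypothetical wind generation area: 120 MW nameplate producing 26,280 MWh over a 720-hour month
      print(capacity_factor(26_280, 120, 720))   # ~0.30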

    Renewable energy resources like wind facilities vary in size and geographic distribution within each state. Resource planning, land use constraints, climate zones, and weather patterns limit availability of these resources and where they can be developed. National, state, and local policies also set limits on energy generation and use. An example of resource planning in California is the Desert Renewable Energy Conservation Plan.

    By exploring the visualization, a viewer can gain a three-dimensional understanding of temporal variation in generation CFs, along with how the wind generation areas compare to one another. The viewer can observe that areas peak in generation in different periods. The large range in CFs is also visible.



  16. Estimating Confidence Intervals for 2020 Census Statistics Using Approximate...

    • registry.opendata.aws
    Updated Aug 5, 2024
    Cite
    United States Census Bureau (2024). Estimating Confidence Intervals for 2020 Census Statistics Using Approximate Monte Carlo Simulation (2010 Census Proof of Concept) [Dataset]. https://registry.opendata.aws/census-2010-amc-mdf-replicates/
    Explore at:
    Dataset updated
    Aug 5, 2024
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The 2010 Census Production Settings Demographic and Housing Characteristics (DHC) Approximate Monte Carlo (AMC) method seed Privacy Protected Microdata File (PPMF0) and PPMF replicates (PPMF1, PPMF2, ..., PPMF25) are a set of microdata files intended for use in estimating the magnitude of error(s) introduced by the 2020 Decennial Census Disclosure Avoidance System (DAS) into the Redistricting and DHC products. The PPMF0 was created by executing the 2020 DAS TopDown Algorithm (TDA) using the confidential 2010 Census Edited File (CEF) as the initial input; the replicates were then created by executing the 2020 DAS TDA repeatedly with the PPMF0 as its initial input. Inspired by analogy to the use of bootstrap methods in non-private contexts, U.S. Census Bureau (USCB) researchers explored whether simple calculations based on comparing each PPMFi to the PPMF0 could be used to reliably estimate the scale of errors introduced by the 2020 DAS, and generally found this approach worked well.
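
    A minimal sketch of the replicate-comparison idea described above (an illustration only, not the USCB estimation methodology; see the Ashmead et al. reference below for the actual procedure): compute a statistic on the seed PPMF0 and on each replicate, and use the spread of the replicate-minus-seed differences as a rough estimate of the scale of DAS-introduced error.

      import numpy as np

      def error_scale(stat_seed, stat_replicates):
          # Root-mean-square difference between replicate statistics and the seed statistic
          diffs = np.asarray(stat_replicates, dtype=float) - stat_seed
          return np.sqrt(np.mean(diffs ** 2))

      # Hypothetical block-level population counts from PPMF0 and a few replicates
      seed_count = 1204
      replicate_counts = [1198, 1211, 1207, 1195, 1209, 1202, 1206, 1200]
      print(error_scale(seed_count, replicate_counts))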

    The PPMF0 and PPMFi files contained here are provided so that external researchers can estimate properties of DAS-introduced error without privileged access to internal USCB-curated data sets; further information on the estimation methodology can be found in Ashmead et al. (2024).

    The 2010 DHC AMC seed PPMF0 and PPMF replicates have been cleared for public dissemination by the USCB Disclosure Review Board (CBDRB-FY24-DSEP-0002). The 2010 PPMF0 included in these files was produced using the same parameters and settings as were used to produce the 2010 Demonstration Data Product Suite (2023-04-03) PPMF, but represents an independent execution of the TopDown Algorithm. The PPMF0 and PPMF replicates contain all Person and Units attributes necessary to produce the Redistricting and DHC publications for both the United States and Puerto Rico, and include geographic detail down to the Census Block level. They do not include attributes specific to either the Detailed DHC-A or Detailed DHC-B products; in particular, data on Major Race (e.g., White Alone) is included, but data on Detailed Race (e.g., Cambodian) is not included in the PPMF0 and replicates.

    The 2020 AMC replicate files for estimating confidence intervals for the official 2020 Census statistics are available.

  17. Descriptive statistics of the dataset with mean, standard deviation (SD),...

    • plos.figshare.com
    xls
    Updated Jun 14, 2023
    Cite
    Achim Langenbucher; Nóra Szentmáry; Alan Cayless; Jascha Wendelstein; Peter Hoffmann (2023). Descriptive statistics of the dataset with mean, standard deviation (SD), median, and the lower (quantile 5%) and upper (quantile 95%) boundary of the 90% confidence interval. [Dataset]. http://doi.org/10.1371/journal.pone.0267352.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Achim Langenbucher; Nóra Szentmáry; Alan Cayless; Jascha Wendelstein; Peter Hoffmann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistics of the dataset with mean, standard deviation (SD), median, and the lower (quantile 5%) and upper (quantile 95%) boundary of the 90% confidence interval.
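
    A minimal sketch, assuming numpy and an arbitrary example vector, reproducing the summary statistics listed above:

      import numpy as np

      values = np.array([21.9, 22.4, 23.1, 23.5, 24.0, 24.6, 25.2])   # hypothetical measurements
      summary = {
          "mean": values.mean(),
          "SD": values.std(ddof=1),
          "median": np.median(values),
          "quantile 5%": np.quantile(values, 0.05),    # lower boundary of the 90% interval
          "quantile 95%": np.quantile(values, 0.95),   # upper boundary of the 90% interval
      }
      print(summary)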

  18. 🚌 | Public Transport Traffic in Minsk

    • kaggle.com
    Updated Jan 5, 2022
    Cite
    l3LlFF (2022). 🚌 | Public Transport Traffic in Minsk [Dataset]. https://www.kaggle.com/l3llff/traffic-data-in-minsk/activity
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Jan 5, 2022
    Dataset provided by
    Kaggle
    Authors
    l3LlFF
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Minsk
    Description

    Traffic overview image: https://github.com/l3LlFF/minsktrans_parser/blob/master/images/traffic.png?raw=true

    Where did the data come from?

    Data was collected from real vehicles at one-minute intervals.

    Content

    This dataset represents information about public transport movement over time.

  19. Interval censored regression with fixed effects (replication data)

    • jda-test.zbw.eu
    • journaldata.zbw.eu
    .rmd, csv, r, txt
    Updated Jul 22, 2024
    Cite
    Jason Abrevaya; Chris Muris (2024). Interval censored regression with fixed effects (replication data) [Dataset]. https://jda-test.zbw.eu/dataset/interval-censored-regression-with-fixed-effects
    Explore at:
    Available download formats: txt(3460), csv(4118642), .rmd(2070), .rmd(3797), .rmd(2506), r(5699)
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Jason Abrevaya; Chris Muris
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper considers identification and estimation of a fixed-effects model with an interval-censored dependent variable. In each time period, the researcher observes the interval (with known endpoints) in which the dependent variable lies but not the value of the dependent variable itself. Two versions of the model are considered: a parametric model with logistic errors and a semiparametric model with errors having an unspecified distribution. In both cases, the error disturbances can be heteroskedastic over cross-sectional units as long as they are stationary within a cross-sectional unit; the semiparametric model also allows for serial correlation of the error disturbances. A conditional-logit-type composite likelihood estimator is proposed for the logistic fixed-effects model, and a composite maximum-score-type estimator is proposed for the semiparametric model. In general, the scale of the coefficient parameters is identified by these estimators, meaning that the causal effects of interest are estimated directly in cases where the latent dependent variable is of primary interest (e.g., pure data-coding situations). Monte Carlo simulations and an empirical application to birthweight outcomes illustrate the performance of the parametric estimator.

  20. Detailed analysis of data generation procedure.

    • plos.figshare.com
    xls
    Updated Nov 16, 2023
    Cite
    Insoo Kim; Junhee Seok; Yoojoong Kim (2023). Detailed analysis of data generation procedure. [Dataset]. http://doi.org/10.1371/journal.pone.0294513.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Nov 16, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Insoo Kim; Junhee Seok; Yoojoong Kim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Traditionally, datasets with multiple censored time-to-events have not been utilized in multivariate analysis because of their high level of complexity. In this paper, we propose the Censored Time Interval Analysis (CTIVA) method to address this issue. It estimates the joint probability distribution of actual event times in the censored dataset by applying a statistical probability density estimation technique to the dataset. Based on the acquired event times, CTIVA investigates variables correlated with the interval time of events via statistical tests. The proposed method handles both categorical and continuous variables simultaneously, so it is suitable for application to real-world censored time-to-event datasets, which include both categorical and continuous variables. CTIVA outperforms traditional censored time-to-event data handling methods by 5% on simulation data. The average area under the curve (AUC) of the proposed method on the simulation dataset exceeds 0.9 under various conditions. Further, CTIVA yields novel results on the National Sample Cohort Demo (NSCD) and the proteasome inhibitor bortezomib dataset, real-world censored time-to-event datasets of the medical history of beneficiaries provided by the National Health Insurance Sharing Service (NHISS) and the National Center for Biotechnology Information (NCBI). We believe that the development of CTIVA is a milestone in the investigation of variables correlated with the interval time of events in the presence of censoring.
