46 datasets found
  1. UCI dataset

    • springernature.figshare.com
    bin
    Updated Mar 13, 2023
    Cite
    Wan-Ting Hsieh; Sergio González Vázquez; Trista Chen (2023). UCI dataset [Dataset]. http://doi.org/10.6084/m9.figshare.20496258.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Mar 13, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Wan-Ting Hsieh; Sergio González Vázquez; Trista Chen
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This is the Cuff-Less Blood Pressure Estimation Dataset [2] from the UCI Machine Learning Repository. It is a subset of the MIMIC-II Waveform Dataset and contains 12000 records of simultaneous PPG and ABP signals from 942 patients, sampled at 125 Hz. The 12000 records were split uniformly into four parts of 3000 records each. However, because subject information is lacking, a hold-one-out strategy was used to generate the training, validation, and test sets once the data had been preprocessed. In the end, the UCI dataset had 291,078 segments, or around 404 hours of recording, making it substantially the largest data set, with a considerably higher ratio of continuous segments per record (32.15).

    [2] Kachuee, M., Kiani, M. M., Mohammadzade, H. & Shabany, M. Cuff-less blood pressure estimation data set (2015). UCI repository https://archive.ics.uci.edu/ml/datasets/Cuff-Less+Blood+Pressure+Estimation.
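
    As a rough consistency check on the figures in this description, the arithmetic can be sketched as follows. The sketch assumes that all segments have the same length and that "around 404 hours" is a rounded value; neither is stated in the description, and the variable names are illustrative.

```python
# Figures quoted in the description above.
segments = 291_078
hours = 404
sampling_rate_hz = 125
segments_per_record = 32.15

# Assumption: every segment has the same duration.
seconds_per_segment = hours * 3600 / segments
samples_per_segment = seconds_per_segment * sampling_rate_hz

# Number of records implied by the quoted segments-per-record ratio.
implied_records = segments / segments_per_record

print(f"{seconds_per_segment:.2f} s per segment")
print(f"{samples_per_segment:.0f} samples per segment")
print(f"{implied_records:.0f} records implied by the ratio")
```

    Under these assumptions each segment is about 5 seconds long. The quoted ratio implies fewer records than the original 12000, which would be consistent with some records being dropped during preprocessing; that reading is an inference, not something the description states.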

  2. UCI and OpenML Data Sets for Ordinal Quantification

    • zenodo.org
    zip
    Updated Jul 25, 2023
    Cite
    Mirko Bunse; Alejandro Moreo; Fabrizio Sebastiani; Martin Senz (2023). UCI and OpenML Data Sets for Ordinal Quantification [Dataset]. http://doi.org/10.5281/zenodo.8177302
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mirko Bunse; Alejandro Moreo; Fabrizio Sebastiani; Martin Senz
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.

    With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.
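
    To illustrate how ordinal classes can arise from binning a continuous label, here is a minimal Python sketch. It assumes equal-frequency (quantile) bins; the description does not say which binning rule the extraction scripts use, and the function name and toy values are hypothetical.

```python
import bisect
import statistics

def bin_labels(values, n_bins):
    """Assign each continuous value to an ordinal class 0 .. n_bins - 1."""
    # Assumption: cut points are the interior quantiles (equal-frequency bins).
    edges = statistics.quantiles(values, n=n_bins, method="inclusive")
    # bisect_right counts how many cut points lie at or below the value.
    return [bisect.bisect_right(edges, v) for v in values]

targets = [0.5, 1.2, 1.9, 2.4, 3.3, 4.1, 5.0, 6.8, 7.7, 9.9]
classes = bin_labels(targets, n_bins=5)
print(classes)
```

    Because the bins follow the order of the regression target, neighboring classes cover neighboring value ranges, which is what makes the resulting labels ordinal.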

    We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.

    Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.
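
    The two protocols can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation (which is written in Julia and linked under Further Reading): label distributions are drawn uniformly from the unit simplex, and "smoothness" is measured here by the sum of squared second differences, a measure this sketch assumes because the description does not define it. All function names are hypothetical.

```python
import random

def sample_prevalences(n_classes, rng):
    """Draw one label distribution uniformly from the unit simplex."""
    # Normalized Exp(1) draws follow a flat Dirichlet(1, ..., 1) distribution.
    draws = [rng.expovariate(1.0) for _ in range(n_classes)]
    total = sum(draws)
    return [d / total for d in draws]

def roughness(p):
    """Sum of squared second differences; lower means smoother (assumed measure)."""
    return sum((-p[i - 1] + 2 * p[i] - p[i + 1]) ** 2 for i in range(1, len(p) - 1))

def app_samples(n_samples, n_classes, seed=0):
    """APP: every label distribution is equally likely."""
    rng = random.Random(seed)
    return [sample_prevalences(n_classes, rng) for _ in range(n_samples)]

def app_oq_samples(samples, keep=0.2):
    """APP-OQ: keep only the smoothest fraction of the APP samples."""
    n_keep = round(len(samples) * keep)
    return sorted(samples, key=roughness)[:n_keep]

samples = app_samples(1000, 5)
smooth = app_oq_samples(samples)
print(len(samples), len(smooth))
```

    In an actual evaluation, each drawn distribution would then determine how many data items of each class go into one sample; the index files described above record exactly which items were drawn.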

    Usage

    You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.

    Preliminaries: You have to have a working Julia installation. We have used Julia v1.6.5 in our experiments.

    Data Extraction: In your terminal, you can call either

    make

    (recommended), or

    julia --project="." --eval "using Pkg; Pkg.instantiate()"
    julia --project="." extract-oq.jl

    Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.

    Further Reading

    Implementation of our experiments: https://github.com/mirkobunse/regularized-oq

  3. heart-disease-data

    • kaggle.com
    zip
    Updated Aug 5, 2020
    Cite
    Nagaveda Reddy (2020). heart-disease-data [Dataset]. https://www.kaggle.com/nagavedareddy/heartdiseasedata
    Explore at:
    Available download formats: zip (3494 bytes)
    Dataset updated
    Aug 5, 2020
    Authors
    Nagaveda Reddy
    Description

    Dataset

    This dataset was created by Nagaveda Reddy


  4. Repository URL

    • datadiscoverystudio.org
    resource url
    Updated 2007
    Cite
    (2007). Repository URL [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/e6c8a4dd24884eef88fc0d51dae09862/html
    Explore at:
    Available download formats: resource url
    Dataset updated
    2007
    Description

    Link Function: information

  5. Basic information on 40 datasets from UCI repository used in this study...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Gregor Stiglic; Simon Kocbek; Igor Pernek; Peter Kokol (2023). Basic information on 40 datasets from UCI repository used in this study including information about number of instances, attributes, classes, length of longest attribute name (LAN) and length of the longest nominal attribute value (LAV). [Dataset]. http://doi.org/10.1371/journal.pone.0033812.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Gregor Stiglic; Simon Kocbek; Igor Pernek; Peter Kokol
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Basic information on 40 datasets from UCI repository used in this study including information about number of instances, attributes, classes, length of longest attribute name (LAN) and length of the longest nominal attribute value (LAV).

  6. Heart Disease

    • kaggle.com
    Updated Oct 3, 2021
    + more versions
    Cite
    Zhou Xing (2021). Heart Disease [Dataset]. https://kaggle.com/zhoumeixing/heart-disease-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 3, 2021
    Dataset provided by
    Kaggle
    Authors
    Zhou Xing
    Description

    Context

    This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by Machine Learning researchers to this date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0).

    Source: https://archive.ics.uci.edu/ml/datasets/heart+disease
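
    The presence/absence split described above can be written as a small Python sketch. The function name and the toy list of values are illustrative and do not come from the dataset.

```python
def binarize_goal(goal):
    """Map the integer 'goal' field (0-4) to absence (0) or presence (1)."""
    if goal not in (0, 1, 2, 3, 4):
        raise ValueError(f"goal must be an integer from 0 to 4, got {goal!r}")
    # Value 0 means no heart disease; values 1, 2, 3, 4 all count as presence.
    return 0 if goal == 0 else 1

goals = [0, 2, 0, 4, 1, 3]
labels = [binarize_goal(g) for g in goals]
print(labels)
```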

  7. ionosphere

    • opendatalab.com
    • paperswithcode.com
    zip
    Updated Aug 28, 2022
    Cite
    Nanjing University (2022). ionosphere [Dataset]. https://opendatalab.com/OpenDataLab/ionosphere
    Explore at:
    Available download formats: zip (58517 bytes)
    Dataset updated
    Aug 28, 2022
    Dataset provided by
    Nanjing University
    Monash University
    Description

    The original ionosphere dataset from the UCI machine learning repository is a binary classification dataset with dimensionality 34. One attribute has all-zero values and is discarded, so the total number of dimensions is 33. The ‘bad’ class is treated as the outlier class and the ‘good’ class as the inlier class.
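
    The preprocessing described here can be sketched in a few lines of Python. The toy rows and function names are illustrative, and encoding outliers as 1 and inliers as 0 is an assumption of this sketch rather than something the description specifies.

```python
def drop_all_zero_columns(rows):
    """Remove every column whose value is zero in all rows."""
    keep = [j for j in range(len(rows[0])) if any(row[j] != 0 for row in rows)]
    return [[row[j] for j in keep] for row in rows]

def outlier_labels(classes):
    """Encode 'bad' as outlier (1) and 'good' as inlier (0); the 0/1 coding is assumed."""
    return [1 if c == "bad" else 0 for c in classes]

# Toy data: the middle attribute is zero everywhere and should be discarded.
rows = [[1.0, 0.0, 0.5], [0.2, 0.0, -0.3], [0.9, 0.0, 0.1]]
classes = ["good", "bad", "good"]

reduced = drop_all_zero_columns(rows)
labels = outlier_labels(classes)
print(reduced, labels)
```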

  8. Regensburg Pediatric Appendicitis

    • kaggle.com
    zip
    Updated Jan 15, 2024
    Cite
    Kukku (2024). Regensburg Pediatric Appendicitis [Dataset]. https://www.kaggle.com/datasets/kukkuyouseff19ma118/regensburg-pediatric-appendicitis/code
    Explore at:
    Available download formats: zip (533431290 bytes)
    Dataset updated
    Jan 15, 2024
    Authors
    Kukku
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    Regensburg
    Description

    Dataset

    This dataset was created by Kukku

    Released under Apache 2.0


  9. Online Shoppers Purchasing Intention Dataset

    • ieee-dataport.org
    Updated Jan 9, 2025
    Cite
    C. O. Sakar (2025). Online Shoppers Purchasing Intention Dataset [Dataset]. http://doi.org/10.21227/e73k-cd23
    Explore at:
    Dataset updated
    Jan 9, 2025
    Dataset provided by
    IEEE Dataport
    Authors
    C. O. Sakar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The dataset consists of feature vectors belonging to 12,330 sessions. The dataset was formed so that each session would belong to a different user in a 1-year period to avoid any tendency to a specific campaign, special day, user profile, or period. Of the 12,330 sessions in the dataset, 84.5% (10,422) were negative class samples that did not end with shopping, and the rest (1908) were positive class samples ending with shopping.

    The dataset consists of 10 numerical and 8 categorical attributes. The 'Revenue' attribute can be used as the class label.

    The dataset contains 18 columns, each representing specific attributes of online shopping behavior:

    Administrative and Administrative_Duration: Number of pages visited and time spent on administrative pages.
    Informational and Informational_Duration: Number of pages visited and time spent on informational pages.
    ProductRelated and ProductRelated_Duration: Number of pages visited and time spent on product-related pages.
    BounceRates and ExitRates: Metrics indicating user behavior during the session.
    PageValues: Value of the page based on e-commerce metrics.
    SpecialDay: Likelihood of shopping based on special days.
    Month: Month of the session.
    OperatingSystems, Browser, Region, TrafficType: Technical and geographical attributes.
    VisitorType: Categorizes users as returning, new, or others.
    Weekend: Indicates if the session occurred on a weekend.
    Revenue: Target variable indicating whether a transaction was completed (True or False).

    The original dataset was taken from the UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/468/online+shoppers+purchasing+intention+dataset

    Additional Variable Information

    "Administrative", "Administrative Duration", "Informational", "Informational Duration", "Product Related" and "Product Related Duration" represent the number of different types of pages visited by the visitor in that session and the total time spent in each of these page categories. The values of these features are derived from the URL information of the pages visited by the user and updated in real time when a user takes an action, e.g. moving from one page to another.

    The "Bounce Rate", "Exit Rate" and "Page Value" features represent the metrics measured by "Google Analytics" for each page in the e-commerce site. The value of the "Bounce Rate" feature for a web page refers to the percentage of visitors who enter the site from that page and then leave ("bounce") without triggering any other requests to the analytics server during that session. The value of the "Exit Rate" feature for a specific web page is calculated as the percentage of all pageviews of that page that were the last in the session. The "Page Value" feature represents the average value for a web page that a user visited before completing an e-commerce transaction.

    The "Special Day" feature indicates the closeness of the site visiting time to a specific special day (e.g. Mother’s Day, Valentine's Day) in which the sessions are more likely to be finalized with a transaction. The value of this attribute is determined by considering the dynamics of e-commerce such as the duration between the order date and delivery date. For example, for Valentine's Day, this value takes a nonzero value between February 2 and February 12, zero before and after this date unless it is close to another special day, and its maximum value of 1 on February 8.

    The dataset also includes operating system, browser, region, traffic type, visitor type as returning or new visitor, a Boolean value indicating whether the date of the visit is a weekend, and month of the year.

  10. Credit Default Data Set from UCI Repository

    • kaggle.com
    zip
    Updated Oct 30, 2018
    + more versions
    Cite
    somaktukai (2018). Credit Default Data Set from UCI Repository [Dataset]. https://www.kaggle.com/datasets/somaktukai/credit-default-data-set-from-uci-repository
    Explore at:
    Available download formats: zip (1464078 bytes)
    Dataset updated
    Oct 30, 2018
    Authors
    somaktukai
    Description

    Dataset

    This dataset was created by somaktukai


  11. Details of the datasets from UCI repository used in the experiments.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    QingJun Song; HaiYan Jiang; Qinghui Song; XieGuang Zhao; Xiaoxuan Wu (2023). Details of the datasets from UCI repository used in the experiments. [Dataset]. http://doi.org/10.1371/journal.pone.0184834.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    QingJun Song; HaiYan Jiang; Qinghui Song; XieGuang Zhao; Xiaoxuan Wu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Details of the datasets from UCI repository used in the experiments.

  12. Replication Data for: Nursery Data Set

    • dataverse.harvard.edu
    Updated Apr 5, 2018
    Cite
    Wenjuan Wang (2018). Replication Data for: Nursery Data Set [Dataset]. http://doi.org/10.7910/DVN/MBFQK0
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 5, 2018
    Dataset provided by
    Harvard Dataverse
    Authors
    Wenjuan Wang
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset was downloaded from the UCI repository (https://archive.ics.uci.edu/ml/datasets/nursery). It contains categorical data used to rank nursery school applicants. The original dataset contains 5 classes; these were reorganized so that only two classes remain ("recommended" or "not recommended").

  13. DodgerLoopGame UCR Archive Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 14, 2024
    Cite
    University of Southampton (2024). DodgerLoopGame UCR Archive Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11186627
    Explore at:
    Dataset updated
    May 14, 2024
    Dataset provided by
    University of California (http://universityofcalifornia.edu/)
    University of Southampton
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset is part of the UCR Archive maintained by University of Southampton researchers. Please cite a relevant or the latest full archive release if you use the datasets. See http://www.timeseriesclassification.com/.

    The traffic data were collected with a loop sensor installed on a ramp of the 101 North freeway in Los Angeles. This location is close to Dodgers Stadium, so the traffic is affected by the volume of visitors to the stadium. Missing values are represented with NaN in the text file.

    Class 1: Normal Day
    Class 2: Game Day

    There is nothing to infer from the order of examples in the train and test sets. Data created by Ihler, Alexander, Jon Hutchins, and Padhraic Smyth (see [1][2][3]). Data edited by Chin-Chia Michael Yeh.

    [1] Ihler, Alexander, Jon Hutchins, and Padhraic Smyth. "Adaptive event detection with time-varying poisson processes." Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2006.

    [2] “UCI Machine Learning Repository: Dodgers Loop Sensor Data Set.” UCI Machine Learning Repository, archive.ics.uci.edu/ml/datasets/dodgers+loop+sensor.

    [3] “Caltrans PeMS.” Caltrans, pems.dot.ca.gov/.

    Donator: C. Yeh

  14. Multi-Label Datasets with Missing Values

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 19, 2023
    Cite
    Ádamo L. de Santana (2023). Multi-Label Datasets with Missing Values [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7748932
    Explore at:
    Dataset updated
    Mar 19, 2023
    Dataset provided by
    Ádamo L. de Santana
    Ewaldo Santana
    Fábio M. F. Lobato
    Fabrício A. do Carmo
    Antonio F. L. Jacob Jr.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Consisting of six multi-label datasets from the UCI Machine Learning repository.

    Each dataset contains missing values which have been artificially added at the following rates: 5, 10, 15, 20, 25, and 30%. The “amputation” was performed using the “Missing Completely at Random” mechanism.

    File names are represented as follows:

       amp_DB_MR.arff
    

    where:

       DB = original dataset;
    
    
       MR = missing rate.
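
    The kind of amputation described above can be illustrated with a small Python sketch. It assumes that each feature value is masked independently with probability equal to the missing rate, which is one common way to realize the "Missing Completely at Random" mechanism; the function name and the use of None as the missing marker are illustrative and are not taken from the tooling behind these files.

```python
import random

def ampute_mcar(rows, missing_rate, seed=0):
    """Replace each value with None independently with probability missing_rate."""
    rng = random.Random(seed)
    return [
        [None if rng.random() < missing_rate else value for value in row]
        for row in rows
    ]

# Toy data: 200 instances with 10 numeric features each.
rows = [[float(i + j) for j in range(10)] for i in range(200)]
amputed = ampute_mcar(rows, missing_rate=0.20)

n_missing = sum(value is None for row in amputed for value in row)
observed_rate = n_missing / (len(rows) * len(rows[0]))
print(f"observed missing rate: {observed_rate:.3f}")
```

    Because every value is masked with the same probability regardless of its content or position, the observed rate only approximates the target rate and varies slightly with the random seed.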
    

    For more details, please read:

    IEEE Access article (in review process)

  15. Replication Data for: Gas sensors for home activity monitoring Data Set

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Wang, Wenjuan (2023). Replication Data for: Gas sensors for home activity monitoring Data Set [Dataset]. http://doi.org/10.7910/DVN/HEWNOU
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Wang, Wenjuan
    Description

    The dataset was downloaded from the UCI repository (http://archive.ics.uci.edu/ml/datasets/gas+sensors+for+home+activity+monitoring). It contains only two classes, banana and wine; the background class is not included. The dataset is ordered sequentially by ID.

  16. Car MPG

    • kaggle.com
    zip
    Updated Mar 23, 2020
    Cite
    Gaurav Sharma (2020). Car MPG [Dataset]. https://www.kaggle.com/datasets/gauravsharma99/car-mpg
    Explore at:
    Available download formats: zip (6721 bytes)
    Dataset updated
    Mar 23, 2020
    Authors
    Gaurav Sharma
    Description

    Dataset

    This dataset was created by Gaurav Sharma


  17. heart

    • huggingface.co
    Updated Apr 6, 2023
    Cite
    Mattia (2023). heart [Dataset]. https://huggingface.co/datasets/mstz/heart
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 6, 2023
    Authors
    Mattia
    License

    https://choosealicense.com/licenses/cc/

    Description

    Heart

    The Heart dataset from the UCI ML repository. Does the patient have heart disease?

    Configurations and tasks

    Configuration    Task
    hungary          Binary classification

    Usage

    from datasets import load_dataset

    dataset = load_dataset("mstz/heart", "hungary")["train"]

  18. Summary of the publicly-available UCI Machine Learning Repository datasets...

    • plos.figshare.com
    bin
    Updated Aug 9, 2023
    Cite
    Esha Datta; Aditya Ballal; Javier E. López; Leighton T. Izu (2023). Summary of the publicly-available UCI Machine Learning Repository datasets used for method comparison. [Dataset]. http://doi.org/10.1371/journal.pdig.0000307.t002
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 9, 2023
    Dataset provided by
    PLOS Digital Health
    Authors
    Esha Datta; Aditya Ballal; Javier E. López; Leighton T. Izu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Summary of the publicly-available UCI Machine Learning Repository datasets used for method comparison.

  19. Replication Data for: Covtype

    • dataverse.harvard.edu
    Updated Apr 5, 2018
    Cite
    Wenjuan Wang (2018). Replication Data for: Covtype [Dataset]. http://doi.org/10.7910/DVN/NTIWVN
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 5, 2018
    Dataset provided by
    Harvard Dataverse
    Authors
    Wenjuan Wang
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The dataset was downloaded from the UCI repository (https://archive.ics.uci.edu/ml/datasets/covertype). The class label is the forest cover type, coded 1 to 7. The task is to predict the forest cover type from cartographic variables only (no remotely sensed data).

  20. Household Reactive Power Consumption Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 24, 2021
    + more versions
    Cite
    Geoffrey I Webb (2021). Household Reactive Power Consumption Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3902705
    Explore at:
    Dataset updated
    Mar 24, 2021
    Dataset provided by
    Chang Wei Tan
    Francois Petitjean
    Christoph Bergmeir
    Geoffrey I Webb
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset is part of the Monash, UEA & UCR time series regression repository. http://tseregression.org/

    The goal of this dataset is to predict total reactive power consumption in a household. This dataset contains 1440 time series obtained from the Individual household electric power consumption dataset from the UCI repository. The time series have 5 dimensions, which include measurements of voltage, current, and 3 sub-metering energy usages.

    Please refer to https://archive.ics.uci.edu/ml/datasets/Individual+household+electric+power+consumption for more details

    Source: Georges Hebrail (georges.hebrail '@' edf.fr), Senior Researcher, EDF R&D, Clamart, France; Alice Berard, TELECOM ParisTech Master of Engineering Internship at EDF R&D, Clamart, France
