100+ datasets found
  1. Comparison of 14 classifiers

    • figshare.com
    application/gzip
    Updated Jun 11, 2023
    Jacques Wainer (2023). Comparison of 14 classifiers [Dataset]. http://doi.org/10.6084/m9.figshare.3407932.v2
    Available download formats: application/gzip
    Dataset updated: Jun 11, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Jacques Wainer
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data, programs, results, and analysis software for the paper "Comparison of 14 different families of classification algorithms on 115 binary data sets" https://arxiv.org/abs/1606.00930

  2. Data from: Comparison of data mining models applied to a surface...

    • scielo.figshare.com
    jpeg
    Updated May 30, 2023
    Anderson Cordeiro Charles; Anderson Amendoeira Namen; Pedro Paulo Gomes Watts Rodrigues (2023). Comparison of data mining models applied to a surface meteorological station [Dataset]. http://doi.org/10.6084/m9.figshare.5667640.v1
    Available download formats: jpeg
    Dataset updated: May 30, 2023
    Dataset provided by: SciELO journals
    Authors: Anderson Cordeiro Charles; Anderson Amendoeira Namen; Pedro Paulo Gomes Watts Rodrigues
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT This paper presents the application of data mining techniques to identify patterns in meteorological variables and their correlation with the occurrence of intense rainfall. The data used were collected between 2008 and 2012 by the surface meteorological station of the Polytechnic Institute of Rio de Janeiro State University, located in Nova Friburgo - RJ, Brazil. The main objective is the automatic prediction of extreme precipitation events around the meteorological station one hour prior to their occurrence. Classification models were developed based on decision trees and artificial neural networks. The steps of consistency analysis, treatment and data conversion, as well as the computational models used, are described, and some metrics are compared in order to identify their effectiveness. The most accurate model achieved a hit rate of 82.9% in predicting rainfall equal to or greater than 10 mm h-1 one hour prior to its occurrence. The results indicate the possibility of using this work to predict risk events in the study region.

  3. Cryptocurrency Mining Platform Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 24, 2025
    Data Insights Market (2025). Cryptocurrency Mining Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/cryptocurrency-mining-platform-1386453
    Available download formats: pdf, ppt, doc
    Dataset updated: Jun 24, 2025
    Dataset authored and provided by: Data Insights Market
    License: https://www.datainsightsmarket.com/privacy-policy
    Time period covered: 2025 - 2033
    Area covered: Global
    Variables measured: Market Size
    Description

    The cryptocurrency mining platform market is experiencing robust growth, driven by the increasing adoption of cryptocurrencies and the ongoing evolution of mining technologies. The market, valued at approximately $2.5 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching an estimated market value exceeding $8 billion by 2033. This expansion is fueled by several key factors, including the increasing sophistication of mining hardware, the rise of cloud-based mining solutions offering accessibility to individual investors, and the ongoing development of more energy-efficient mining algorithms. The market is segmented by platform type (cloud-based, software-based, hardware-based), target users (individual miners, mining pools), and geographic region, with North America and Europe currently dominating market share. However, the market is not without its challenges. Regulatory uncertainties surrounding cryptocurrency mining in various jurisdictions pose a significant restraint on growth. Fluctuations in cryptocurrency prices also impact profitability, making it a volatile market for both miners and platform providers. Furthermore, the increasing energy consumption associated with cryptocurrency mining and the growing concerns about environmental sustainability are pushing for the adoption of more eco-friendly mining practices and technologies, thereby influencing platform development and adoption. The competitive landscape is intense, with a range of established players like NiceHash and newer entrants like Salad competing for market share. The success of these platforms hinges on factors such as ease of use, security features, profitability, and the ongoing support of the cryptocurrency ecosystem. The market will continue to evolve, influenced by technological advancements, regulatory developments, and the overall health of the cryptocurrency market.
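The growth figures in this description follow the usual compound-growth relation, value = start × (1 + rate)^years. A minimal Python sketch of that arithmetic is given here. The function names and the choice of 8 compounding periods (2025 to 2033) are assumptions for illustration, not something the report states. Under those assumptions the projection comes out at roughly $7.6 billion, a little below the quoted $8 billion, so the report presumably uses a different base year, period count, or rounding.

```python
def project_value(start: float, rate: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return start * (1.0 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` in `years` periods."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the description: about $2.5 billion in 2025, 15% CAGR.
# The period count (8) is an assumption, not taken from the report.
projected_2033 = project_value(2.5, 0.15, 8)
print(f"Projected 2033 market size: ${projected_2033:.2f} billion")
```

The `implied_cagr` helper runs the same relation in reverse, which is a quick way to check whether a quoted start value, end value, and growth rate are mutually consistent.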

  4. GDP FROM MINING.PHP by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jan 30, 2021
    TRADING ECONOMICS (2021). GDP FROM MINING.PHP by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/gdp-from-mining.php
    Available download formats: xml, excel, json, csv
    Dataset updated: Jan 30, 2021
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Time period covered: 2025
    Area covered: World
    Description

    This dataset provides values for GDP FROM MINING.PHP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  5. Comparative Reviews Dataset's

    • kaggle.com
    zip
    Updated Jan 22, 2019
    Umair Younis (2019). Comparative Reviews Dataset's [Dataset]. https://www.kaggle.com/umairyounis/comparative-reviews-datasets
    Available download formats: zip (205233 bytes)
    Dataset updated: Jan 22, 2019
    Authors: Umair Younis
    Description

    Context

    To get improved results on Machine Learning Algorithms, and other techniques used in Data Mining.

    Content

    Comprises two columns: the first contains comparative reviews and the second contains their polarities.

    Acknowledgements

    I thank my supervisor, Dr Muhammad Zubair Asghar, Assistant Professor, ICIT, Gomal University, D.I. Khan (KPK). Without his guidance, I could not have accomplished this task.

    Inspiration

    Comparative opinion mining is becoming the most popular research area in the field of Data Mining. These three comparative reviews datasets will help the researchers who are working in the area of opinion mining and sentiment analysis.

  6. MINING PRODUCTION by Country in EUROPE

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jan 12, 2024
    TRADING ECONOMICS (2024). MINING PRODUCTION by Country in EUROPE [Dataset]. https://tradingeconomics.com/country-list/mining-production/1000?continent=europe
    Available download formats: csv, excel, json, xml
    Dataset updated: Jan 12, 2024
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Time period covered: 2025
    Area covered: Europe
    Description

    This dataset provides values for MINING PRODUCTION reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  7. GDP FROM MINING.PHP by Country in AMERICA

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Nov 1, 2021
    TRADING ECONOMICS (2021). GDP FROM MINING.PHP by Country in AMERICA [Dataset]. https://tradingeconomics.com/country-list/gdp-from-mining.php?continent=america
    Available download formats: json, xml, excel, csv
    Dataset updated: Nov 1, 2021
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Time period covered: 2025
    Area covered: United States
    Description

    This dataset provides values for GDP FROM MINING.PHP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  8. Data from: Discovering System Health Anomalies using Data Mining Techniques

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 10, 2025
    Dashlink (2025). Discovering System Health Anomalies using Data Mining Techniques [Dataset]. https://catalog.data.gov/dataset/discovering-system-health-anomalies-using-data-mining-techniques
    Dataset updated: Apr 10, 2025
    Dataset provided by: Dashlink
    Description

    We discuss a statistical framework that underlies envelope detection schemes as well as dynamical models based on Hidden Markov Models (HMM) that can encompass both discrete and continuous sensor measurements for use in Integrated System Health Management (ISHM) applications. The HMM allows for the rapid assimilation, analysis, and discovery of system anomalies. We motivate our work with a discussion of an aviation problem where the identification of anomalous sequences is essential for safety reasons. The data in this application are discrete and continuous sensor measurements and can be dealt with seamlessly using the methods described here to discover anomalous flights. We specifically treat the problem of discovering anomalous features in the time series that may be hidden from the sensor suite and compare those methods to standard envelope detection methods on test data designed to accentuate the differences between the two methods. Identification of these hidden anomalies is crucial to building stable, reusable, and cost-efficient systems. We also discuss a data mining framework for the analysis and discovery of anomalies in high-dimensional time series of sensor measurements that would be found in an ISHM system. We conclude with recommendations that describe the tradeoffs in building an integrated scalable platform for robust anomaly detection in ISHM applications.

  9. Lex-Atlas:Covid-19 Parliaments Dataset

    • zenodo.org
    • explore.openaire.eu
    Updated Mar 17, 2022
    Jeff King; Octávio Ferraz; Andrew Jones; Roxane Agon (2022). Lex-Atlas:Covid-19 Parliaments Dataset [Dataset]. http://doi.org/10.5281/zenodo.6363125
    Dataset updated: Mar 17, 2022
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Jeff King; Octávio Ferraz; Andrew Jones; Roxane Agon
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data on the impact on national parliaments resulting from the Covid-19 pandemic mined from country reports published by the Lex-Atlas: Covid-19 project and the Oxford University Press. For more information see https://lexatlas-c19.org

  10. Foursquare Venue and Venue Comments Data

    • data.mendeley.com
    Updated Feb 18, 2018
    Asim Sinan Yuksel (2018). Foursquare Venue and Venue Comments Data [Dataset]. http://doi.org/10.17632/29tbvvwkdp.2
    Dataset updated: Feb 18, 2018
    Authors: Asim Sinan Yuksel
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1. Turkish comments for 128 venues in Foursquare Social Network Platform (binary and ternary classified)
    2. Turkish adjectives and polarities
    3. Turkish food and drink names
    4. All comments without tagging
    5. Venues, liked meals/foods

  11. GDP FROM MINING.PHP by Country in ASIA

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jun 10, 2022
    TRADING ECONOMICS (2022). GDP FROM MINING.PHP by Country in ASIA [Dataset]. https://tradingeconomics.com/country-list/gdp-from-mining.php?continent=asia
    Available download formats: json, csv, xml, excel
    Dataset updated: Jun 10, 2022
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Time period covered: 2025
    Area covered: Asia
    Description

    This dataset provides values for GDP FROM MINING.PHP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  12. Mine Safety and Health At A Glance Calendar year

    • gimi9.com
    Updated Dec 11, 2024
    (2024). Mine Safety and Health At A Glance Calendar year [Dataset]. https://gimi9.com/dataset/data-gov_mine-safety-and-health-at-a-glance-calendar-year/
    Dataset updated: Dec 11, 2024
    Description

    miners mines mining-contractors mining-data mining-fatalities mining-injuries mining-injury-rates mining-operators mining-safety-statistics mining-statistics mining-trends mining-yearly-comparisons msha msha-at-the-glance

  13. EXPORT NATURAL HYDROCARBONS PRDS OF MINING ELE by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 8, 2022
    TRADING ECONOMICS (2022). EXPORT NATURAL HYDROCARBONS PRDS OF MINING ELE by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/export-natural-hydrocarbons-prds-of-mining-ele
    Available download formats: json, xml, excel, csv
    Dataset updated: Jul 8, 2022
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Time period covered: 2025
    Area covered: World
    Description

    This dataset provides values for EXPORT NATURAL HYDROCARBONS PRDS OF MINING ELE reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  14. MOESM4 of Deep-learning: investigating deep neural networks hyper-parameters...

    • springernature.figshare.com
    xlsx
    Updated Jun 2, 2023
    Alexios Koutsoukas; Keith Monaghan; Xiaoli Li; Jun Huan (2023). MOESM4 of Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data [Dataset]. http://doi.org/10.6084/m9.figshare.c.3814018_D4.v1
    Available download formats: xlsx
    Dataset updated: Jun 2, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Alexios Koutsoukas; Keith Monaghan; Xiaoli Li; Jun Huan
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 4. Results of parameter selection.

  15. GDP FROM MINING by Country in AUSTRALIA.PHP

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Sep 22, 2020
    TRADING ECONOMICS (2020). GDP FROM MINING by Country in AUSTRALIA.PHP [Dataset]. https://tradingeconomics.com/country-list/gdp-from-mining?continent=australia.php
    Available download formats: excel, xml, csv, json
    Dataset updated: Sep 22, 2020
    Dataset authored and provided by: TRADING ECONOMICS
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Time period covered: 2025
    Area covered: Australia
    Description

    This dataset provides values for GDP FROM MINING reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  16. Metallic and Non Metallic Mining Development Potential Index | gimi9.com

    • gimi9.com
    Updated Mar 23, 2025
    (2025). Metallic and Non Metallic Mining Development Potential Index | gimi9.com [Dataset]. https://gimi9.com/dataset/mekong_metallic-mining-development-potential-index/
    Dataset updated: Mar 23, 2025
    License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Metallic Mining Development Potential Index: This global dataset continuously ranks from >0 (minimal) to 1 (highest) all suitable lands at a 1-km resolution for metallic mining (e.g. gold, silver, copper). These data are derived from the weighted summary of four criteria maps: 1) proxy yield values based on deposit size and numbers, 2) distance to demand centers, 3) distance to major roads, and 4) distance to railways or ports. Included with these data are: a) the classified version of the continuous DPI, b) the corresponding classified uncertainty dataset, c) detailed sensitivity tables for all criteria used in the analysis, and d) full description of the constraints and criteria used in the analysis with the Analytic Hierarchy Process (AHP) pairwise comparison matrix and resulting criteria weights derived from AHP.

    Non Metallic Mining Development Potential Index: This global dataset continuously ranks from >0 (minimal) to 1 (highest) all suitable lands at a 1-km resolution for non-metallic mining (e.g. sand and gravel mining). These data are derived from the weighted summary of four criteria maps: 1) proxy yield values based on deposit size and numbers, 2) distance to demand centers, 3) distance to major roads, and 4) distance to railways or ports. Included with these data are: a) the classified version of the continuous DPI, b) the corresponding classified uncertainty dataset, c) detailed sensitivity tables for all criteria used in the analysis, and d) full description of the constraints and criteria used in the analysis with the Analytic Hierarchy Process (AHP) pairwise comparison matrix and resulting criteria weights derived from AHP.
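The "weighted summary of four criteria maps" in this description can be read as a per-cell weighted sum. The sketch below is a minimal Python illustration under stated assumptions: the weights are made-up placeholders (the real ones come from the dataset's AHP pairwise comparison matrix), each criterion is assumed to be already normalised to the 0 to 1 range, and the names `WEIGHTS`, `development_potential`, and `rank_cells` are hypothetical.

```python
from typing import Sequence

# Hypothetical criteria weights; the dataset derives its own via AHP.
# They are assumed to sum to 1 so the index stays within 0..1.
WEIGHTS = {
    "yield": 0.4,
    "demand_distance": 0.3,
    "road_distance": 0.2,
    "rail_port_distance": 0.1,
}


def development_potential(cell: dict[str, float]) -> float:
    """Weighted sum of normalised criterion scores for one grid cell."""
    return sum(WEIGHTS[name] * cell[name] for name in WEIGHTS)


def rank_cells(cells: Sequence[dict[str, float]]) -> list[int]:
    """Indices of cells ordered from highest to lowest index value."""
    scores = [development_potential(c) for c in cells]
    return sorted(range(len(cells)), key=lambda i: scores[i], reverse=True)


# Two invented grid cells with already-normalised criterion scores.
cells = [
    {"yield": 0.9, "demand_distance": 0.5, "road_distance": 0.5, "rail_port_distance": 0.5},
    {"yield": 0.2, "demand_distance": 1.0, "road_distance": 1.0, "rail_port_distance": 1.0},
]
print([round(development_potential(c), 2) for c in cells])
print(rank_cells(cells))
```

With these placeholder weights, a cell with a modest yield but good access can outrank a high-yield cell that is poorly connected, which is the kind of trade-off the sensitivity tables in the dataset are meant to expose.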

  17. Synthetic oversampling for credit card default prediction

    • data.mendeley.com
    Updated Mar 8, 2023
    Fransiscus Pratikto (2023). Synthetic oversampling for credit card default prediction [Dataset]. http://doi.org/10.17632/jrss9jdjz9.1
    Dataset updated: Mar 8, 2023
    Authors: Fransiscus Pratikto
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains more than 17,000 records of credit card holders, with 20 predictor variables and 1 binary target variable. The corresponding R code for comparing several proposed (density-based) and existing (SMOTE-based) synthetic oversampling methods is also provided.

  18. Key Characteristics of Algorithms' Dynamics Beyond Accuracy - Evaluation...

    • b2find.eudat.eu
    Updated Apr 3, 2024
    (2024). Key Characteristics of Algorithms' Dynamics Beyond Accuracy - Evaluation Tests - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/855f50a8-3e16-5932-829a-65122d51d553
    Dataset updated: Apr 3, 2024
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Key Characteristics of Algorithms' Dynamics Beyond Accuracy - Evaluation Tests conducted for the paper: What do anomaly scores actually mean? Key characteristics of algorithms' dynamics beyond accuracy, by F. Iglesias, H. O. Marques, A. Zimek, T. Zseby

    Context and methodology

    Anomaly detection is intrinsic to a large number of data analysis applications today. Most of the algorithms used assign an outlierness score to each instance prior to establishing anomalies in a binary form. The experiments in this repository study how different algorithms generate different dynamics in the outlierness scores and react in very different ways to possible model perturbations that affect data. The study elaborated in the referred paper presents new indices and coefficients to assess the dynamics and explores the responses of the algorithms as a function of variations in these indices, revealing key aspects of the interdependence between algorithms, data geometries and the ability to discriminate anomalies. Therefore, this repository reproduces the conducted experiments, which study eight algorithms (ABOD, HBOS, iForest, K-NN, LOF, OCSVM, SDO and GLOSH), submitted to seven perturbations related to: cardinality, dimensionality, outlier proportion, inlier-outlier density ratio, density layers, clusters and local outliers, and collects behavioural profiles with eleven measurements (Adjusted Average Precision, ROC-AUC, Perini's Confidence [1], Perini's Stability [2], S-curves, Discriminant Power, Robust Coefficients of Variations for Inliers and Outliers, Coherence, Bias and Robustness) under two types of normalization: linear and Gaussian, the latter aiming to standardize the outlierness scores issued by different algorithms [3]. This repository is framed within the research on the following domains: algorithm evaluation, outlier detection, anomaly detection, unsupervised learning, machine learning, data mining, data analysis.

    Datasets and algorithms can be used for experiment replication and for further evaluation and comparison.

    References

    [1] Perini, L., Vercruyssen, V., Davis, J.: Quantifying the confidence of anomaly detectors in their example-wise predictions. In: The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Springer Verlag (2020).
    [2] Perini, L., Galvin, C., Vercruyssen, V.: A Ranking Stability Measure for Quantifying the Robustness of Anomaly Detection Methods. In: 2nd Workshop on Evaluation and Experimental Design in Data Mining and Machine Learning @ ECML/PKDD (2020).
    [3] Kriegel, H.-P., Kröger, P., Schubert, E., Zimek, A.: Interpreting and unifying outlier scores. In: Proceedings of the 2011 SIAM International Conference on Data Mining (SDM), pp. 13–24 (2011).

    Technical details

    Experiments are in Python 3. Provided scripts generate all data and results. We keep them in the repo for the sake of comparability and replicability. The file and folder structure is as follows:

  19. Data from: CONCEPT- DM2 DATA MODEL TO ANALYSE HEALTHCARE PATHWAYS OF TYPE 2...

    • zenodo.org
    bin, png, zip
    Updated Jul 12, 2024
    Berta Ibáñez-Beroiz; Asier Ballesteros-Domínguez; Ignacio Oscoz-Villanueva; Ibai Tamayo; Julián Librero; Mónica Enguita-Germán; Francisco Estupiñán-Romero; Enrique Bernal-Delgado (2024). CONCEPT- DM2 DATA MODEL TO ANALYSE HEALTHCARE PATHWAYS OF TYPE 2 DIABETES [Dataset]. http://doi.org/10.5281/zenodo.7778291
    Available download formats: bin, png, zip
    Dataset updated: Jul 12, 2024
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Berta Ibáñez-Beroiz; Asier Ballesteros-Domínguez; Ignacio Oscoz-Villanueva; Ibai Tamayo; Julián Librero; Mónica Enguita-Germán; Francisco Estupiñán-Romero; Enrique Bernal-Delgado
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Technical notes and documentation on the common data model of the project CONCEPT-DM2.

    This publication corresponds to the Common Data Model (CDM) specification of the CONCEPT-DM2 project for the implementation of a federated network analysis of the healthcare pathway of type 2 diabetes.

    Aims of the CONCEPT-DM2 project:

    General aim: To analyse chronic care effectiveness and efficiency of care pathways in diabetes, assuming the relevance of care pathways as independent factors of health outcomes, using real-world data (RWD) from five Spanish Regional Health Systems.

    Main specific aims:

    • To characterize the care pathways in patients with diabetes through the whole care system in terms of process indicators and pharmacologic recommendations
    • To compare these observed care pathways with the theoretical clinical pathways derived from the clinical practice guidelines
    • To assess whether adherence to clinical guidelines influences important health outcomes, such as cardiovascular hospitalizations.
    • To compare the traditional analytical methods with process mining methods in terms of modeling quality, prediction performance and information provided.

    Study Design: It is a population-based retrospective observational study centered on all T2D patients diagnosed in five Regional Health Services within the Spanish National Health Service. We will include all the contacts of these patients with the health services using the electronic medical record systems including Primary Care data, Specialized Care data, Hospitalizations, Urgent Care data, Pharmacy Claims, and also other registers such as the mortality and the population register.

    Cohort definition: All patients with code of Type 2 Diabetes in the clinical health records

    • Inclusion criteria: patients who, at 01/01/2017 or during the follow-up from 01/01/2017 to 31/12/2022, had an active health card (active TIS - tarjeta sanitaria activa) and a code for type 2 diabetes (T2D, DM2 in Spanish) in the clinical records of primary care (CIAP2 T90 in case of using the CIAP code system)
    • Exclusion criteria:
      • patients with no contact with the health system from 01/01/2017 to 31/12/2022
      • patients that had a T1D (DM1) code opened after the T2D code during the follow-up.
    • Study period. From 01/01/2017 to 31/12/2022

    Files included in this publication:

    • Datamodel_CONCEPT_DM2_diagram.png
    • Common data model specification (Datamodel_CONCEPT_DM2_v.0.1.0.xlsx)
    • Synthetic datasets (Datamodel_CONCEPT_DM2_sample_data)
      • sample_data1_dm_patient.csv
      • sample_data2_dm_param.csv
      • sample_data3_dm_patient.csv
      • sample_data4_dm_param.csv
      • sample_data5_dm_patient.csv
      • sample_data6_dm_param.csv
      • sample_data7_dm_param.csv
      • sample_data8_dm_param.csv
    • Datamodel_CONCEPT_DM2_explanation.pptx

  20. Data for: Identification of hindered internal rotational mode for complex...

    • data.mendeley.com
    • narcis.nl
    Updated Nov 8, 2017
    Triet Le (2017). Data for: Identification of hindered internal rotational mode for complex chemical species: A data mining approach with multivariate logistic regression model [Dataset]. http://doi.org/10.17632/d37mzs3b3m.2
    Dataset updated: Nov 8, 2017
    Authors: Triet Le
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The "Dataset_HIR" folder contains the data to reproduce the results of the data mining approach proposed in the manuscript titled "Identification of hindered internal rotational mode for complex chemical species: A data mining approach with multivariate logistic regression model".

    More specifically, the folder contains the raw electronic structure calculation input data provided by the domain experts as well as the training and testing dataset with the extracted features.

    The "Dataset_HIR" folder contains the following subfolders namely:

    1. Electronic structure calculation input data: contains the electronic structure calculation input generated by the Gaussian program

      1.1. Training data: contains the raw data of all training species (each is stored in a separate folder) used for extracting the dataset for the training and validation phase.

      1.2. Testing data: contains the raw data of all testing species (each is stored in a separate folder) used for extracting data for the testing phase.

    2. Dataset

      2.1. Training dataset: used to produce the results in Tables 3 and 4 in the manuscript

      + datasetTrain_raw.csv: contains the features for all vibrational modes associated with corresponding labeled species to let the chemists select the Hindered Internal Rotor from the list easily for the training and validation steps.  
      
      + datasetTrain.csv: refines the datasetTrain_raw.csv where the names of the species are all removed to transform the dataset into an appropriate form for the modeling and validation steps.
      

      2.2. Testing dataset: used to produce the results of the data mining approach in Table 5 in the manuscript.

      + datasetTest_raw.csv: contains the features for all vibrational modes of each labeled species to let the chemists select the Hindered Internal Rotor from the list for the testing step.
      
      + datasetTest.csv: refines the datasetTest_raw.csv where the names of the species are all removed to transform the dataset into an appropriate form for the testing step.
      

    Note for the Result feature in the dataset: 1 is for the mode needed to be treated as Hindered Internal Rotor, and 0 otherwise.
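As a rough illustration of how the binary `Result` column might be consumed, the Python sketch below counts the two classes in a CSV. Only the `Result` column name and its 1/0 meaning come from the description; the other column names and the inline sample rows are invented placeholders standing in for `datasetTrain.csv`, and the function name `count_labels` is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented sample standing in for datasetTrain.csv; only the "Result"
# column (1 = hindered internal rotor, 0 = otherwise) is from the source.
SAMPLE_CSV = """feature_a,feature_b,Result
0.12,3.4,1
0.50,1.1,0
0.33,2.2,0
"""


def count_labels(handle) -> Counter:
    """Count rows per value of the Result column."""
    reader = csv.DictReader(handle)
    return Counter(int(row["Result"]) for row in reader)


counts = count_labels(io.StringIO(SAMPLE_CSV))
print(f"hindered rotors: {counts[1]}, other modes: {counts[0]}")
```

To run it on the real file, one would pass `open("datasetTrain.csv", newline="")` instead of the in-memory sample, assuming the file has a header row that includes `Result`.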

Comparison of 14 classifiers: 3 scholarly articles cite this dataset.