100+ datasets found
  1. ODM Data Analysis—A tool for the automatic validation, monitoring and...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    mp4
    Updated May 31, 2023
    Cite
    Tobias Johannes Brix; Philipp Bruland; Saad Sarfraz; Jan Ernsting; Philipp Neuhaus; Michael Storck; Justin Doods; Sonja Ständer; Martin Dugas (2023). ODM Data Analysis—A tool for the automatic validation, monitoring and generation of generic descriptive statistics of patient data [Dataset]. http://doi.org/10.1371/journal.pone.0199242
    Explore at:
    Available download formats: mp4
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tobias Johannes Brix; Philipp Bruland; Saad Sarfraz; Jan Ernsting; Philipp Neuhaus; Michael Storck; Justin Doods; Sonja Ständer; Martin Dugas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: A required step in presenting the results of clinical studies is the reporting of participants' demographic and baseline characteristics, as required by FDAAA 801. The common workflow is to export the clinical data from the electronic data capture system in use and import it into statistical software such as SAS or IBM SPSS. This software requires trained users, who have to implement the analysis individually for each item, and the effort involved can become an obstacle for small studies. The objective of this work is to design, implement, and evaluate an open-source application, called ODM Data Analysis, for the semi-automatic analysis of clinical study data.

    Methods: The system requires clinical data in the CDISC Operational Data Model (ODM) format. After the file is uploaded, its syntax and the data-type conformity of the collected data are validated. The completeness of the study data is determined and basic statistics, including illustrative charts for each item, are generated. Datasets from four clinical studies were used to evaluate the application's performance and functionality.

    Results: The system is implemented as an open-source web application (available at https://odmanalysis.uni-muenster.de) and is also provided as a Docker image, which enables easy distribution and installation on local systems. Study data is stored in the application only while the calculations are performed, which is compliant with data protection requirements. Analysis times are below half an hour, even for larger studies with over 6,000 subjects.

    Discussion: Medical experts have confirmed the usefulness of this application for gaining an overview of their collected study data for monitoring purposes and for generating descriptive statistics without further user interaction. The semi-automatic analysis has its limitations and cannot replace the complex analysis of statisticians, but it can be used as a starting point for their examination and reporting.
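    For orientation, the kind of per-item check the tool automates can be sketched in a few lines of Python. This is not the application's own code, and the item names below are hypothetical:

        import pandas as pd

        # Hypothetical flattened ODM item data: one row per subject, one column per item.
        items = pd.DataFrame({
            "age": [34, 51, None, 42, 29],
            "sex": ["F", "M", "M", None, "F"],
        })

        for name in items.columns:
            complete = items[name].notna().mean() * 100
            print(f"{name}: {complete:.0f}% complete")
            if pd.api.types.is_numeric_dtype(items[name]):
                # Basic descriptive statistics for numeric items.
                print(items[name].describe()[["mean", "std", "min", "max"]])
            else:
                # Frequency counts for categorical items.
                print(items[name].value_counts())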

  2. Tularosa Basin Play Fairway Analysis Data and Models

    • gdr.openei.org
    • data.openei.org
    • +3more
    archive
    Updated Jul 11, 2017
    + more versions
    Cite
    Greg Nash (2017). Tularosa Basin Play Fairway Analysis Data and Models [Dataset]. http://doi.org/10.15121/1369076
    Explore at:
    Available download formats: archive
    Dataset updated
    Jul 11, 2017
    Dataset provided by
    USDOE Office of Energy Efficiency and Renewable Energy (EERE), Renewable Power Office. Geothermal Technologies Program (EE-4G)
    Energy and Geoscience Institute at the University of Utah
    Geothermal Data Repository
    Authors
    Greg Nash
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tularosa Basin
    Description

    This submission includes raster datasets for each layer of evidence used in the weights of evidence analysis as well as in the deterministic play fairway analysis (PFA). Some of the raster datasets represent heat, permeability, and groundwater data. Additionally, the final deterministic PFA model is provided along with a certainty model. All of these datasets are best used with an ArcGIS software package, specifically Spatial Data Modeler.

  3. Analysis of Banking DATA model

    • kaggle.com
    zip
    Updated Apr 22, 2020
    Cite
    Sanskruti Panda (2020). Analysis of Banking DATA model [Dataset]. https://www.kaggle.com/datasets/sanskrutipanda/analysis-of-banking-data-model
    Explore at:
    Available download formats: zip (267804 bytes)
    Dataset updated
    Apr 22, 2020
    Authors
    Sanskruti Panda
    Description

    Dataset

    This dataset was created by Sanskruti Panda

    Released under Other (specified in description)

    Contents

  4. Storm surge model projections, statistical analysis, and summary data set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Sep 1, 2024
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). Storm surge model projections, statistical analysis, and summary data set [Dataset]. https://catalog.data.gov/dataset/storm-surge-model-projections-statistical-analysis-and-summary-data-set
    Explore at:
    Dataset updated
    Sep 1, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    All data associated with this data entry are simulations of storm surge at three case study locations. The simulated water height, wind, and other physical parameters are used in the analysis to construct all of the figures presented. This dataset is associated with the following publication: Liang, M., Z. Dong, S. Julius, J. Neal, and J. Yang. Storm Surge Projection for Objective-based Risk Management for Climate Change Adaptation along the US Atlantic Coast. Journal of Water Resources Planning and Management. American Society of Civil Engineers (ASCE), Reston, VA, USA, 150(6): e04024014-1, (2024).

  5. Dataset: Worst Performers, Best Predictors

    • figshare.com
    xlsx
    Updated Apr 22, 2016
    Cite
    Bradly Alicea (2016). Dataset: Worst Performers, Best Predictors [Dataset]. http://doi.org/10.6084/m9.figshare.944542.v3
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Apr 22, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Bradly Alicea
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset accompanying the Synthetic Daisies post "Are the Worst Performers the Best Predictors?" and the technical paper (on viXra) "From Worst to Most Variable? Only the worst performers may be the most informative".

  6. Data from: CLPX-Model: Local Analysis and Prediction System: 4-D Atmospheric...

    • data.nasa.gov
    • nsidc.org
    • +5more
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). CLPX-Model: Local Analysis and Prediction System: 4-D Atmospheric Analyses, Version 1 [Dataset]. https://data.nasa.gov/dataset/clpx-model-local-analysis-and-prediction-system-4-d-atmospheric-analyses-version-1-da048
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Local Analysis and Prediction System (LAPS), run by NOAA's Forecast Systems Laboratory (FSL), combines numerous observed meteorological data sets into a collection of atmospheric analyses.

  7. Rapid Update Cycle (RUC) model: hybrid analysis data, 20-km resolution

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Nov 12, 2020
    + more versions
    Cite
    Atmospheric Radiation Measurement Data Center (2020). Rapid Update Cycle (RUC) model: hybrid analysis data, 20-km resolution [Dataset]. https://catalog.data.gov/dataset/rapid-update-cycle-ruc-model-hybrid-analysis-data-20-km-resolution
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    Atmospheric Radiation Measurement Data Center
    Description

    No description found

  8. Replication data for: A Statistical Model for Party Systems Analysis

    • dataverse.harvard.edu
    Updated Oct 2, 2014
    Cite
    Arturas Rozenas (2014). Replication data for: A Statistical Model for Party Systems Analysis [Dataset]. http://doi.org/10.7910/DVN/HQ3I8K
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2014
    Dataset provided by
    Harvard Dataverse
    Authors
    Arturas Rozenas
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Empirical researchers studying party systems often struggle with the question of how to count parties. Indexes of party system fragmentation used to address this problem (e.g., the effective number of parties) have a fundamental shortcoming: since the same index value may represent very different party systems, they are impossible to interpret and may lead to erroneous inference. We offer a novel approach to this problem: instead of focusing on index measures, we develop a model that predicts the entire distribution of party vote-shares and, thus, does not require any index measure. First, a model of party counts predicts the number of parties. Second, a set of multivariate t models predicts party vote-shares. Compared to the standard index-based approach, our approach helps to avoid inferential errors and, in addition, yields a much richer set of insights into the variation of party systems. For illustration, we apply the model to two datasets. Our analyses call into question the conclusions one would arrive at using the index-based approach. Publicly available software is provided to implement the proposed model.
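    For reference, the index-based approach the authors critique typically relies on the effective number of parties, computed from vote shares as N = 1 / Σ p_i². A minimal sketch (illustrative only, not the authors' replication code):

        def effective_number_of_parties(vote_shares):
            # Laakso-Taagepera index: 1 divided by the sum of squared vote shares.
            return 1.0 / sum(p ** 2 for p in vote_shares)

        # Two rather different party systems can produce similar index values,
        # which is the interpretability problem the authors point to.
        print(effective_number_of_parties([0.5, 0.5]))            # 2.0
        print(effective_number_of_parties([0.7, 0.1, 0.1, 0.1]))  # ~1.92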

  9. Background data for: Latent-variable modeling of ordinal outcomes in...

    • dataverse.azure.uit.no
    • dataone.org
    • +1more
    pdf, text/tsv, txt
    Updated Jul 17, 2025
    Cite
    Manfred Krug; Fabian Vetter; Lukas Sönning (2025). Background data for: Latent-variable modeling of ordinal outcomes in language data analysis [Dataset]. http://doi.org/10.18710/WI9TEH
    Explore at:
    Available download formats: txt(8660), pdf(287207), pdf(160867), text/tsv(1079156), text/tsv(4475)
    Dataset updated
    Jul 17, 2025
    Dataset provided by
    DataverseNO
    Authors
    Manfred Krug; Fabian Vetter; Lukas Sönning
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2008 - Dec 31, 2018
    Area covered
    Malta
    Dataset funded by
    German Humboldt Foundation
    Spanish Ministry of Education and Science with European Regional Development Fund
    Bavarian Ministry for Science, Research and the Arts
    Description

    This dataset contains tabular files with information about the usage preferences of speakers of Maltese English with regard to 63 pairs of lexical expressions. These pairs (e.g. truck-lorry or realization-realisation) are known to differ in usage between BrE and AmE (cf. Algeo 2006). The data were elicited with a questionnaire that asks informants to indicate whether they always use one of the two variants, prefer one over the other, have no preference, or do not use either expression (see Krug and Sell 2013 for methodological details). Usage preferences were therefore measured on a symmetric 5-point ordinal scale. Data were collected between 2008 and 2018, as part of a larger research project on lexical and grammatical variation in settings where English is spoken as a native, second, or foreign language. The current dataset, which we use for our methodological study on ordinal data modeling strategies, consists of a subset of 500 speakers that is roughly balanced on year of birth.

    Abstract of the related publication: In empirical work, ordinal variables are typically analyzed using means based on numeric scores assigned to categories. While this strategy has met with justified criticism in the methodological literature, it also generates simple and informative data summaries, a standard often not met by statistically more adequate procedures. Motivated by a survey of how ordered variables are dealt with in language research, we draw attention to an un(der)used latent-variable approach to ordinal data modeling, which constitutes an alternative perspective on the most widely used form of ordered regression, the cumulative model. Since the latent-variable approach does not feature in any of the studies in our survey, we believe it is worthwhile to promote its benefits. To this end, we draw on questionnaire-based preference ratings by speakers of Maltese English, who indicated on a 5-point scale which of two synonymous expressions (e.g. package-parcel) they (tend to) use. We demonstrate that a latent-variable formulation of the cumulative model affords nuanced and interpretable data summaries that can be visualized effectively, while at the same time avoiding limitations inherent in mean response models (e.g. distortions induced by floor and ceiling effects). The online supplementary materials include a tutorial for its implementation in R.
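    The paper's tutorial implements the latent-variable cumulative model in R. Purely as an illustration of the model class, a cumulative probit can also be fitted with statsmodels in Python (this assumes statsmodels >= 0.12 is available; the variable names are hypothetical):

        import pandas as pd
        from statsmodels.miscmodels.ordinal_model import OrderedModel

        # Hypothetical 5-point preference ratings with one predictor (year of birth).
        ratings = pd.DataFrame({
            "rating": pd.Categorical([1, 2, 3, 5, 4, 2, 3, 5, 1, 4], ordered=True),
            "birth_year": [1950, 1962, 1975, 1990, 1985, 1958, 1970, 1995, 1948, 1988],
        })

        # Cumulative probit: a latent continuous preference is partitioned into the
        # observed ordered categories by estimated thresholds (cutpoints).
        model = OrderedModel(ratings["rating"], ratings[["birth_year"]], distr="probit")
        result = model.fit(method="bfgs", disp=False)
        print(result.params)  # slope for birth_year plus the threshold parameters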

  10. POLARIS Analysis Model Data - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). POLARIS Analysis Model Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/polaris-analysis-model-data-d0371
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    POLARIS_Analysis_ER2_Data contains the modeled trajectories and meteorological data along the flight path for the ER-2 aircraft collected during the Photochemistry of Ozone Loss in the Arctic Region in Summer (POLARIS) campaign. Data collection for this product is complete. The POLARIS mission was a joint effort of NASA and NOAA that occurred in 1997 and was designed to expand understanding of the photochemical and transport processes that cause the summer polar decreases in stratospheric ozone. The POLARIS campaign had the overarching goal of better understanding the change of stratospheric ozone levels from very high concentrations in the spring to very low concentrations in the autumn. The NASA ER-2 high-altitude aircraft was the primary platform deployed, along with balloons, satellites, and ground sites. The POLARIS campaign was based in Fairbanks, Alaska, with some flights conducted from California and Hawaii. Flights were conducted between the summer solstice and fall equinox at mid- to high latitudes. The data collected included meteorological variables; long-lived tracers in reference to summertime transport questions; select species with reactive nitrogen (NOy), halogen (Cly), and hydrogen (HOx) reservoirs; and aerosols. More specifically, the ER-2 utilized various techniques/instruments including Laser Absorption, Gas Chromatography, Non-dispersive IR, UV Photometry, Catalysis, and IR Absorption. These techniques/instruments were used to collect data including N2O, CH4, CH3CCl3, CO2, O3, H2O, and NOy. Ground stations were responsible for collecting SO2 and O3, while balloons recorded pressure, temperature, wind speed, and wind direction. Satellites partnered with these platforms collected meteorological data and Lidar imagery. The observations were used to constrain stratospheric computer models to evaluate ozone changes due to chemistry and transport.

  11. Replication Data for: Bayesian Sensitivity Analysis for Unmeasured...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Sep 26, 2025
    Cite
    Licheng Liu; Teppei Yamamoto (2025). Replication Data for: Bayesian Sensitivity Analysis for Unmeasured Confounding in Causal Panel Data Models [Dataset]. http://doi.org/10.7910/DVN/HL7ZYO
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Licheng Liu; Teppei Yamamoto
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is the replication file for 'Bayesian Sensitivity Analysis for Unmeasured Confounding in Causal Panel Data Models', including the package that implements the proposed method as well as replication code for the Monte Carlo studies, the simulated example, and the empirical analysis.

  12. Data from: STRAT Analysis Model Data

    • data.nasa.gov
    • datasets.ai
    • +5more
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). STRAT Analysis Model Data [Dataset]. https://data.nasa.gov/dataset/strat-analysis-model-data-97de5
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    STRAT_Analysis_ER2_Data contains the modeled trajectories and meteorological data along the flight path for the ER-2 aircraft collected during the Stratospheric Tracers of Atmospheric Transport (STRAT) campaign. Data collection for this product is complete. The STRAT campaign was a field campaign conducted by NASA from May 1995 to February 1996. The primary goal of STRAT was to collect measurements of the change of long-lived tracers as functions of altitude, latitude, and season. These measurements were taken to aid in determining rates for global-scale transport and future distributions of high-speed civil transport (HSCT) exhaust that was emitted into the lower atmosphere. STRAT had four main objectives: defining the rate of transport of trace gases between the stratosphere and troposphere (i.e., HSCT exhaust emissions); improving the understanding of dynamical coupling and rates of transport of trace gases between tropical regions, higher latitudes, and lower altitudes (where most ozone resides); improving understanding of chemistry in the upper troposphere and lower stratosphere; and finally, providing data sets for testing two-dimensional and three-dimensional models used in assessments of impacts from stratospheric aviation. To accomplish these objectives, the STRAT Science Team conducted various surface-based remote sensing and in-situ measurements. NASA flew the ER-2 aircraft along with balloons such as ozonesondes and radiosondes just below the tropopause in the Northern Hemisphere to collect data. Along with the ER-2 and balloons, NASA also utilized satellite imagery, theoretical models, and ground sites. The ER-2 collected data on HOx, NOy, CO2, ozone, water vapor, and temperature. The ER-2 also collected in-situ stratospheric measurements of N2O, CH4, CO, HCL, and NO using the Aircraft Laser Infrared Absorption Spectrometer (ALIAS). Ozonesondes and radiosondes were also deployed to collect data on CO2, NO/NOy, air temperature, pressure, and 3D wind. These balloons also took in-situ measurements of N2O, CFC-11, CH4, CO, HCL, and NO2 using the ALIAS. Ground stations were responsible for taking measurements of O3, ozone mixing ratio, pressure, and temperature. Satellites took infrared images of the atmosphere with the goal of aiding in completing STRAT objectives. Pressure and temperature models were created to help plan the mission.

  13. Synthetic AR Medical Dataset with Realistic Denial

    • kaggle.com
    zip
    Updated Aug 31, 2025
    Cite
    Abuthahir1998 (2025). Synthetic AR Medical Dataset with Realistic Denial [Dataset]. https://www.kaggle.com/datasets/abuthahir1998/synthetic-ar-medical-dataset-with-realistic-denial
    Explore at:
    Available download formats: zip (13843 bytes)
    Dataset updated
    Aug 31, 2025
    Authors
    Abuthahir1998
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Subtitle

    A fully synthetic dataset simulating real-world medical billing scenarios, including claim status, denials, team allocation, and AR follow-up logic.

    Description

    This dataset represents a synthetic Account Receivable (AR) data model for medical billing, created using realistic healthcare revenue cycle management (RCM) workflows. It is designed for data analysis, machine learning modeling, automation testing, and process simulation in the healthcare billing domain.

    The dataset includes realistic business logic, mimicking the actual process of claim submission, denial management, follow-ups, and payment tracking. This is especially useful for:

    • Medical billing training
    • Predictive modeling (claim outcomes, denial prediction, payment forecasting)
    • RCM process automation and AI research
    • Data visualization and dashboard creation

    Key Features of This Dataset

    Patient & Claim Information:

    • Visit ID: Unique alphanumeric ID in the format XXXXXZXXXXXX
    • Patient Name: Randomly generated names
    • Date of Service (DOS): In MM/DD/YYYY format
    • Aging Days: Calculated as Today - DOS
    • Aging Bucket: Categorized as 0-30, 31-60, 61-90, 91-120, 120+

    Claim Status & Denial Logic:

    • Status Column: Indicates whether response received or not
    • If No Response → Simulates a follow-up call → Claim may result in denial
    • Status Code: Reflects actual denial reason (e.g., Dx inconsistent with CPT)
    • Action Code: Required follow-up action (e.g., Need Coding Assistance)
    • Team Allocation: Based on denial type

      • Coding-related denial → Coding Team
      • Submission/Claim-related denial → Billing Team
      • Payment-related denial → Payment Team

    Realistic Denial Scenarios Covered:

    • Coding Errors (Dx inconsistent with CPT, Missing Modifier)
    • Claim Issues (Duplicate Claim, Invalid Subscriber ID)
    • Payment Issues (Allowed Amount Paid, No Coverage)

    Other Important Columns:

    • Claim Amount, Paid Amount, Balance
    • Insurance Details (Primary, Secondary, Tertiary)
    • Notes explaining denial or next steps

    Columns in the Dataset

    Column Name | Description
    Client | Name of the client/provider
    State | US state where the service was provided
    Visit ID# | Unique alphanumeric ID (XXXXXZXXXXXX)
    Patient Name | Patient’s full name
    DOS | Date of Service (MM/DD/YYYY)
    Aging Days | Days from DOS to today
    Aging Bucket | Aging category
    Claim Amount | Original claim billed
    Paid Amount | Amount paid so far
    Balance | Remaining balance
    Status | Initial claim status (No Response, Paid, etc.)
    Status Code | Actual reason (e.g., Dx inconsistent with CPT)
    Action Code | Next step (e.g., Need Coding Assistance)
    Team Allocation | Responsible team (Coding, Billing, Payment)
    Notes | Follow-up notes

    Data Generation Rules Applied (see the sketch after this list)

    • Date format: MM/DD/YYYY
    • Aging Days: Calculated dynamically based on DOS
    • Visit ID: Always follows the XXXXXZXXXXXX format
    • Denial Workflow:

      • If claim denied → Status Code & Action Code updated
      • Team allocation based on denial type
    • Payments: Realistic logic where payment may be partial, full, or none

    • Insurance Flow: Balance moves from primary → secondary → tertiary → patient responsibility
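
    A minimal sketch of the aging-bucket and team-allocation rules described above; the field values are taken from this listing, but the routing logic is paraphrased rather than the dataset author's own code:

        from datetime import date

        def aging_bucket(dos, today=None):
            # Aging Days = today - DOS; bucket per the 0-30 / 31-60 / 61-90 / 91-120 / 120+ scheme.
            today = today or date.today()
            days = (today - dos).days
            for limit, label in [(30, "0-30"), (60, "31-60"), (90, "61-90"), (120, "91-120")]:
                if days <= limit:
                    return days, label
            return days, "120+"

        def team_for_denial(status_code):
            # Route a denial to the responsible team based on its type.
            coding = {"Dx inconsistent with CPT", "Missing Modifier"}
            billing = {"Duplicate Claim", "Invalid Subscriber ID"}
            if status_code in coding:
                return "Coding Team"
            if status_code in billing:
                return "Billing Team"
            return "Payment Team"

        print(aging_bucket(date(2025, 5, 1), today=date(2025, 8, 31)))  # (122, '120+')
        print(team_for_denial("Duplicate Claim"))                       # Billing Team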

    Use Cases

    • Predictive modeling for claim outcome
    • Identifying high-risk claims for early intervention
    • Denial pattern analysis for improving first-pass resolution rate
    • Building RCM dashboards and AR management tools

    License

    CC BY 4.0 – Free to use, modify, and share with attribution.

  14. Data sets and their analysis

    • data.mendeley.com
    Updated Jan 8, 2018
    Cite
    Morteza Jafarzadeh (2018). Data sets and their analysis [Dataset]. http://doi.org/10.17632/v4wfk72yyn.1
    Explore at:
    Dataset updated
    Jan 8, 2018
    Authors
    Morteza Jafarzadeh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Our model is evaluated on an extensive data set. Due to the unavailability of real data sets, the values of the model parameters are generated randomly using a discrete uniform distribution.

  15. Data from: CONCEPT- DM2 DATA MODEL TO ANALYSE HEALTHCARE PATHWAYS OF TYPE 2...

    • zenodo.org
    bin, png, zip
    Updated Jul 12, 2024
    + more versions
    Cite
    Berta Ibáñez-Beroiz; Asier Ballesteros-Domínguez; Ignacio Oscoz-Villanueva; Ibai Tamayo; Julián Librero; Mónica Enguita-Germán; Francisco Estupiñán-Romero; Enrique Bernal-Delgado (2024). CONCEPT- DM2 DATA MODEL TO ANALYSE HEALTHCARE PATHWAYS OF TYPE 2 DIABETES [Dataset]. http://doi.org/10.5281/zenodo.7778291
    Explore at:
    Available download formats: bin, png, zip
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Berta Ibáñez-Beroiz; Asier Ballesteros-Domínguez; Ignacio Oscoz-Villanueva; Ibai Tamayo; Julián Librero; Mónica Enguita-Germán; Francisco Estupiñán-Romero; Enrique Bernal-Delgado
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Technical notes and documentation on the common data model of the project CONCEPT-DM2.

    This publication corresponds to the Common Data Model (CDM) specification of the CONCEPT-DM2 project for the implementation of a federated network analysis of the healthcare pathway of type 2 diabetes.

    Aims of the CONCEPT-DM2 project:

    General aim: To analyse the effectiveness and efficiency of chronic care pathways in diabetes, assuming the relevance of care pathways as independent factors of health outcomes, using real world data (RWD) from five Spanish Regional Health Systems.

    Main specific aims:

    • To characterize the care pathways of patients with diabetes through the whole care system in terms of process indicators and pharmacologic recommendations
    • To compare these observed care pathways with the theoretical clinical pathways derived from clinical practice guidelines
    • To assess whether adherence to clinical guidelines influences important health outcomes, such as cardiovascular hospitalizations
    • To compare traditional analytical methods with process mining methods in terms of modeling quality, prediction performance, and information provided

    Study Design: It is a population-based retrospective observational study centered on all T2D patients diagnosed in five Regional Health Services within the Spanish National Health Service. We will include all the contacts of these patients with the health services using the electronic medical record systems including Primary Care data, Specialized Care data, Hospitalizations, Urgent Care data, Pharmacy Claims, and also other registers such as the mortality and the population register.

    Cohort definition: all patients with a code of Type 2 Diabetes in the clinical health records (a rough sketch of this selection logic follows the criteria below)

    • Inclusion criteria: patients who, at 01/01/2017 or during the follow-up from 01/01/2017 to 31/12/2022, had an active health card (active TIS, tarjeta sanitaria activa) and a code of type 2 diabetes (T2D; DM2 in Spanish) in the primary care clinical records (CIAP2 T90 when the CIAP coding system is used)
    • Exclusion criteria:
      • patients with no contact with the health system from 01/01/2017 to 31/12/2022
      • patients that had a T1D (DM1) code opened after the T2D code during the follow-up.
    • Study period. From 01/01/2017 to 31/12/2022
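
    A rough Python sketch of the cohort selection logic above, using a hypothetical extract of diagnosis codes; the active-health-card requirement is omitted, and T89 is used here only as a stand-in T1D code:

        import pandas as pd

        # Hypothetical problem-list extract: one row per diagnosis code per patient.
        records = pd.DataFrame({
            "patient_id": [1, 1, 2, 3, 3],
            "code":       ["T90", "K86", "T90", "T90", "T89"],  # T90 = T2D; T89 as a stand-in T1D code
            "open_date":  pd.to_datetime(["2015-03-01", "2018-06-10", "2019-01-20",
                                          "2016-08-05", "2020-02-14"]),
        })
        start, end = pd.Timestamp("2017-01-01"), pd.Timestamp("2022-12-31")

        # Inclusion: a T2D code (CIAP2 T90) opened by the end of follow-up.
        t2d = records[(records["code"] == "T90") & (records["open_date"] <= end)]

        # Exclusion: a T1D code opened after the T2D code during follow-up.
        pairs = records[records["code"] == "T89"].merge(
            t2d[["patient_id", "open_date"]], on="patient_id", suffixes=("_t1d", "_t2d"))
        excluded = pairs.loc[
            (pairs["open_date_t1d"] > pairs["open_date_t2d"])
            & pairs["open_date_t1d"].between(start, end), "patient_id"]

        cohort = t2d.loc[~t2d["patient_id"].isin(excluded), "patient_id"].unique()
        print(cohort)  # patient 3 is dropped; patients 1 and 2 remain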

    Files included in this publication:

    • Datamodel_CONCEPT_DM2_diagram.png
    • Common data model specification (Datamodel_CONCEPT_DM2_v.0.1.0.xlsx)
    • Synthetic datasets (Datamodel_CONCEPT_DM2_sample_data)
      • sample_data1_dm_patient.csv
      • sample_data2_dm_param.csv
      • sample_data3_dm_patient.csv
      • sample_data4_dm_param.csv
      • sample_data5_dm_patient.csv
      • sample_data6_dm_param.csv
      • sample_data7_dm_param.csv
      • sample_data8_dm_param.csv
    • Datamodel_CONCEPT_DM2_explanation.pptx
  16. Interpretation and identification of within-unit and cross-sectional...

    • plos.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Jonathan Kropko; Robert Kubinec (2023). Interpretation and identification of within-unit and cross-sectional variation in panel data models [Dataset]. http://doi.org/10.1371/journal.pone.0231349
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jonathan Kropko; Robert Kubinec
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    While fixed effects (FE) models are often employed to address potential omitted variables, we argue that these models’ real utility is in isolating a particular dimension of variance from panel data for analysis. In addition, we show through novel mathematical decomposition and simulation that only one-way FE models cleanly capture either the over-time or cross-sectional dimensions in panel data, while the two-way FE model unhelpfully combines within-unit and cross-sectional variation in a way that produces un-interpretable answers. In fact, as we show in this paper, if we begin with the interpretation that many researchers wrongly assign to the two-way FE model—that it represents a single estimate of X on Y while accounting for unit-level heterogeneity and time shocks—the two-way FE specification is statistically unidentified, a fact that statistical software packages like R and Stata obscure through internal matrix processing.
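    As a quick illustration of the decomposition discussed here (not the authors' replication code), the two one-way transformations can be written in a few lines of pandas:

        import pandas as pd

        # Hypothetical balanced panel: two units observed over three years.
        panel = pd.DataFrame({
            "unit": ["A", "A", "A", "B", "B", "B"],
            "year": [2001, 2002, 2003, 2001, 2002, 2003],
            "y":    [1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
        })

        # Unit fixed effects isolate within-unit (over-time) variation:
        panel["y_within_unit"] = panel["y"] - panel.groupby("unit")["y"].transform("mean")

        # Time fixed effects isolate cross-sectional (between-unit) variation:
        panel["y_within_year"] = panel["y"] - panel.groupby("year")["y"].transform("mean")

        print(panel)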

  17. ECMWF Deterministic Model Analysis Data on hybrid levels in GRIB2 (D1D)

    • data.ucar.edu
    grib
    Updated Oct 7, 2025
    Cite
    (2025). ECMWF Deterministic Model Analysis Data on hybrid levels in GRIB2 (D1D) [Dataset]. http://doi.org/10.26023/Y8YC-YJKM-4G02
    Explore at:
    Available download formats: grib
    Dataset updated
    Oct 7, 2025
    Time period covered
    Oct 1, 2011 - Apr 1, 2012
    Area covered
    Earth
    Description

    This data set contains the European Centre for Medium-Range Weather Forecasts (ECMWF) deterministic forecast model analysis data on hybrid levels in GRIB2 format. The forecast files and analysis files on pressure levels are in the companion data sets (see below). For each day there are global 0.25 degree resolution analysis files at 00, 06, 12, and 18 UTC. This data set is password protected; contact Steve Williams (see below) for password information. This is a large data set, with each day containing ~6.1 GB in 4 files. Each order can contain a maximum of 16 GB of data (~2.5 days). For very large orders, it may be preferable to access these data via a loaner external hard drive; please contact Steve Williams (see below) for these very large orders.
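    One common way to inspect GRIB2 files like these in Python is xarray with the cfgrib engine; this assumes cfgrib and its ecCodes dependency are installed, and the filename below is only a placeholder:

        import xarray as xr

        # Placeholder filename; actual file names depend on the archive layout.
        ds = xr.open_dataset("ecmwf_analysis_2011100100.grib2", engine="cfgrib")
        print(ds)  # variables, hybrid-level coordinates, and the 0.25-degree lat/lon grid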

  18. Sentiment Analysis Dataset

    • kaggle.com
    zip
    Updated Nov 18, 2019
    Cite
    SonamSrivastava (2019). Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/sonaam1234/sentimentdata
    Explore at:
    Available download formats: zip (970011 bytes)
    Dataset updated
    Nov 18, 2019
    Authors
    SonamSrivastava
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Timo Bozsolik

    Released under Database: Open Database, Contents: Database Contents

    Contents

    Data for sentiment analysis

  19. Travel Demand Model Data (c25q2)

    • datahub.cmap.illinois.gov
    Updated Sep 17, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Chicago Metropolitan Agency for Planning (2025). Travel Demand Model Data (c25q2) [Dataset]. https://datahub.cmap.illinois.gov/documents/8e370db608d24ba39bed02ed182c6865
    Explore at:
    Dataset updated
    Sep 17, 2025
    Dataset authored and provided by
    Chicago Metropolitan Agency for Planning
    Description

    This dataset includes the analysis year inputs and outputs from the Air Quality Conformity Analysis approved in June 2025. The horizon year is 2050 and reflects the policies and projects adopted in the ON TO 2050 Regional Comprehensive Plan.

    The air quality analysis is completed twice annually, in the second quarter and the fourth quarter. The data associated with the analysis is named based on the year and quarter in which the analysis was completed; therefore, the files in this dataset are referred to as c25q2 data.

    The analysis years for this conformity cycle are 2019, 2025, 2030, 2035, 2040, and 2050. Scenario numbers are associated with the analysis years as shown below; you will notice the scenario numbers 100–700 referenced in many of the filenames or in headers within the files.

    Analysis year scenario numbers:
    2019 | 100
    2025 | 200
    2030 | 300
    2035 | 400
    2040 | 500
    2050 | 700

    Links to download data files:
    Travel Demand Model Data (c25q2) - 2019 Base
    Travel Demand Model Data (c25q2) - 2025 Forecast
    Travel Demand Model Data (c25q2) - 2030 Forecast
    Travel Demand Model Data (c25q2) - 2035 Forecast
    Travel Demand Model Data (c25q2) - 2040 Forecast
    Travel Demand Model Data (c25q2) - 2050 Forecast

    For additional information, see the travel demand model documentation.

  20. SNACS Atmospheric Model: Time Series Data - 3-hourly

    • search.dataone.org
    • data.ucar.edu
    • +2more
    Updated Oct 22, 2016
    + more versions
    Cite
    Arctic Data Center (2016). SNACS Atmospheric Model: Time Series Data - 3-hourly [Dataset]. https://search.dataone.org/view/urn%3Auuid%3A6eb3caf0-a1f8-410f-a292-6dcea6522446
    Explore at:
    Dataset updated
    Oct 22, 2016
    Dataset provided by
    Arctic Data Center
    Time period covered
    Sep 1, 1957 - Dec 31, 2006
    Area covered
    Description

    This dataset archives the 3-hourly SNACS Polar MM5 atmospheric model data for simulations run with the following forcing data:

    Model years | Forcing data
    1957-1958 to 1978-1979 | ERA40
    1979-1980 to 2000-2001 | ERA40 + SSMR/SSMI sea ice from the National Snow and Ice Data Center (NSIDC)
    2001-2002 to 2006-2007 | TOGA + SSMR/SSMI sea ice from the National Snow and Ice Data Center (NSIDC) V2 + NNRP for soil moisture

    There are time series data from 28 model grid points near Barrow, AK.
