91 datasets found
  1. r

    CMS Synthetic Patient Data OMOP

    • redivis.com
    Updated Feb 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). CMS Synthetic Patient Data OMOP [Dataset]. https://redivis.com/workflows/6e6p-cfgn5hgz1
    Explore at:
    Dataset updated
    Feb 24, 2025
    Description

    This is a synthetic patient dataset in the OMOP Common Data Model v5.2, originally released by the CMS and accessed via BigQuery. The dataset includes 24 tables and records for 2 million synthetic patients from 2008 to 2010.

  2. Synthea synthetic patient generator data in OMOP Common Data Model

    • registry.opendata.aws
    Updated Jan 4, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amazon Web Sevices (2023). Synthea synthetic patient generator data in OMOP Common Data Model [Dataset]. https://registry.opendata.aws/synthea-omop/
    Explore at:
    Dataset updated
    Jan 4, 2023
    Dataset provided by
    Amazon.comhttp://amazon.com/
    Description

    The Synthea generated data is provided here as a 1,000 person (1k), 100,000 person (100k), and 2,800,000 persom (2.8m) data sets in the OMOP Common Data Model format. SyntheaTM is a synthetic patient generator that models the medical history of synthetic patients. Our mission is to output high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions. It can be used without restriction for a variety of secondary uses in academia, research, industry, and government (although a citation would be appreciated). You can read our first academic paper here: https://doi.org/10.1093/jamia/ocx079

  3. Synthetic Patient Data in OMOP

    • console.cloud.google.com
    Updated Jun 25, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:U.S.%20Department%20of%20Health%20%26%20Human%20Services&inv=1&invt=Ab2v-Q (2020). Synthetic Patient Data in OMOP [Dataset]. https://console.cloud.google.com/marketplace/product/hhs/synpuf
    Explore at:
    Dataset updated
    Jun 25, 2020
    Dataset provided by
    Googlehttp://google.com/
    Description

    The Synthetic Patient Data in OMOP Dataset is a synthetic database released by the Centers for Medicare and Medicaid Services (CMS) Medicare Claims Synthetic Public Use Files (SynPUF). It is synthetic data containing 2008-2010 Medicare insurance claims for development and demonstration purposes. It has been converted to the Observational Medical Outcomes Partnership (OMOP) common data model from its original form, CSV, by the open source community as released on GitHub Please refer to the CMS Linkable 2008–2010 Medicare Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF) User Manual for details regarding how DE-SynPUF was created." This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery .

  4. Domain

    • redivis.com
    Updated Sep 6, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Redivis Demo Organization (2020). Domain [Dataset]. https://redivis.com/datasets/ye2v-6skh7wdr7
    Explore at:
    Dataset updated
    Sep 6, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Time period covered
    2008 - 2010
    Description

    The DOMAIN table includes a list of OMOP-defined Domains the Concepts of the Standardized Vocabularies can belong to. A Domain defines the set of allowable Concepts for the standardized fields in the CDM tables.

  5. Cost

    • redivis.com
    Updated Sep 7, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cost [Dataset]. https://redivis.com/datasets/ye2v-6skh7wdr7
    Explore at:
    Dataset updated
    Sep 7, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Time period covered
    2008 - 2010
    Description

    The COST table captures records containing the cost of any medical event recorded in one of the OMOP clinical event tables such as DRUG_EXPOSURE, PROCEDURE_OCCURRENCE, VISIT_OCCURRENCE, VISIT_DETAIL, DEVICE_OCCURRENCE, OBSERVATION or MEASUREMENT.

  6. CMS 2008-2010 Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF) in...

    • registry.opendata.aws
    Updated Jan 18, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CMS 2008-2010 Data Entrepreneurs’ Synthetic Public Use File (DE-SynPUF) in OMOP Common Data Model [Dataset]. https://registry.opendata.aws/cmsdesynpuf-omop/
    Explore at:
    Dataset updated
    Jan 18, 2023
    Dataset provided by
    Amazon.comhttp://amazon.com/
    Description

    DE-SynPUF is provided here as a 1,000 person (1k), 100,000 person (100k), and 2,300,000 persom (2.3m) data sets in the OMOP Common Data Model format. The DE-SynPUF was created with the goal of providing a realistic set of claims data in the public domain while providing the very highest degree of protection to the Medicare beneficiaries’ protected health information. The purposes of the DE-SynPUF are to:

    1. allow data entrepreneurs to develop and create software and applications that may eventually be applied to actual CMS claims data;
    2. train researchers on the use and complexity of conducting analyses with CMS claims data prior to initiating the process to obtain access to actual CMS data; and,
    3. support safe data mining innovations that may reveal unanticipated knowledge gains while preserving beneficiary privacy. The files have been designed so that programs and procedures created on the DE-SynPUF will function on CMS Limited Data Sets. The data structure of the Medicare DE-SynPUF is very similar to the CMS Limited Data Sets, but with a smaller number of variables. The DE-SynPUF also provides a robust set of metadata on the CMS claims data that have not been previously available in the public domain. Although the DE-SynPUF has very limited inferential research value to draw conclusions about Medicare beneficiaries due to the synthetic processes used to create the file, the Medicare DE-SynPUF does increase access to a realistic Medicare claims data file in a timely and less expensive manner to spur the innovation necessary to achieve the goals of better care for beneficiaries and improve the health of the population.

  7. u

    Example (synthetic) electronic health record data

    • rdr.ucl.ac.uk
    application/csv
    Updated Apr 24, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Steve Harris; Wai Shing Lai (2024). Example (synthetic) electronic health record data [Dataset]. http://doi.org/10.5522/04/25676298.v1
    Explore at:
    application/csvAvailable download formats
    Dataset updated
    Apr 24, 2024
    Dataset provided by
    University College London
    Authors
    Steve Harris; Wai Shing Lai
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    These data are modelled using the OMOP Common Data Model v5.3.Correlated Data SourceNG tube vocabulariesGeneration RulesThe patient’s age should be between 18 and 100 at the moment of the visit.Ethnicity data is using 2021 census data in England and Wales (Census in England and Wales 2021) .Gender is equally distributed between Male and Female (50% each).Every person in the record has a link in procedure_occurrence with the concept “Checking the position of nasogastric tube using X-ray”2% of person records have a link in procedure_occurrence with the concept of “Plain chest X-ray”60% of visit_occurrence has visit concept “Inpatient Visit”, while 40% have “Emergency Room Visit”NotesVersion 0Generated by man-made rule/story generatorStructural correct, all tables linked with the relationshipWe used national ethnicity data to generate a realistic distribution (see below)2011 Race Census figure in England and WalesEthnic Group : Population(%)Asian or Asian British: Bangladeshi - 1.1Asian or Asian British: Chinese - 0.7Asian or Asian British: Indian - 3.1Asian or Asian British: Pakistani - 2.7Asian or Asian British: any other Asian background -1.6Black or African or Caribbean or Black British: African - 2.5Black or African or Caribbean or Black British: Caribbean - 1Black or African or Caribbean or Black British: other Black or African or Caribbean background - 0.5Mixed multiple ethnic groups: White and Asian - 0.8Mixed multiple ethnic groups: White and Black African - 0.4Mixed multiple ethnic groups: White and Black Caribbean - 0.9Mixed multiple ethnic groups: any other Mixed or multiple ethnic background - 0.8White: English or Welsh or Scottish or Northern Irish or British - 74.4White: Irish - 0.9White: Gypsy or Irish Traveller - 0.1White: any other White background - 6.4Other ethnic group: any other ethnic group - 1.6Other ethnic group: Arab - 0.6

  8. h

    OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes...

    • healthdatagateway.org
    unknown
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158), OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes [Dataset]. https://healthdatagateway.org/dataset/139
    Explore at:
    unknownAvailable download formats
    Dataset authored and provided by
    This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158)
    License

    https://www.pioneerdatahub.co.uk/data/data-request-process/https://www.pioneerdatahub.co.uk/data/data-request-process/

    Description

    OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes Dataset number 2.0

    Coronavirus disease 2019 (COVID-19) was identified in January 2020. Currently, there have been more than 6 million cases & more than 1.5 million deaths worldwide. Some individuals experience severe manifestations of infection, including viral pneumonia, adult respiratory distress syndrome (ARDS) & death. There is a pressing need for tools to stratify patients, to identify those at greatest risk. Acuity scores are composite scores which help identify patients who are more unwell to support & prioritise clinical care. There are no validated acuity scores for COVID-19 & it is unclear whether standard tools are accurate enough to provide this support. This secondary care COVID OMOP dataset contains granular demographic, morbidity, serial acuity and outcome data to inform risk prediction tools in COVID-19.

    PIONEER geography The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix. There is a higher than average percentage of minority ethnic groups. WM has a large number of elderly residents but is the youngest population in the UK. Each day >100,000 people are treated in hospital, see their GP or are cared for by the NHS. The West Midlands was one of the hardest hit regions for COVID admissions in both wave 1 & 2.

    EHR. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & 100 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”. UHB has cared for >5000 COVID admissions to date. This is a subset of data in OMOP format.

    Scope: All COVID swab confirmed hospitalised patients to UHB from January – August 2020. The dataset includes highly granular patient demographics & co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to care process (timings, staff grades, specialty review, wards), presenting complaint, acuity, all physiology readings (pulse, blood pressure, respiratory rate, oxygen saturations), all blood results, microbiology, all prescribed & administered treatments (fluids, antibiotics, inotropes, vasopressors, organ support), all outcomes.

    Available supplementary data: Health data preceding & following admission event. Matched “non-COVID” controls; ambulance, 111, 999 data, synthetic data. Further OMOP data available as an additional service.

    Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.

  9. CPRD Primary Care and Linked Data OMOP Common Data Model

    • healthdatagateway.org
    unknown
    Updated Dec 15, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CPRD NHS England (2024). CPRD Primary Care and Linked Data OMOP Common Data Model [Dataset]. http://doi.org/10.48329/cyhc-9068
    Explore at:
    unknownAvailable download formats
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    National Health Servicehttps://www.nhs.uk/
    Authors
    CPRD NHS England
    License

    HTTPS://CPRD.COM/DATA-ACCESSHTTPS://CPRD.COM/DATA-ACCESS

    Description

    The CPRD Primary Care and Linked Data OMOP CDM database contains longitudinal routinely-collected health records (EHR data) from UK primary care practices, and hospital episode data provided by NHS England. The data has been transformed into a common format (data model) using an open community data standard and structure from the OHDSI standardised vocabularies. The approach allows organisation, standardisation and common representation of medical terms and variables that have been obtained from various clinical data sources. Access to anonymised data from CPRD is subject to a full licence agreement containing detailed terms and conditions of use. Anonymised patient datasets can be extracted for researchers against specific study specifications, following protocol approval.

  10. r

    Austin Health OMOP Dataset

    • researchdata.edu.au
    • figshare.unimelb.edu.au
    Updated Dec 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ROGER WARD; GRAEME HART (2023). Austin Health OMOP Dataset [Dataset]. http://doi.org/10.26188/24562789.V2
    Explore at:
    Dataset updated
    Dec 21, 2023
    Dataset provided by
    The University of Melbourne
    Authors
    ROGER WARD; GRAEME HART
    Description

    The Austin Health Dataset is an OMOP dataset based on records held at Austin Health.

    The data is derived from an Electronic Medical Records System held in Cerner.

    While the data is not open access, researchers can enquire about access subject to ethics and governance approvals.

  11. Provider

    • redivis.com
    Updated Sep 7, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Redivis Demo Organization (2020). Provider [Dataset]. https://redivis.com/datasets/ye2v-6skh7wdr7
    Explore at:
    Dataset updated
    Sep 7, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Time period covered
    2008 - 2010
    Description

    The PROVIDER table contains a list of uniquely identified healthcare providers. These are individuals providing hands-on healthcare to patients, such as physicians, nurses, midwives, physical therapists etc.

  12. f

    OMOP primary database assessment of risk.

    • figshare.com
    xls
    Updated Apr 18, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Roger Ward; Christine Mary Hallinan; David Ormiston-Smith; Christine Chidgey; Dougie Boyle (2024). OMOP primary database assessment of risk. [Dataset]. http://doi.org/10.1371/journal.pone.0301557.t002
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Apr 18, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Roger Ward; Christine Mary Hallinan; David Ormiston-Smith; Christine Chidgey; Dougie Boyle
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BackgroundThe use of routinely collected health data for secondary research purposes is increasingly recognised as a methodology that advances medical research, improves patient outcomes, and guides policy. This secondary data, as found in electronic medical records (EMRs), can be optimised through conversion into a uniform data structure to enable analysis alongside other comparable health metric datasets. This can be achieved with the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM), which employs a standardised vocabulary to facilitate systematic analysis across various observational databases. The concept behind the OMOP-CDM is the conversion of data into a common format through the harmonisation of terminologies, vocabularies, and coding schemes within a unique repository. The OMOP model enhances research capacity through the development of shared analytic and prediction techniques; pharmacovigilance for the active surveillance of drug safety; and ‘validation’ analyses across multiple institutions across Australia, the United States, Europe, and the Asia Pacific. In this research, we aim to investigate the use of the open-source OMOP-CDM in the PATRON primary care data repository.MethodsWe used standard structured query language (SQL) to construct, extract, transform, and load scripts to convert the data to the OMOP-CDM. The process of mapping distinct free-text terms extracted from various EMRs presented a substantial challenge, as many terms could not be automatically matched to standard vocabularies through direct text comparison. This resulted in a number of terms that required manual assignment. To address this issue, we implemented a strategy where our clinical mappers were instructed to focus only on terms that appeared with sufficient frequency. We established a specific threshold value for each domain, ensuring that more than 95% of all records were linked to an approved vocabulary like SNOMED once appropriate mapping was completed. To assess the data quality of the resultant OMOP dataset we utilised the OHDSI Data Quality Dashboard (DQD) to evaluate the plausibility, conformity, and comprehensiveness of the data in the PATRON repository according to the Kahn framework.ResultsAcross three primary care EMR systems we converted data on 2.03 million active patients to version 5.4 of the OMOP common data model. The DQD assessment involved a total of 3,570 individual evaluations. Each evaluation compared the outcome against a predefined threshold. A ’FAIL’ occurred when the percentage of non-compliant rows exceeded the specified threshold value. In this assessment of the primary care OMOP database described here, we achieved an overall pass rate of 97%.ConclusionThe OMOP CDM’s widespread international use, support, and training provides a well-established pathway for data standardisation in collaborative research. Its compatibility allows the sharing of analysis packages across local and international research groups, which facilitates rapid and reproducible data comparisons. A suite of open-source tools, including the OHDSI Data Quality Dashboard (Version 1.4.1), supports the model. Its simplicity and standards-based approach facilitates adoption and integration into existing data processes.

  13. Vocabulary

    • redivis.com
    Updated Sep 6, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Redivis Demo Organization (2020). Vocabulary [Dataset]. https://redivis.com/datasets/ye2v-6skh7wdr7
    Explore at:
    Dataset updated
    Sep 6, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Time period covered
    2008 - 2010
    Description

    The VOCABULARY table includes a list of the Vocabularies collected from various sources or created de novo by the OMOP community. This reference table is populated with a single record for each Vocabulary source and includes a descriptive name and other associated attributes for the Vocabulary.

  14. h

    Connected Bradford - Secondary Care BRI OMOP database

    • healthdatagateway.org
    unknown
    Updated Jan 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Connected Bradford. Yorkshire & Humber Secure Data Environment. (2025). Connected Bradford - Secondary Care BRI OMOP database [Dataset]. https://healthdatagateway.org/en/dataset/1101
    Explore at:
    unknownAvailable download formats
    Dataset updated
    Jan 31, 2025
    Dataset authored and provided by
    Connected Bradford. Yorkshire & Humber Secure Data Environment.
    License

    https://bradfordresearch.nhs.uk/connected-bradford/https://bradfordresearch.nhs.uk/connected-bradford/

    Description

    This dataset is an extract from the Bradford Royal Infirmary EPR system. This contains current and some historical data, and is based on extracting the relevant tables from EPR, mapping to the OMOP schema and outputting in omop cdm 5.3 format.

  15. r

    Western Health OMOP Dataset

    • researchdata.edu.au
    • figshare.unimelb.edu.au
    Updated Dec 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Western Health OMOP Dataset [Dataset]. https://researchdata.edu.au/western-health-omop-dataset/2837361
    Explore at:
    Dataset updated
    Dec 21, 2023
    Dataset provided by
    The University of Melbourne
    Authors
    ROGER WARD; BILL KARANATSIOS
    Description

    The Western Health Dataset is an OMOP dataset based on records held at Western Health.

    The data is derived from an Electronic Medical Records System held in Cerner.

    While the data is not open access, researchers can enquire about access subject to ethics and governance approvals.

  16. Payer plan period

    • redivis.com
    Updated Sep 6, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Redivis Demo Organization (2020). Payer plan period [Dataset]. https://redivis.com/datasets/ye2v-6skh7wdr7
    Explore at:
    Dataset updated
    Sep 6, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Redivis Demo Organization
    Time period covered
    2008 - 2010
    Description

    The PAYER_PLAN_PERIOD table captures details of the period of time that a Person is continuously enrolled under a specific health Plan benefit structure from a given Payer.

  17. f

    EMR tables and related tables in the OMOP CDM.

    • figshare.com
    xls
    Updated Apr 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Roger Ward; Christine Mary Hallinan; David Ormiston-Smith; Christine Chidgey; Dougie Boyle (2024). EMR tables and related tables in the OMOP CDM. [Dataset]. http://doi.org/10.1371/journal.pone.0301557.t004
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Apr 18, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Roger Ward; Christine Mary Hallinan; David Ormiston-Smith; Christine Chidgey; Dougie Boyle
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BackgroundThe use of routinely collected health data for secondary research purposes is increasingly recognised as a methodology that advances medical research, improves patient outcomes, and guides policy. This secondary data, as found in electronic medical records (EMRs), can be optimised through conversion into a uniform data structure to enable analysis alongside other comparable health metric datasets. This can be achieved with the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM), which employs a standardised vocabulary to facilitate systematic analysis across various observational databases. The concept behind the OMOP-CDM is the conversion of data into a common format through the harmonisation of terminologies, vocabularies, and coding schemes within a unique repository. The OMOP model enhances research capacity through the development of shared analytic and prediction techniques; pharmacovigilance for the active surveillance of drug safety; and ‘validation’ analyses across multiple institutions across Australia, the United States, Europe, and the Asia Pacific. In this research, we aim to investigate the use of the open-source OMOP-CDM in the PATRON primary care data repository.MethodsWe used standard structured query language (SQL) to construct, extract, transform, and load scripts to convert the data to the OMOP-CDM. The process of mapping distinct free-text terms extracted from various EMRs presented a substantial challenge, as many terms could not be automatically matched to standard vocabularies through direct text comparison. This resulted in a number of terms that required manual assignment. To address this issue, we implemented a strategy where our clinical mappers were instructed to focus only on terms that appeared with sufficient frequency. We established a specific threshold value for each domain, ensuring that more than 95% of all records were linked to an approved vocabulary like SNOMED once appropriate mapping was completed. To assess the data quality of the resultant OMOP dataset we utilised the OHDSI Data Quality Dashboard (DQD) to evaluate the plausibility, conformity, and comprehensiveness of the data in the PATRON repository according to the Kahn framework.ResultsAcross three primary care EMR systems we converted data on 2.03 million active patients to version 5.4 of the OMOP common data model. The DQD assessment involved a total of 3,570 individual evaluations. Each evaluation compared the outcome against a predefined threshold. A ’FAIL’ occurred when the percentage of non-compliant rows exceeded the specified threshold value. In this assessment of the primary care OMOP database described here, we achieved an overall pass rate of 97%.ConclusionThe OMOP CDM’s widespread international use, support, and training provides a well-established pathway for data standardisation in collaborative research. Its compatibility allows the sharing of analysis packages across local and international research groups, which facilitates rapid and reproducible data comparisons. A suite of open-source tools, including the OHDSI Data Quality Dashboard (Version 1.4.1), supports the model. Its simplicity and standards-based approach facilitates adoption and integration into existing data processes.

  18. Addressing the Challenges of Health Data Standard Adoption and Usage: A...

    • zenodo.org
    bin
    Updated May 12, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alberto Marfoglia; Alberto Marfoglia; Valerio Antonio Arcobelli; Valerio Antonio Arcobelli; SERENA MOSCATO; SERENA MOSCATO; Antonino Amedeo La Mattina; Antonino Amedeo La Mattina; Sabato Mellone; Sabato Mellone; ANTONELLA CARBONARO; ANTONELLA CARBONARO (2025). Addressing the Challenges of Health Data Standard Adoption and Usage: A Systematic Review - Data Extraction [Dataset]. http://doi.org/10.5281/zenodo.15358180
    Explore at:
    binAvailable download formats
    Dataset updated
    May 12, 2025
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Alberto Marfoglia; Alberto Marfoglia; Valerio Antonio Arcobelli; Valerio Antonio Arcobelli; SERENA MOSCATO; SERENA MOSCATO; Antonino Amedeo La Mattina; Antonino Amedeo La Mattina; Sabato Mellone; Sabato Mellone; ANTONELLA CARBONARO; ANTONELLA CARBONARO
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 7, 2025
    Description

    This table presents the data extraction from the 99 studies included according to the criteria outlined in the main manuscript. It is provided as supplementary material to enhance the readability of the paper while ensuring that all relevant information is preserved and accessible without loss of detail.

    The names of the variables and their descriptions are provided in the attached file, along with the following details:

    VariableDescription
    Ref.The citation in the format: First author et al. [Year] (e.g., AuthorA et al. [2022]). This identifies the study's primary citation for easy reference.
    TitleThe title of the paper
    StandardThe healthcare data standard used in the study. Possible values are: OMOP, OpenEHR, FHIR.
    Study LocationThe country where the study was conducted.
    Objective for using the standardDetailedThe comprehensive explanation of the specific objective of using the standard in the study, describing how it supports the study’s goals.
    ShortThe primary purpose for applying the healthcare standard. Possible values are: Secondary data reuse, Data exchange, Clinical decision support, Vocabulary definition, EHR system design,
    Application domainTypeThe application domain type that represents the healthcare standard. Possible solution are: Clinical: Studies with a direct impact on clinical practice, applying established tools or methods in healthcare settings (e.g., predicting in-hospital mortality for heart attack patients) and Research: Studies proposing innovative tools, methodologies, or frameworks still in the design/testing phase, not yet clinically implemented.
    Healthcare AreaThe relevant healthcare domain for the study, such as Cardiovascular, Intensive Care Unit, Emergency Department, Oncology, Biology, etc.
    ClusterThe healthcare domain clusterized for easier readability. Possible values include: Clinical Medicine, Clinical Services and Diagnostics, Public Health, Health Information Management and Biomedical Sciences
    UseThis report if the results of the paper serving a Primary use (direct care) or a Secondary use (repurposing existing data or tools for new objectives).
    ScaleThe scale of the study. Possible values are: Single center (one hospital/clinic), Multi-center (multiple institutions), Regional (specific region), National level (countrywide).
    Dataset magnitude in patientsThe magnitude of the dataset expressed in chars. Possible values are: A (<10 to 99), B (100 to 9,999), C (10,000 to 999,999) and D (1,000,000 and above).
    N° ElementsThe number of variables of input in the process of standardization.
    Percentuage of mapped variablesThe percentage of successful data standardisation.
    Coverage of the standardThe methodology of standardisation wheter it was adapted or not.
    ETL ToolsData cleaning & extractionThe tools adopted for supporting data cleaning and extraction.
    MappingThe tools adopted for the mapping of the variables.
    ValidationThe tools adopted for the validation of the standardization process.
    DatabaseThe database adopted for storing the result of the healthcare data standardization.
    Process efficiency and Economic assessmentThe information about the economic impact if the consequences are concrete and measured by the authors (e.g., actual cost savings, resource usage reductions). If the authors did not measure the economic impact, this field remains blank.
    Comments by authorsLimitationsThe significant limitations or challenges faced during the study about the standard adopted, such as issues with data compatibility, scalability, or the need for customization.
    AdvantagesThe benefits of applying the standard model, such as improved data consistency, enhanced clinical outcomes, better interoperability, or more efficient workflows.
  19. h

    University College London Hospitals NHS OMOP dataset

    • healthdatagateway.org
    unknown
    Updated May 28, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). University College London Hospitals NHS OMOP dataset [Dataset]. https://healthdatagateway.org/dataset/1336
    Explore at:
    unknownAvailable download formats
    Dataset updated
    May 28, 2025
    License

    https://safehr-data.org/https://safehr-data.org/

    Description

    UCLH has an OMOP extraction system (omop_es) that connects our Electronic Health Record (EHR) to an architecture that delivers high quality, standardised extracts meeting the OMOP CDM standards. Our EHR contains records for 6 million patients, 13 million diagnoses and 50 million medication events. These derive from the UCLH patient population which includes national referrals for tertiary and quaternary services (cancer, neurology etc.) and general medical admissions from an inner city teaching hospital that treats >1m outpatients per year, and has >100k inpatient admissions.

    UCLH has invested efforts and expertise to align international terminology systems e.g. SNOMED CT, LOINC, UCUM with NHS data standards, during EHR system build and post implementation. Our standardisation work has covered clinical domains i.e. Diagnosis and past medical history, Surgical and Ambulatory procedures, Diagnostic Imaging, Cardiac Echo, Lab Medicine including Biochemistry, Haematology, Microbiology, Immunology, Virology, Allergens, Medications (including route of administration); and Demographic information like Religion, Ethnicity. For some domains (e.g. diagnosis and surgical procedures) we have achieved 100% standardisation, others are an ongoing task.

    Our data pipeline, the OMOP-Extraction System (OMOP-ES) is a modular, re-usable architecture written in over 20,000 lines of R. Extractions proceed through four stages.

    1. Standardisation - translates source data to OMOP concepts at full fidelity
    2. Projection - applies rules to redact, filter, transform & link
    3. Post-processing - allows linking of de-identified non-OMOP data
    4. Output - multiple formats & destinations incl. CSV, Parquet or SQLite for direct use or import in a TRE

    The system is ● configurable to a variety of OMOP projects via a settings file ● reproducible and automated ● queries EPIC EHR and other sources ● automates filtering of sensitive data with safe defaults and ability for Information Governance teams to inspect settings before & after running ● tests and reports quality of standardisation ● being extended both by the 'core' team and by other trusts in an inner source fashion ● has a small mock database for system development and testing

  20. f

    ARCH ontologies and terminologies vs OMOP.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jeffrey G. Klann; Matthew A. H. Joss; Kevin Embree; Shawn N. Murphy (2023). ARCH ontologies and terminologies vs OMOP. [Dataset]. http://doi.org/10.1371/journal.pone.0212463.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jeffrey G. Klann; Matthew A. H. Joss; Kevin Embree; Shawn N. Murphy
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ARCH ontologies and terminologies vs OMOP.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
(2025). CMS Synthetic Patient Data OMOP [Dataset]. https://redivis.com/workflows/6e6p-cfgn5hgz1

CMS Synthetic Patient Data OMOP

Explore at:
Dataset updated
Feb 24, 2025
Description

This is a synthetic patient dataset in the OMOP Common Data Model v5.2, originally released by the CMS and accessed via BigQuery. The dataset includes 24 tables and records for 2 million synthetic patients from 2008 to 2010.

Search
Clear search
Close search
Google apps
Main menu