This is a synthetic patient dataset in the OMOP Common Data Model v5.2, originally released by the CMS and accessed via BigQuery. The dataset includes 24 tables and records for 2 million synthetic patients from 2008 to 2010.
This dataset follows the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). The purpose of the Common Data Model is to convert various distinctly formatted datasets into a well-known, universal format with a set of standardized vocabularies, as illustrated in the diagram below from the Observational Health Data Sciences and Informatics (OHDSI) webpage.
[Figure: Why-CDM.png, https://redivis.com/fileUploads/d1a95a4e-074a-44d1-92e5-9adfd2f4068a]
Such universal data models ultimately enable researchers to streamline the analysis of observational medical data. For more information regarding the OMOP CDM, refer to the OHDSI OMOP site.
- For documentation regarding the source data format from the Centers for Medicare & Medicaid Services (CMS), refer to the CMS Synthetic Public Use File (https://www.cms.gov/Research-Statistics-Data-and-Systems/Downloadable-Public-Use-Files/SynPUFs/DE_Syn_PUF).
- For information regarding the conversion of the CMS data file to the OMOP CDM v5.2, refer to this OHDSI GitHub page (https://github.com/OHDSI/ETL-CMS).
- For information regarding each of the 24 tables in this dataset, including more detailed variable metadata, see the OHDSI CDM GitHub Wiki page (https://github.com/OHDSI/CommonDataModel/wiki). All variable labels and descriptions as well as table descriptions come from this Wiki page. Note that this GitHub page includes information primarily regarding the 6.0 version of the CDM and that this dataset works with the 5.2 version.
https://www.ibisworld.com/about/termsofuse/
Clinical trial data management (CDM) providers have experienced robust growth in recent years, driven by several key factors. Two major catalysts are an increasing demand for innovative therapies and treatments and the rising prevalence of chronic diseases worldwide. As pharmaceutical companies race to develop new drugs and biologics to address unmet medical needs, the volume and complexity of clinical trials have surged. A jump in clinical trial activity has fueled the need for efficient and reliable data management solutions to handle the vast amounts of data generated throughout the drug development process. At the same time, mounting scrutiny of clinical trial data integrity by regulatory bodies in the US and internationally has prompted pharmaceutical companies to outsource data management to ensure compliance and transparency. In all, revenue has been expanding at a CAGR of 5.9% to an estimated $8.9 billion over the past five years, including expected growth of 2.7% in 2024.
One central trend behind clinical trial data management providers’ growth is the increasingly complex clinical trial landscape. Medical and tech advances have made the clinical trial process more intricate, expanding the volume and variety of data collected during clinical trials and introducing significant challenges for data management. Clinical trial data management companies have taken on an increasingly vital role in addressing these challenges by providing specialized services. Outsourcing data management has been especially crucial for smaller biopharmaceutical companies that depend heavily on successful clinical trials but lack the capital or resources to invest in in-house capabilities. Outsourcing aspects of the research and development stage, including clinical trial data management, will become an increasingly attractive option for downstream pharmaceutical and medical device manufacturers, positioning the industry for growth.
Competition between smaller or mid-sized pharma and the leading multinational manufacturers to bring novel therapies to market will strengthen CDM companies’ role. An approaching patent cliff will also drive demand for clinical trial data management services as revenue declines and heightened competition from generic drugs accelerate clinical trial activity and cost mitigation efforts. Revenue will continue growing, rising at a CAGR of 3.3% over the next five years, reaching an estimated $10.5 billion in 2029.
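The revenue projection above is compound-growth arithmetic, which can be checked with a short sketch. It assumes the estimated $8.9 billion is the 2024 base and that the 3.3% CAGR compounds annually for five years; the function names are illustrative, not taken from the report.

```python
def project(value, cagr, years):
    """Compound a starting value forward at a constant annual growth rate."""
    return value * (1.0 + cagr) ** years


def implied_cagr(start, end, years):
    """Recover the constant annual growth rate linking two values."""
    return (end / start) ** (1.0 / years) - 1.0


# Assumed base: $8.9 billion (2024 estimate), growing 3.3% per year to 2029.
projected_2029 = project(8.9, 0.033, 5)
print(round(projected_2029, 1))  # 10.5, consistent with the stated $10.5 billion
```

The same `implied_cagr` helper can be applied to any pair of start and end figures quoted in a report to confirm that the stated growth rate is consistent with them.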
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These data are modelled using the OMOP Common Data Model v5.3.

Correlated Data Source
NG tube vocabularies

Generation Rules
- The patient’s age should be between 18 and 100 at the moment of the visit.
- Ethnicity data uses 2021 census data in England and Wales (Census in England and Wales 2021).
- Gender is equally distributed between Male and Female (50% each).
- Every person in the record has a link in procedure_occurrence with the concept “Checking the position of nasogastric tube using X-ray”.
- 2% of person records have a link in procedure_occurrence with the concept “Plain chest X-ray”.
- 60% of visit_occurrence records have the visit concept “Inpatient Visit”, while 40% have “Emergency Room Visit”.

Notes
- Version 0
- Generated by a man-made rule/story generator
- Structurally correct; all tables are linked by their relationships
- We used national ethnicity data to generate a realistic distribution (see below)

2011 Race Census figure in England and Wales
Ethnic Group | Population (%)
Asian or Asian British: Bangladeshi | 1.1
Asian or Asian British: Chinese | 0.7
Asian or Asian British: Indian | 3.1
Asian or Asian British: Pakistani | 2.7
Asian or Asian British: any other Asian background | 1.6
Black or African or Caribbean or Black British: African | 2.5
Black or African or Caribbean or Black British: Caribbean | 1
Black or African or Caribbean or Black British: other Black or African or Caribbean background | 0.5
Mixed multiple ethnic groups: White and Asian | 0.8
Mixed multiple ethnic groups: White and Black African | 0.4
Mixed multiple ethnic groups: White and Black Caribbean | 0.9
Mixed multiple ethnic groups: any other Mixed or multiple ethnic background | 0.8
White: English or Welsh or Scottish or Northern Irish or British | 74.4
White: Irish | 0.9
White: Gypsy or Irish Traveller | 0.1
White: any other White background | 6.4
Other ethnic group: any other ethnic group | 1.6
Other ethnic group: Arab | 0.6
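The generation rules above could be implemented along the following lines. This is only a minimal sketch, not the generator actually used for the dataset: the function names and dictionary keys are hypothetical, records are plain dictionaries rather than OMOP tables with concept IDs, and each percentage is applied as a per-record probability, so the shares in a generated cohort are approximate rather than exact.

```python
import random

# Weights copied from the ethnicity figures listed above (percent of population).
ETHNICITY_WEIGHTS = {
    "Asian or Asian British: Bangladeshi": 1.1,
    "Asian or Asian British: Chinese": 0.7,
    "Asian or Asian British: Indian": 3.1,
    "Asian or Asian British: Pakistani": 2.7,
    "Asian or Asian British: any other Asian background": 1.6,
    "Black or African or Caribbean or Black British: African": 2.5,
    "Black or African or Caribbean or Black British: Caribbean": 1.0,
    "Black or African or Caribbean or Black British: other": 0.5,
    "Mixed multiple ethnic groups: White and Asian": 0.8,
    "Mixed multiple ethnic groups: White and Black African": 0.4,
    "Mixed multiple ethnic groups: White and Black Caribbean": 0.9,
    "Mixed multiple ethnic groups: any other": 0.8,
    "White: English or Welsh or Scottish or Northern Irish or British": 74.4,
    "White: Irish": 0.9,
    "White: Gypsy or Irish Traveller": 0.1,
    "White: any other White background": 6.4,
    "Other ethnic group: any other ethnic group": 1.6,
    "Other ethnic group: Arab": 0.6,
}

NG_TUBE_XRAY = "Checking the position of nasogastric tube using X-ray"
CHEST_XRAY = "Plain chest X-ray"


def generate_person(person_id, rng):
    """Build one synthetic person with a single visit and linked procedures."""
    procedures = [NG_TUBE_XRAY]      # rule: every person has this procedure
    if rng.random() < 0.02:          # rule: about 2% also have a chest X-ray
        procedures.append(CHEST_XRAY)
    groups = list(ETHNICITY_WEIGHTS)
    weights = list(ETHNICITY_WEIGHTS.values())
    return {
        "person_id": person_id,
        "age_at_visit": rng.randint(18, 100),           # inclusive bounds
        "gender": rng.choice(["Male", "Female"]),       # 50% each
        "ethnicity": rng.choices(groups, weights=weights)[0],
        "visit_concept": ("Inpatient Visit" if rng.random() < 0.60
                          else "Emergency Room Visit"),  # 60% / 40%
        "procedures": procedures,
    }


def generate_cohort(n, seed=0):
    """Generate n synthetic persons reproducibly from a seed."""
    rng = random.Random(seed)
    return [generate_person(i + 1, rng) for i in range(n)]


cohort = generate_cohort(2000, seed=1)
```

A generator that needs exact proportions (for example, exactly 2% of persons with a chest X-ray) would instead assign the rule to a fixed-size random subset of the cohort.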
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Its population is characterized as Brazilian, Chinese, and Indian companies that presented financial information to external users through the securities markets’ regulatory agencies in Brazil, China, and India and that implemented CDM projects during the 2005–2012 period with “registered” status on the UNFCCC website.
Quantitative data were obtained to test the statistical hypothesis proposed in the study, drawing on information about the companies and CDM projects that made up the sample, as follows: (i) the financial information referring to the equity (E) of companies that have their shares listed in the capital markets of Brazil, China, and India; and (ii) the emission reduction estimates of CDM projects, available from the UNFCCC website.
The financial information that the companies made available via the regulatory bodies of the securities markets in the countries under study was collected through Thomson Reuters Eikon’s Electronic and Financial Database on July 30, 2013. At that time, financial information on the equity (E) of 380 Brazilian companies, 2,584 Chinese companies, and 4,219 Indian companies was obtained for the period under review and converted into euros.
Data on CDM projects with “registered” status on the UNFCCC site, on the other hand, were collected using the Bloomberg Economic and Financial Database on July 29, 2013, at which time a total of 289 projects registered by the Brazilian Designated National Authority (DNA), 3,651 projects registered by the Chinese DNA, and 1,296 projects registered by the Indian DNA were available for analysis for the 2005–2012 period. On November 18, 2004, just one project was registered by the Brazilian DNA, entitled “Brazil NovaGerar Landfill Gas to Energy Project” (UNFCCC, 2014). This project was excluded because it falls outside the limits of the research, set between 2005 and 2012, the first stage of the Kyoto Protocol.
However, new searches had to be carried out directly on the UNFCCC site for supplementary information that was crucial to the research: on the date mentioned above, the Bloomberg Economic and Financial Database did not include full descriptions of the names of the receiving agencies in each country (host party), which were the only link between the CDM project database (Bloomberg) and the financial information database (Thomson Reuters Eikon). These searches were carried out during the October 2013–May 2014 period.
Subsequently, on September 1, 2014, new searches were carried out on the UNFCCC website to update the information referring to CDM projects registered by the agency during the 2005–2012 period.
Thus, this research was carried out based on CDM projects located in the “registered” status section of the UNFCCC site over the 2005–2012 period, the records of which were finalized by the body prior to September 1, 2014, containing 299 projects registered by the DNA of Brazil, 3,682 projects registered by the DNA of China, and 1,371 projects registered by the DNA of India, adding up to 5,353 projects, that is, 74.69% of the total implemented projects in all the developing countries that ratified the Kyoto Protocol.
To measure the fair value of the estimates of project emission reductions approved by the companies that make up the research sample, we obtained the EURIBOR (Euro Interbank Offered Rate) interest rate, as average annual rates, from the Bloomberg Financial and Economic Database on July 29, 2013, and used it to adjust the future flows of economic benefits of the CER estimates to present value.
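Adjusting future flows to present value is a standard discounting calculation. The sketch below shows one common form, assuming end-of-year flows and a single constant annual rate; the study's exact discounting convention is not stated, and the flow amounts and rate used here are hypothetical placeholders rather than values from the study.

```python
def present_value(cash_flows, annual_rate):
    """Discount a list of end-of-year cash flows back to today.

    cash_flows[0] is received one year from now, cash_flows[1] two years
    from now, and so on. annual_rate is a decimal (0.02 means 2%).
    """
    return sum(
        flow / (1.0 + annual_rate) ** year
        for year, flow in enumerate(cash_flows, start=1)
    )


# Hypothetical example: three yearly benefits in euros, discounted at an
# assumed 2% average annual rate standing in for EURIBOR.
hypothetical_flows = [1000.0, 1000.0, 1000.0]
pv = present_value(hypothetical_flows, 0.02)
print(f"Present value: {pv:.2f} EUR")
```

If rates differ by year, the single `annual_rate` would be replaced by a per-year discount factor built from each year's average rate.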
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An example of outpatient visit event logs.
The DOMAIN table includes a list of OMOP-defined Domains the Concepts of the Standardized Vocabularies can belong to. A Domain defines the set of allowable Concepts for the standardized fields in the CDM tables.
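One way to read "a Domain defines the set of allowable Concepts" is as a consistency check: a standardized field such as drug_concept_id should only reference Concepts whose domain_id is 'Drug'. The following sketch runs that check on an in-memory SQLite database. The tables are trimmed to the columns needed for the check, and every row, ID, and name is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain (domain_id TEXT PRIMARY KEY, domain_name TEXT);
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    domain_id TEXT REFERENCES domain(domain_id)
);
CREATE TABLE drug_exposure (
    drug_exposure_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    drug_concept_id INTEGER REFERENCES concept(concept_id)
);
-- Illustrative rows only; the concept IDs and names are made up.
INSERT INTO domain VALUES ('Drug', 'Drug'), ('Condition', 'Condition');
INSERT INTO concept VALUES
    (1, 'Example drug', 'Drug'),
    (2, 'Example condition', 'Condition');
INSERT INTO drug_exposure VALUES (1, 100, 1), (2, 101, 2);
""")

# Find drug_exposure rows whose concept does not belong to the Drug domain.
violations = conn.execute("""
    SELECT de.drug_exposure_id, c.domain_id
    FROM drug_exposure AS de
    JOIN concept AS c ON c.concept_id = de.drug_concept_id
    WHERE c.domain_id <> 'Drug'
    ORDER BY de.drug_exposure_id
""").fetchall()

print(violations)  # [(2, 'Condition')]
```

The same pattern applies to other standardized fields by swapping the table, the concept column, and the expected domain.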
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
thus preserving patient privacy and confidentiality. This dataset contains sample data using the PCORnet Common Data Model for running the regression tests supplied with PopMedNet™.
https://www.archivemarketresearch.com/privacy-policy
The Clinical Data Management (CDM) and Statistical Analysis market is experiencing robust growth, driven by the increasing volume of clinical trial data generated by the burgeoning pharmaceutical and biotechnology industries. The market's complexity is amplified by the stringent regulatory requirements surrounding data integrity and analysis in clinical trials. While precise figures for market size and CAGR are not provided, based on industry reports and observable trends, a reasonable estimation would place the 2025 market size at approximately $15 billion, with a projected Compound Annual Growth Rate (CAGR) of 8% from 2025 to 2033. This growth is fueled by several key factors, including the rising adoption of electronic data capture (EDC) systems, the increasing demand for advanced statistical analysis techniques, and the growing outsourcing of CDM and statistical analysis services by pharmaceutical and biotech companies. This outsourcing trend allows companies to focus on core competencies while leveraging the expertise of specialized service providers. The market also witnesses significant investments in innovative technologies like artificial intelligence (AI) and machine learning (ML) for data processing and analysis, streamlining workflows and improving the efficiency of clinical trials. Despite this positive outlook, the market faces challenges. The high cost of implementing and maintaining advanced CDM systems can be a barrier to entry for smaller companies. Furthermore, the need for highly skilled professionals in biostatistics and data management creates a talent shortage that impacts service delivery and overall market expansion. However, the ongoing technological advancements and the increasing demand for efficient clinical trials are expected to outweigh these restraints, ensuring continued growth in the coming years. 
The market is segmented across various service providers, including large multinational CROs like IQVIA and Charles River Laboratories, as well as specialized smaller firms catering to niche markets. Geographic variations in regulatory landscapes and adoption rates also play a significant role in shaping the market's dynamics.
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
https://www.archivemarketresearch.com/privacy-policy
The Coolant Distribution Manifolds (CDM) market is experiencing robust growth, driven by the increasing demand for efficient thermal management solutions in data centers and industrial applications. The market size in 2025 is estimated at $850 million, exhibiting a Compound Annual Growth Rate (CAGR) of 7.5% from 2025 to 2033. This growth is fueled by several key factors, including the rising adoption of high-density computing infrastructure, the proliferation of edge data centers requiring advanced cooling systems, and stringent regulations regarding energy efficiency. Furthermore, the ongoing trend towards liquid cooling, offering superior heat dissipation capabilities compared to traditional air cooling, significantly contributes to the CDM market's expansion. Key players like Vertiv, Schneider Electric, and Rittal are driving innovation through advanced designs and materials, expanding their product portfolios to cater to diverse customer needs. However, the market faces certain restraints. The high initial investment associated with CDM implementation can hinder adoption in smaller facilities. Furthermore, the complexity of integration and maintenance requires specialized expertise, potentially posing a barrier to entry for certain companies. Despite these challenges, the long-term benefits of enhanced cooling performance and energy savings are expected to outweigh the initial costs, leading to sustained market growth. The segmentation of the market is influenced by factors such as application (data centers, industrial, others), material type (aluminum, copper), and cooling capacity. The North American and European regions are currently leading the market, but significant growth opportunities are emerging in the Asia-Pacific region, driven by rapid technological advancements and increasing infrastructure investments.
The 'Drug' domain captures records about the utilization of a Drug when ingested or otherwise introduced into the body.
This table presents the data extraction from the 99 studies included according to the criteria outlined in the main manuscript. It is provided as supplementary material to enhance the readability of the paper while ensuring that all relevant information is preserved and accessible without loss of detail.
The names of the variables and their descriptions are provided in the attached file, along with the following details:
Variable | Description
Ref. | The citation in the format: First author et al. [Year] (e.g., AuthorA et al. [2022]). This identifies the study's primary citation for easy reference.
Title | The title of the paper.
Standard | The healthcare data standard used in the study. Possible values are: OMOP, OpenEHR, FHIR.
Study Location | The country where the study was conducted.
Objective for using the standard (Detailed) | A comprehensive explanation of the specific objective of using the standard in the study, describing how it supports the study’s goals.
Objective for using the standard (Short) | The primary purpose for applying the healthcare standard. Possible values are: Secondary data reuse, Data exchange, Clinical decision support, Vocabulary definition, EHR system design.
Application domain (Type) | The application domain type that represents the healthcare standard. Possible values are: Clinical (studies with a direct impact on clinical practice, applying established tools or methods in healthcare settings, e.g., predicting in-hospital mortality for heart attack patients) and Research (studies proposing innovative tools, methodologies, or frameworks still in the design/testing phase, not yet clinically implemented).
Healthcare Area | The relevant healthcare domain for the study, such as Cardiovascular, Intensive Care Unit, Emergency Department, Oncology, Biology, etc.
Cluster | The healthcare domain, clustered for easier readability. Possible values include: Clinical Medicine, Clinical Services and Diagnostics, Public Health, Health Information Management and Biomedical Sciences.
Use | Whether the results of the paper serve a Primary use (direct care) or a Secondary use (repurposing existing data or tools for new objectives).
Scale | The scale of the study. Possible values are: Single center (one hospital/clinic), Multi-center (multiple institutions), Regional (specific region), National level (countrywide).
Dataset magnitude in patients | The magnitude of the dataset expressed as a letter code. Possible values are: A (<10 to 99), B (100 to 9,999), C (10,000 to 999,999) and D (1,000,000 and above).
N° Elements | The number of input variables in the standardization process.
Percentage of mapped variables | The percentage of variables successfully standardized.
Coverage of the standard | Whether the standardization methodology was adapted or not.
ETL Tools (Data cleaning & extraction) | The tools adopted for supporting data cleaning and extraction.
Mapping | The tools adopted for the mapping of the variables.
Validation | The tools adopted for the validation of the standardization process.
Database | The database adopted for storing the result of the healthcare data standardization.
Process efficiency and Economic assessment | Information about the economic impact, if the consequences are concrete and measured by the authors (e.g., actual cost savings, resource usage reductions). If the authors did not measure the economic impact, this field remains blank.
Comments by authors (Limitations) | The significant limitations or challenges faced during the study regarding the standard adopted, such as issues with data compatibility, scalability, or the need for customization.
Comments by authors (Advantages) | The benefits of applying the standard model, such as improved data consistency, enhanced clinical outcomes, better interoperability, or more efficient workflows.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison with other studies.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of the datasets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An example log for patient journey.
Cleaned midday and predawn water potential data from 14 stem psychrometers installed within the footprint of the US-CdM flux tower, from May-October 2021. These data were collected in order to link environmental drivers, vegetation responses to seasonal water stress, and ecosystem fluxes of carbon and water at a pinyon-juniper woodland in southeastern Utah. Data include a comma separated values file containing all water potential data used in the published paper. Data were first published in Kannenberg et al. 2023 Agricultural and Forest Meteorology (doi: 10.1016/j.agrformet.2022.109269). Flux data are publicly available from the AmeriFlux portal (doi: 10.17190/AMF/1865477). This data file and the AmeriFlux data are all that is necessary to replicate the analyses in Kannenberg et al. 2023 Agricultural and Forest Meteorology.
2005 draft data standard for the collection of chronic disease management (CDM) clinical information created by the Western Health Information Collaborative. CDM data is used to assess, report and manage chronic diseases to help improve delivery of primary health care to affected individuals.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1,704 global export and import shipment records of Masterc cdm, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
The Person Domain contains records that uniquely identify each patient in the source data who is time at-risk to have clinical observations recorded within the source systems.