26 datasets found
  1. Corporate Data Warehouse (CDW)

    • catalog.data.gov
    • datahub.va.gov
    • +3 more
    Updated Aug 2, 2025
    Cite
    Department of Veterans Affairs (2025). Corporate Data Warehouse (CDW) [Dataset]. https://catalog.data.gov/dataset/corporate-data-warehouse-cdw
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    United States Department of Veterans Affairs (http://va.gov/)
    Description

    The Veterans Health Administration (VHA) is increasingly dependent upon data. Most of its employees generate and use vast amounts of data on a daily basis. To improve our capacity for data analysis while providing the most efficient and the highest quality health care to our Veteran patients, VHA, working with the VA Office of Information and Technology, implemented a health data warehouse. Central to this plan is consolidating data from disparate sources into a coherent single logical data model. The Corporate Data Warehouse (CDW) is the physical implementation of this logical data model at the enterprise level for VHA. Although the CDW initially began to store data as early as 2006, a renewed effort began in 2010 to accelerate CDW's content by including more subject areas from Veterans Health Information Systems and Technology Architecture (VistA) and content from other existing national data systems. CDW supports fully developed subject areas in its production environment as well as supporting rapid prototyping by extracting data directly from source systems with very minor data transformations. The Regional Data Warehouses and the Veterans Integrated Service Network (VISN) Data Warehouses share content from CDW and allow for greater reporting flexibility at the local level throughout the VHA organization.

  2. Cloud Data Warehouse Solutions Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Aug 15, 2025
    Cite
    Data Insights Market (2025). Cloud Data Warehouse Solutions Report [Dataset]. https://www.datainsightsmarket.com/reports/cloud-data-warehouse-solutions-1385894
    Available download formats: doc, pdf, ppt
    Dataset updated
    Aug 15, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Cloud Data Warehouse (CDW) solutions market is experiencing robust growth, driven by the increasing need for scalable, cost-effective, and secure data storage and analytics solutions across various industries. The market's expansion is fueled by several factors, including the proliferation of big data, the rise of cloud computing adoption, and the growing demand for real-time business intelligence. Organizations are migrating from on-premise data warehouses to cloud-based solutions to leverage the benefits of scalability, elasticity, and pay-as-you-go pricing models. This shift is further accelerated by the increasing complexity of data management and the need for advanced analytics capabilities to gain actionable insights from vast datasets. Competition is fierce, with major players like Amazon Redshift, Snowflake, Google Cloud, and Microsoft Azure Synapse leading the market, each offering unique strengths and capabilities. However, the market also witnesses the emergence of niche players catering to specific industry needs or geographical regions. The overall market is segmented based on deployment models (public, private, hybrid), service models (SaaS, PaaS, IaaS), and industry verticals (finance, healthcare, retail, etc.). Future growth will likely be influenced by advancements in technologies such as AI, machine learning, and serverless computing, further enhancing the analytical capabilities of CDW solutions. The projected Compound Annual Growth Rate (CAGR) suggests a substantial increase in market value over the forecast period (2025-2033). Assuming a conservative CAGR of 15% (a reasonable estimate considering the rapid technological advancements in this space), and a 2025 market size of $50 billion (a reasonable estimate based on industry reports), the market is poised for significant expansion. 
This growth will be influenced by factors such as increasing data volumes, advancements in data analytics techniques, and the growing adoption of cloud-based technologies by small and medium-sized businesses (SMBs). Despite the rapid growth, challenges remain, including data security concerns, integration complexities, and vendor lock-in. However, continuous innovation and the development of robust security measures will mitigate these challenges, paving the way for sustained market growth in the coming years.
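    The description's illustrative figures (a 15% CAGR and a $50 billion market in 2025, both presented by the source as estimates) can be turned into a year-by-year projection. The following is a minimal sketch of that compound-growth arithmetic; the function and variable names are hypothetical, and the output is only as reliable as the two assumed inputs.

```python
# Compound-growth projection using the report's own assumed inputs.
# All names here are illustrative; the 15% CAGR and $50B base are the
# source's stated estimates, not verified market data.

BASE_YEAR = 2025
END_YEAR = 2033
BASE_SIZE_BILLION = 50.0   # assumed 2025 market size (USD billions)
CAGR = 0.15                # assumed compound annual growth rate


def project_market_size(base_size: float, cagr: float, years: int) -> float:
    """Return base_size grown at a constant annual rate for `years` years."""
    return base_size * (1.0 + cagr) ** years


# Map each forecast year to its projected market size.
projection = {
    year: project_market_size(BASE_SIZE_BILLION, CAGR, year - BASE_YEAR)
    for year in range(BASE_YEAR, END_YEAR + 1)
}

for year, size in projection.items():
    print(f"{year}: ${size:,.1f} billion")
```

    Under these assumptions the market roughly triples over the eight-year forecast window, since 1.15 raised to the 8th power is about 3.06.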

  3. Veterans Affairs Corporate Data Warehouse

    • atlaslongitudinaldatasets.ac.uk
    url
    Updated Oct 21, 2024
    Cite
    United States Department of Veterans Affairs (VA) (2024). Veterans Affairs Corporate Data Warehouse [Dataset]. https://atlaslongitudinaldatasets.ac.uk/datasets/va-cdw
    Available download formats: url
    Dataset updated
    Oct 21, 2024
    Dataset provided by
    Atlas of Longitudinal Datasets
    Authors
    United States Department of Veterans Affairs (VA)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States of America
    Variables measured
    None
    Measurement technique
    Healthcare records, Secondary data, Registry, None
    Dataset funded by
    National Institutes of Health (NIH)
    Description

    VA CDW is a repository comprising data from multiple Veterans Health Administration (VHA) clinical and administrative systems. VHA is one of the largest integrated healthcare systems in the United States with data from over 20 years of sustained electronic health record (EHR) use. VA CDW was developed in 2006 to accommodate the massive amounts of data being generated and to streamline the process of knowledge discovery to application. The registry consists of approximately 7,500 databases hosted across 86 servers. Information that appears in the VA CDW includes demographic information, information on medication dispensing from VA pharmacies, laboratory test result information, free text from progress notes and radiology reports, as well as billing and claims-related data.

  4. Claims Data Warehouse (CDW)

    • data.wu.ac.at
    Updated Feb 27, 2015
    Cite
    Office of Personnel Management (2015). Claims Data Warehouse (CDW) [Dataset]. https://data.wu.ac.at/odso/data_gov/YmIzYjg4YzMtYWExMy00MmJiLTgwNmMtZjc0NWJhNTljMGEy
    Dataset updated
    Feb 27, 2015
    Dataset provided by
    United States Office of Personnel Management (https://opm.gov/)
    Description

    Database of health care claims from the Federal Employees Health Benefits Program (FEHBP) used for FEHBP audits, investigations, and debarment actions.

  5. Table_1_Long-term stroke and major bleeding risk in patients with...

    • frontiersin.figshare.com
    docx
    Updated Jun 21, 2023
    Cite
    Hancheol Lee; Jung Hwa Hong; Kwon-Duk Seo (2023). Table_1_Long-term stroke and major bleeding risk in patients with non-valvular atrial fibrillation: A comparative analysis between non-vitamin K antagonist oral anticoagulants and warfarin using a clinical data warehouse.DOCX [Dataset]. http://doi.org/10.3389/fneur.2023.1058781.s001
    Available download formats: docx
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Hancheol Lee; Jung Hwa Hong; Kwon-Duk Seo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Non-vitamin K antagonist oral anticoagulants (NOACs) have been the drugs of choice for preventing ischemic stroke in patients with atrial fibrillation (AF) since 2014. Many studies based on claims data revealed that NOACs had a comparable effect to warfarin in preventing ischemic stroke, with fewer hemorrhagic side effects. We analyzed the difference in clinical outcomes according to the drugs in patients with AF based on the clinical data warehouse (CDW). Methods: We extracted data of patients with AF from our hospital's CDW and obtained clinical information including test results. All claims data of the patients were extracted from the National Health Insurance Service, and a dataset was constructed by combining it with the CDW data. Separately, another dataset was constructed with patients for whom sufficient clinical information could be obtained from the CDW. The patients were divided into NOAC and warfarin groups. The occurrence of ischemic stroke, intracranial hemorrhage, gastrointestinal bleeding, and death were confirmed as clinical outcomes. The factors influencing the risk of clinical outcomes were analyzed. Results: The patients who were diagnosed with AF between 2009 and 2020 were included in the dataset construction. In the combined dataset, 858 patients were treated with warfarin and 2,343 patients were treated with NOACs. After the diagnosis of AF, the incidence of ischemic stroke during follow-up was 199 (23.2%) in the warfarin group and 209 (8.9%) in the NOAC group. Intracranial hemorrhage occurred in 70 patients (8.2%) in the warfarin group and 61 (2.6%) in the NOAC group. Gastrointestinal bleeding occurred in 69 patients (8.0%) in the warfarin group and 78 patients (3.3%) in the NOAC group. The NOAC hazard ratio (HR) of ischemic stroke was 0.479 (95% CI 0.39–0.589, p < 0.0001), the HR of intracranial hemorrhage was 0.453 (95% CI 0.31–0.664, p < 0.0001), and the HR of gastrointestinal bleeding was 0.579 (95% CI 0.406–0.824, p = 0.0024). In the dataset constructed using only the CDW, the NOAC group also had a lower risk of ischemic stroke and intracranial hemorrhage than the warfarin group. Conclusions: In this CDW-based study, NOACs are more effective and safer than warfarin in patients with AF even with long-term follow-up. NOACs should be used to prevent ischemic stroke in patients with AF.
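    The percentages quoted in the abstract can be re-derived from the reported event counts and group sizes. The sketch below does only that crude arithmetic; the `Group` class and function names are hypothetical, and the crude proportions it prints are not the hazard ratios reported above, which come from the authors' time-to-event analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """A treatment group with its size and event counts (from the abstract)."""
    name: str
    n: int
    events: dict


# Counts as reported for the combined dataset in the abstract.
warfarin = Group("warfarin", 858, {
    "ischemic stroke": 199,
    "intracranial hemorrhage": 70,
    "gastrointestinal bleeding": 69,
})
noac = Group("NOAC", 2343, {
    "ischemic stroke": 209,
    "intracranial hemorrhage": 61,
    "gastrointestinal bleeding": 78,
})


def crude_incidence(group: Group, outcome: str) -> float:
    """Proportion of the group that experienced the outcome during follow-up."""
    return group.events[outcome] / group.n


def percent(group: Group, outcome: str) -> float:
    """Crude incidence as a percentage rounded to one decimal place."""
    return round(100 * crude_incidence(group, outcome), 1)


for outcome in warfarin.events:
    print(f"{outcome}: warfarin {percent(warfarin, outcome)}%, "
          f"NOAC {percent(noac, outcome)}%")
```

    The recomputed values agree with the percentages in the abstract, which is a useful consistency check when transcribing counts from a listing like this one.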

  6. Type of data integrated into the French CDWs.

    • plos.figshare.com
    xls
    Updated Jul 6, 2023
    Cite
    Matthieu Doutreligne; Adeline Degremont; Pierre-Alain Jachiet; Antoine Lamer; Xavier Tannier (2023). Type of data integrated into the French CDWs. [Dataset]. http://doi.org/10.1371/journal.pdig.0000298.t001
    Available download formats: xls
    Dataset updated
    Jul 6, 2023
    Dataset provided by
    PLOS Digital Health
    Authors
    Matthieu Doutreligne; Adeline Degremont; Pierre-Alain Jachiet; Antoine Lamer; Xavier Tannier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    France, French
    Description

    Real-world data (RWD) holds great promise for improving the quality of care. However, specific infrastructures and methodologies are required to derive robust knowledge and bring innovations to the patient. Drawing upon the national case study of the governance of the 32 French regional and university hospitals, we highlight key aspects of modern clinical data warehouses (CDWs): governance, transparency, types of data, data reuse, technical tools, documentation, and data quality control processes. Semi-structured interviews as well as a review of reported studies on French CDWs were conducted from March to November 2022. Out of 32 regional and university hospitals in France, 14 have a CDW in production, 5 are experimenting, 5 have a prospective CDW project, and 8 did not have any CDW project at the time of writing. The implementation of CDWs in France dates from 2011 and accelerated in late 2020. From this case study, we draw some general guidelines for CDWs. The current orientation of CDWs towards research requires efforts in governance stabilization, standardization of data schemas, and development in data quality and data documentation. Particular attention must be paid to the sustainability of the warehouse teams and to the multilevel governance. The transparency of the studies and of the data transformation tools must improve to allow successful multicentric data reuse as well as innovations in routine care.

  7. Converged Registries Solution (CRS)

    • data.wu.ac.at
    Updated Jul 26, 2017
    Cite
    Department of Veterans Affairs (2017). Converged Registries Solution (CRS) [Dataset]. https://data.wu.ac.at/schema/data_gov/NjBkNGM0M2UtMmE5YS00YWVlLWE0M2ItYTZjZjlmZTExNTFj
    Dataset updated
    Jul 26, 2017
    Dataset provided by
    Department of Veterans Affairs
    Description

    The Converged Registries platform is a hardware and software architecture designed to host individual patient registries and eliminate duplicative development effort while maximizing VA's ability to create new patient registries. The common platform includes a relational database, software classes, security modules, extraction services and other components. The Converged Registries obtains data from the Corporate Data Warehouse (CDW), directly from the Veterans Health Information Systems and Technology Architecture (VistA), and by direct user input. Registries Projects: Embedded Fragment Registry (EFR), Eye Injury Data Store, Traumatic Brain Injury (TBI) Registry, and Veterans Implant Tracking and Alert System (VITAS).

  8. Data sources and definitions for homelessness from the VA medical record.

    • plos.figshare.com
    xls
    Updated Jun 6, 2023
    Cite
    Jack Tsai; Dorota Szymkowiak; Eric Jutkowitz (2023). Data sources and definitions for homelessness from the VA medical record. [Dataset]. http://doi.org/10.1371/journal.pone.0279973.t002
    Available download formats: xls
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jack Tsai; Dorota Szymkowiak; Eric Jutkowitz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sources and definitions for homelessness from the VA medical record.

  9. Short-term PM2.5 exposure and early-readmission risk in Heart Failure...

    • catalog.data.gov
    Updated Nov 15, 2024
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). Short-term PM2.5 exposure and early-readmission risk in Heart Failure Patients [Dataset]. https://catalog.data.gov/dataset/short-term-pm2-5-exposure-and-early-readmission-risk-in-heart-failure-patients
    Dataset updated
    Nov 15, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    In this manuscript EPA researchers used high resolution (1x1 km) modeled air quality data from a model built by Harvard collaborators to estimate the association between short-term exposure to air pollution and the occurrence of 30-day readmissions in a heart failure population. The heart failure population was taken from patients presenting to a University of North Carolina Healthcare System (UNCHCS) affiliated hospital or clinic that reported electronic health records to the Carolina Data Warehouse for Health (CDW-H). A description of the variables used in this analysis is available in the data dictionary (L:/PRIV/EPHD_CRB/Cavin/CARES/Data Dictonaries/HF short term PM25 and readmissions data dictionary.xlsx) associated with this manuscript. Analysis code is available in L:/PRIV/EPHD_CRB/Cavin/CARES/Project Analytic Code/Lauren Wyatt/DailyPM_HF_readmission. This dataset is not publicly accessible because it contains PII in the form of electronic health records. Data can be accessed with an approved IRB.
This dataset is associated with the following publication: Wyatt, L., A. Weaver, J. Moyer, J. Schwartz, Q. Di, D. Diazsanchez, W. Cascio, and C. Ward-Caviness. Short-term PM2.5 exposure and early-readmission risk: A retrospective cohort study in North Carolina Heart Failure Patients. American Heart Journal. Mosby Year Book Incorporated, Orlando, FL, USA, 248: 130-138, (2022).

  10. National Prosthetic Patient Database (NPPD (Prosthetics & Sensory Aids...

    • data.va.gov
    • datahub.va.gov
    • +2 more
    csv, xlsx, xml
    Updated Sep 12, 2019
    Cite
    (2019). National Prosthetic Patient Database (NPPD (Prosthetics & Sensory Aids Service)) [Dataset]. https://www.data.va.gov/dataset/National-Prosthetic-Patient-Database-NPPD-Prosthet/46q6-zer4
    Available download formats: xml, xlsx, csv
    Dataset updated
    Sep 12, 2019
    Description

    The National Prosthetics Patient Database (NPPD) established a central database of Prosthetics data recorded at each Veterans Health Administration facility. Its objective was to enable clinical reviews to increase quality, reduce costs, and improve efficiency of the Prosthetics program. Increase the quality of the services to our Veterans by providing a means to develop consistency in services, review prescription and management practices, develop training, monitor Home Medical Equipment, and measure performance improvements. Reduce costs by comparing costs system-wide, identifying common items for consolidated contracting, identifying costs for Medical Cost Care Funds (MCCF) purposes and improving contracting cost benefit. Improve efficiency by validating the data, improving budget management, determining where coding errors occur, providing training, and comparing unique social security numbers for multiple site usage and item issue. The NPPD Menu provides patient information, patient eligibility, Prosthetic treatment, date of provision, cost, vendor, and purchasing agent information. This system tracks average cost data and its usage and provides on both a monthly and quarterly basis detailed and summary reports by station, Veterans Integrated Service Network (VISN) and agency. The NPPD Menu resides in Veterans Health Information Systems and Technology Architecture (VistA) at the medical center level. This data is updated quarterly. Data is rolled up at each facility and transmitted to Hines. The data is then loaded into the Corporate Data Warehouse (CDW) from which data extracts are done. The data is also put into a ProClarity cube and is available to VA local, regional, and national managers online. National managers have the ability to properly monitor, oversee and manage the national program and regional managers are able to effectively manage their respective areas using this tool. 
The primary purpose of this database is to provide financial and clinical oversight of the Prosthetics program and is used primarily by the Prosthetics and Sensory Aids (PSA) including VISN staff, VISN Prosthetics Representatives, Prosthetics Program Managers and other Prosthetics staff.

  11. Converged Registries Solution (CRS)

    • catalog.data.gov
    • datahub.va.gov
    • +2 more
    Updated Aug 2, 2025
    Cite
    Department of Veterans Affairs (2025). Converged Registries Solution (CRS) [Dataset]. https://catalog.data.gov/dataset/converged-registries-solution-crs
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    United States Department of Veterans Affairs (http://va.gov/)
    Description

    The Converged Registries Solution (CRS) has been replaced by the Veterans Integrated Registries Platform (VIRP). The information contained in this entry discusses the CRS prior to its replacement. The Converged Registries platform was a hardware and software architecture designed to host individual patient registries and eliminate duplicative development effort while maximizing VA's ability to create new patient registries. The common platform included a relational database, software classes, security modules, extraction services and other components. The Converged Registries obtained data from the Corporate Data Warehouse (CDW), directly from the Veterans Health Information Systems and Technology Architecture (VistA), and by direct user input. Registries Projects: Embedded Fragment Registry (EFR), Eye Injury Data Store, Traumatic Brain Injury (TBI) Registry, and Veterans Implant Tracking and Alert System (VITAS).

  12. Data source, codes, and descriptions for defining housing instability in...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Jack Tsai; Dorota Szymkowiak; Eric Jutkowitz (2023). Data source, codes, and descriptions for defining housing instability in Veterans Health Administration medical records. [Dataset]. http://doi.org/10.1371/journal.pone.0279973.t001
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jack Tsai; Dorota Szymkowiak; Eric Jutkowitz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data source, codes, and descriptions for defining housing instability in Veterans Health Administration medical records.

  13. Police Department Reported Victim and Suspect Demographic Data

    • data.sfgov.org
    • catalog.data.gov
    csv, xlsx, xml
    Updated Oct 23, 2025
    Cite
    (2025). Police Department Reported Victim and Suspect Demographic Data [Dataset]. https://data.sfgov.org/d/cd9v-umhr
    Available download formats: xlsx, xml, csv
    Dataset updated
    Oct 23, 2025
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    A. SUMMARY This dataset provides aggregated counts of victims and suspects involved in crimes that fall under San Francisco’s mandated crime reporting categories, as recorded by the San Francisco Police Department (SFPD). The data is sourced from Crime Data Warehouse (CDW), which has been in operation since January 1, 2013.

    Because CDW was implemented on that date, data prior to 2013 is incomplete or unavailable. To protect the privacy and safety of vulnerable individuals, the dataset is aggregated and does not contain any personally identifiable information or individual case records. Crime categories are organized using:

    • San Francisco’s 96A.5 “Quarterly Crime Victim Data Reporting”, legislated for victim demographic reporting (Definitions of crime types can be found in Chapter 96A.1)

    • FBI Uniform Crime Reporting (UCR) system (Definitions can be found on the SFPD website.)

    This dataset also powers the public crime dashboards on the SFPD website, where users can explore summary statistics.

    B. HOW THE DATASET IS CREATED Data is added to open data once a quarter after extraction, transformation, and aggregation.

    Disclaimer: The San Francisco Police Department does not guarantee the accuracy, completeness, timeliness or correct sequencing of the information as the data is subject to change as modifications and updates are completed.

    C. UPDATE PROCESS Information is updated on a quarterly basis.

    D. HOW TO USE THIS DATASET This dataset provides aggregated counts of individuals involved in reported crimes, categorized by key demographics and crime-related attributes. It is used to power public-facing dashboards on the San Francisco Police Department (SFPD) website, where summary statistics and visualizations allow users to explore crime and victimization trends across the city. While the SFPD public dashboard provides many useful summaries and visualizations, not all data details are displayed there. For deeper or custom analysis, the full dataset can be downloaded for personal use.
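    For readers who want to script the download mentioned above, the following is a rough sketch of how an export URL might be assembled. It assumes the portal follows the common Socrata pattern of `/resource/<dataset-id>.<format>` with a `$limit` query parameter, which this listing does not confirm; the dataset identifier `cd9v-umhr` is taken from the citation URL, and the function name is hypothetical. The sketch only builds the URL and performs no network request.

```python
from typing import Optional
from urllib.parse import urlencode


def build_export_url(domain: str, dataset_id: str,
                     fmt: str = "csv", limit: Optional[int] = None) -> str:
    """Assemble a download URL, assuming a Socrata-style resource endpoint.

    The `/resource/<id>.<fmt>` path and the `$limit` parameter are
    assumptions about the portal, not details stated in the listing.
    """
    base = f"https://{domain}/resource/{dataset_id}.{fmt}"
    if limit is None:
        return base
    # Keep the literal "$" in the parameter name instead of percent-encoding it.
    query = urlencode({"$limit": limit}, safe="$")
    return f"{base}?{query}"


# Dataset identifier copied from the citation URL in this entry.
url = build_export_url("data.sfgov.org", "cd9v-umhr", limit=1000)
print(url)
```

    If the portal's actual endpoint differs, only the path template inside `build_export_url` would need to change; the resulting file could then be loaded with any CSV reader for the custom analysis the description mentions.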

  14. Results of multivariable logistic regression.

    • plos.figshare.com
    xls
    Updated Aug 1, 2024
    Cite
    Hiromi Matsumoto; Taichi Fukushima; Nobuaki Kobayashi; Yuuki Higashino; Suguru Muraoka; Yukiko Ohtsu; Momo Hirata; Kohei Somekawa; Ayami Kaneko; Ryo Nagasawa; Sousuke Kubo; Katsushi Tanaka; Kota Murohashi; Hiroaki Fujii; Keisuke Watanabe; Nobuyuki Horita; Yu Hara; Takeshi Kaneko (2024). Results of multivariable logistic regression. [Dataset]. http://doi.org/10.1371/journal.pone.0299760.t003
    Available download formats: xls
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Hiromi Matsumoto; Taichi Fukushima; Nobuaki Kobayashi; Yuuki Higashino; Suguru Muraoka; Yukiko Ohtsu; Momo Hirata; Kohei Somekawa; Ayami Kaneko; Ryo Nagasawa; Sousuke Kubo; Katsushi Tanaka; Kota Murohashi; Hiroaki Fujii; Keisuke Watanabe; Nobuyuki Horita; Yu Hara; Takeshi Kaneko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Immune checkpoint inhibitors (ICIs) have improved outcomes in cancer treatment but are also associated with adverse events and financial burdens. Identifying accurate biomarkers is crucial for determining which patients are likely to benefit from ICIs. Current markers, such as PD-L1 expression and tumor mutation burden, exhibit limited predictive accuracy. This study utilizes a Clinical Data Warehouse (CDW) to explore the prognostic significance of novel blood-based factors, such as the neutrophil-to-lymphocyte ratio and red cell distribution width (RDW), to enhance the prediction of ICI therapy benefit. Methods: This retrospective study utilized an exploratory cohort from the CDW that included a variety of cancers to explore factors associated with pembrolizumab treatment duration, validated in a non-small cell lung cancer (NSCLC) patient cohort from electronic medical records (EMR) and CDW. The CDW contained anonymized data on demographics, diagnoses, medications, and tests for cancer patients treated with ICIs between 2017–2022. Logistic regression identified factors predicting ≤2 or ≥5 pembrolizumab doses as proxies for progression-free survival (PFS), and Receiver Operating Characteristic analysis was used to examine their predictive ability. These factors were validated by correlating doses with PFS in the EMR cohort and re-testing their significance in the CDW cohort with other ICIs. This dual approach utilized the CDW for discovery and EMR/CDW cohorts for validating prognostic biomarkers before ICI treatment. Results: A total of 609 cases (428 in the exploratory cohort and 181 in the validation cohort) from CDW and 44 cases from EMR were selected for study. CDW analysis revealed that elevated red cell distribution width (RDW) correlated with receiving ≤2 pembrolizumab doses (p = 0.0008), with an AUC of 0.60 for predicting treatment duration. RDW’s correlation with PFS (r = 0.80, p

  15. Patient characteristics for validation part 1.

    • plos.figshare.com
    xls
    Updated Aug 1, 2024
    Cite
    Hiromi Matsumoto; Taichi Fukushima; Nobuaki Kobayashi; Yuuki Higashino; Suguru Muraoka; Yukiko Ohtsu; Momo Hirata; Kohei Somekawa; Ayami Kaneko; Ryo Nagasawa; Sousuke Kubo; Katsushi Tanaka; Kota Murohashi; Hiroaki Fujii; Keisuke Watanabe; Nobuyuki Horita; Yu Hara; Takeshi Kaneko (2024). Patient characteristics for validation part 1. [Dataset]. http://doi.org/10.1371/journal.pone.0299760.t005
    Available download formats: xls
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Hiromi Matsumoto; Taichi Fukushima; Nobuaki Kobayashi; Yuuki Higashino; Suguru Muraoka; Yukiko Ohtsu; Momo Hirata; Kohei Somekawa; Ayami Kaneko; Ryo Nagasawa; Sousuke Kubo; Katsushi Tanaka; Kota Murohashi; Hiroaki Fujii; Keisuke Watanabe; Nobuyuki Horita; Yu Hara; Takeshi Kaneko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Immune checkpoint inhibitors (ICIs) have improved outcomes in cancer treatment but are also associated with adverse events and financial burdens. Identifying accurate biomarkers is crucial for determining which patients are likely to benefit from ICIs. Current markers, such as PD-L1 expression and tumor mutation burden, exhibit limited predictive accuracy. This study utilizes a Clinical Data Warehouse (CDW) to explore the prognostic significance of novel blood-based factors, such as the neutrophil-to-lymphocyte ratio and red cell distribution width (RDW), to enhance the prediction of ICI therapy benefit. Methods: This retrospective study utilized an exploratory cohort from the CDW that included a variety of cancers to explore factors associated with pembrolizumab treatment duration, validated in a non-small cell lung cancer (NSCLC) patient cohort from electronic medical records (EMR) and CDW. The CDW contained anonymized data on demographics, diagnoses, medications, and tests for cancer patients treated with ICIs between 2017–2022. Logistic regression identified factors predicting ≤2 or ≥5 pembrolizumab doses as proxies for progression-free survival (PFS), and Receiver Operating Characteristic analysis was used to examine their predictive ability. These factors were validated by correlating doses with PFS in the EMR cohort and re-testing their significance in the CDW cohort with other ICIs. This dual approach utilized the CDW for discovery and EMR/CDW cohorts for validating prognostic biomarkers before ICI treatment. Results: A total of 609 cases (428 in the exploratory cohort and 181 in the validation cohort) from CDW and 44 cases from EMR were selected for study. CDW analysis revealed that elevated red cell distribution width (RDW) correlated with receiving ≤2 pembrolizumab doses (p = 0.0008), with an AUC of 0.60 for predicting treatment duration. RDW’s correlation with PFS (r = 0.80, p

  16. Geographic variability in use of different indicators of homelessness.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Jack Tsai; Dorota Szymkowiak; Eric Jutkowitz (2023). Geographic variability in use of different indicators of homelessness. [Dataset]. http://doi.org/10.1371/journal.pone.0279973.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jack Tsai; Dorota Szymkowiak; Eric Jutkowitz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Geographic variability in use of different indicators of homelessness.

  17. Police_Department_Reported

    • kaggle.com
    zip
    Updated Nov 8, 2025
    Cite
    willian oliveira (2025). Police_Department_Reported [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/police-department-reported
    Explore at:
    Available download formats: zip (2312966 bytes)
    Dataset updated
    Nov 8, 2025
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    These graphs were created in Tableau, Power BI, and Looker Studio:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F09fbaa9a70092cb0ddd88f33d177dad7%2Fgraph1.png?generation=1762638206213004&alt=media
    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F4be9c5e8b4949843a4937e682c075329%2Fgraph2.jpg?generation=1762638212245517&alt=media
    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F86e095199b8ac2b1d77b35424a2c73cb%2Fgraph3.jpg?generation=1762638218785165&alt=media

    This dataset contains summarized information about victims and suspects involved in crimes reported by the San Francisco Police Department (SFPD). The information comes from the Crime Data Warehouse (CDW), which has been operating since January 1, 2013. Because the system started that year, older data may be missing or incomplete. To protect privacy, the dataset does not include personal details or individual case records—everything is aggregated. Crimes are categorized according to two official systems: San Francisco’s local law 96A.5, which defines how demographic information about victims must be reported, and the FBI’s Uniform Crime Reporting (UCR) system, which provides national crime definitions. The data is processed and published every quarter. After crimes are recorded, the information goes through cleaning, transformation, and aggregation steps before being added to the public dataset. The SFPD notes that updates and corrections may change the data over time, so it cannot guarantee complete accuracy or timeliness. The dataset is updated quarterly and is mainly used to support public dashboards on the SFPD website. These dashboards show visual summaries and trends about crime and victimization across San Francisco. For anyone who wants to perform a deeper or customized analysis, the full dataset can be downloaded and explored independently.

  18. Baseline characteristics of stone formers with and without a 24-hour urine...

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 20, 2023
    Cite
    Calyani Ganesan; I-Chun Thomas; Shen Song; Andrew J. Sun; Ericka M. Sohlberg; Manjula Kurella Tamura; Glenn M. Chertow; Joseph C. Liao; Simon Conti; Christopher S. Elliott; John T. Leppert; Alan C. Pao (2023). Baseline characteristics of stone formers with and without a 24-hour urine collection in the top and bottom decile of VHA facilities that administer 24-hour urine testing. [Dataset]. http://doi.org/10.1371/journal.pone.0220768.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 20, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Calyani Ganesan; I-Chun Thomas; Shen Song; Andrew J. Sun; Ericka M. Sohlberg; Manjula Kurella Tamura; Glenn M. Chertow; Joseph C. Liao; Simon Conti; Christopher S. Elliott; John T. Leppert; Alan C. Pao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Baseline characteristics of stone formers with and without a 24-hour urine collection in the top and bottom decile of VHA facilities that administer 24-hour urine testing.

  19. Multivariable logistic regression reporting the odds of completing a 24-hour...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 1, 2023
    Cite
    Calyani Ganesan; I-Chun Thomas; Shen Song; Andrew J. Sun; Ericka M. Sohlberg; Manjula Kurella Tamura; Glenn M. Chertow; Joseph C. Liao; Simon Conti; Christopher S. Elliott; John T. Leppert; Alan C. Pao (2023). Multivariable logistic regression reporting the odds of completing a 24-hour urine collection. [Dataset]. http://doi.org/10.1371/journal.pone.0220768.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Calyani Ganesan; I-Chun Thomas; Shen Song; Andrew J. Sun; Ericka M. Sohlberg; Manjula Kurella Tamura; Glenn M. Chertow; Joseph C. Liao; Simon Conti; Christopher S. Elliott; John T. Leppert; Alan C. Pao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multivariable logistic regression reporting the odds of completing a 24-hour urine collection.

  20. Baseline characteristics of stone formers with and without a 24-hour urine...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Calyani Ganesan; I-Chun Thomas; Shen Song; Andrew J. Sun; Ericka M. Sohlberg; Manjula Kurella Tamura; Glenn M. Chertow; Joseph C. Liao; Simon Conti; Christopher S. Elliott; John T. Leppert; Alan C. Pao (2023). Baseline characteristics of stone formers with and without a 24-hour urine collection. [Dataset]. http://doi.org/10.1371/journal.pone.0220768.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Calyani Ganesan; I-Chun Thomas; Shen Song; Andrew J. Sun; Ericka M. Sohlberg; Manjula Kurella Tamura; Glenn M. Chertow; Joseph C. Liao; Simon Conti; Christopher S. Elliott; John T. Leppert; Alan C. Pao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Baseline characteristics of stone formers with and without a 24-hour urine collection.

Cite
Department of Veterans Affairs (2025). Corporate Data Warehouse (CDW) [Dataset]. https://catalog.data.gov/dataset/corporate-data-warehouse-cdw
Corporate Data Warehouse (CDW)

Explore at:
10 scholarly articles cite this dataset (View in Google Scholar)
