The National Center for Advancing Translational Sciences (NCATS) has systematically compiled clinical, laboratory and diagnostic data from electronic health records to support COVID-19 research efforts via the National COVID Cohort Collaborative (N3C) Data Enclave. As of August 2, 2022, the repository contains information from over 15 million patients (including 5.8 million COVID-19 positive patients) across the United States.
The N3C Data Enclave is organized into 3 levels of data with varying access restrictions:
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The N3C Data Enclave is a secure platform through which harmonized clinical data provided by our contributing members are stored. The Enclave includes demographic and clinical characteristics of patients who have been tested for or diagnosed with COVID-19, and further information about the strategies and outcomes of treatments for those suspected or confirmed to have the virus. Additional data from individuals infected with pathogens such as SARS 1, MERS, and H1N1 are also included to support comparative studies. Data can be accessed only within the N3C Data Enclave and cannot be downloaded or removed. Three tiers of access are available for users depending on the scope and nature of their research; however, all will require verification and approval by the Data Access Committee (DAC) before data can be accessed.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
OMOP2OBO Mappings - N3C OMOP to OBO Working group
This repository stores OMOP2OBO mappings that have been processed and specifically formatted for use within the National COVID Cohort Collaborative (N3C) Enclave.
N3C OMOP to OBO Working Group: https://covid.cd2h.org/ontology
Accessing the N3C-Formatted Mappings
You can access the three OMOP2OBO HPO mapping files in the Enclave from the Knowledge store using the following link: https://unite.nih.gov/workspace/compass/view/ri.compass.main.folder.1719efcf-9a87-484f-9a67-be6a29598567.
The mapping set includes three files, but you only need to merge the following two files with existing data in the Enclave in order to be able to create the concept sets:
The first file, OMOP2OBO_v1.0.0_N3C_Enclave_CSV_concept_set_expression_items.csv, contains columns for the OMOP concept ids and codes and specifies details such as whether the OMOP concept’s descendants should be included when deriving the concept sets (defaults to FALSE). The other file, OMOP2OBO_v1.0.0_N3C_Enclave_CSV_concept_set_version.csv, contains details on the mapping’s label (i.e., the HPO CURIE and label in the concept_set_id field) and its provenance/evidence (stored in the column called intention).
Creating Concept Sets
Merge these files on the column named codeset_id, then join the result with existing Enclave tables such as concept and condition_occurrence to populate the actual concept sets. The name of each concept set comes from the OMOP2OBO_v1.0.0_N3C_Enclave_CSV_concept_set_version.csv file, where it is stored as a string in the column called concept_set_id. Extracting the HPO CURIE and label requires applying a regex to this column; this is not ideal, but it is currently the best approach given the fields available in the Enclave.
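The merge on codeset_id can be sketched in plain Python as follows. This is a minimal illustration, not the working group's own script: the two in-memory CSV strings are hypothetical stand-ins for the real files, their header names are assumed to match the column names used in this document, and the function name merge_on_codeset_id is invented for the example.

```python
import csv
import io

# Hypothetical one-row stand-ins for the two real CSV files.
# Header names are assumed to match the column names described in this README.
ITEMS_CSV = """codeset_id,concept,code,codeSystem,includeDescendants
900000000,23868,69771008,SNOMED,False
"""
VERSION_CSV = """codeset_id,concept_set_id,intention
900000000,[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology,Mixed
"""


def merge_on_codeset_id(items_file, version_file):
    """Inner-join expression items with concept set versions on codeset_id."""
    # Index the version rows by codeset_id so each item can look up its set.
    versions = {row["codeset_id"]: row for row in csv.DictReader(version_file)}
    merged = []
    for item in csv.DictReader(items_file):
        version = versions.get(item["codeset_id"])
        if version is not None:  # drop items without a matching version row
            merged.append({**version, **item})
    return merged


rows = merge_on_codeset_id(io.StringIO(ITEMS_CSV), io.StringIO(VERSION_CSV))
for row in rows:
    print(row["codeset_id"], row["concept_set_id"], row["concept"])
```

The subsequent joins against the Enclave's concept and condition_occurrence tables would be done with the Enclave's own tooling and are not shown here.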
An example mapping is shown below (highlighting some of the most useful columns):
codeset_id: 900000000
concept_set_id: [OMOP2OBO] hp_0002031-abnormal_esophagus_morphology
concept: 23868
code: 69771008
codeSystem: SNOMED
includeDescendants: False
intention:
Mixed - This mapping was created using the OMOP2OBO mapping algorithm (https://github.com/callahantiff/OMOP2OBO).
The Mapping Category and Evidence supporting the mappings are provided below, by OMOP concept:
23868
*******
Mapping Category: Automatic Exact - Concept
------------------------------------------------
Mapping Provenance
------------------
OBO_DbXref-OMOP_ANCESTOR_SOURCE_CODE:snomed_69771008 | OBO_DbXref-OMOP_CONCEPT_SOURCE_CODE:snomed_69771008 | CONCEPT_SIMILARITY:HP_0002031_0.713
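A regex for pulling the HPO CURIE and label out of concept_set_id might look like the sketch below. The pattern is inferred from the single example above and assumes every value follows the form "[OMOP2OBO] hp_<digits>-<label>"; the function names are hypothetical, and splitting the provenance string on "|" is likewise an assumption based on the example line.

```python
import re

# Assumed layout, inferred from the one example in this README:
#   "[OMOP2OBO] hp_<digits>-<label_with_underscores>"
CONCEPT_SET_ID_PATTERN = re.compile(
    r"^\[OMOP2OBO\]\s+(hp)_(\d+)-(.+)$", re.IGNORECASE
)


def parse_concept_set_id(concept_set_id):
    """Return (HPO CURIE, label), or None if the string does not match."""
    match = CONCEPT_SET_ID_PATTERN.match(concept_set_id.strip())
    if match is None:
        return None
    prefix, local_id, label = match.groups()
    return f"{prefix.upper()}:{local_id}", label.replace("_", " ")


def parse_provenance(provenance):
    """Split a pipe-delimited provenance string into its evidence items."""
    return [part.strip() for part in provenance.split("|") if part.strip()]


curie, label = parse_concept_set_id(
    "[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology"
)
print(curie, label)
```

Returning None for non-matching strings, rather than raising, makes it easier to spot and review concept set names that do not follow the assumed layout.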
Release Notes - v1.0.0
Preparation
In order to import data into the Enclave, the following items are needed:
Data
Script
Generated Output
The codeset_id must be self-generated (ideally from a conserved range) before beginning any of the API steps. The current list of assigned identifiers is stored in the file named omop2obo_enclave_codeset_id_dict_v1.0.0.json.
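One way to self-generate codeset_id values from a fixed range is sketched below. The layout of omop2obo_enclave_codeset_id_dict_v1.0.0.json is not described here, so the name-to-identifier dictionary is an assumption, as are the function name assign_codeset_ids and the starting value, which is simply taken from the example mapping's codeset_id of 900000000.

```python
import json

# Assumption: the range starts at the codeset_id shown in the example mapping.
RANGE_START = 900_000_000


def assign_codeset_ids(concept_set_names, existing=None, start=RANGE_START):
    """Give each concept set name a stable id, reusing existing assignments."""
    assigned = dict(existing or {})
    # Continue after the highest id already in use, or begin at `start`.
    next_id = max(assigned.values(), default=start - 1) + 1
    for name in concept_set_names:
        if name not in assigned:
            assigned[name] = next_id
            next_id += 1
    return assigned


ids = assign_codeset_ids(["[OMOP2OBO] hp_0002031-abnormal_esophagus_morphology"])
# Serialize as a name -> id dictionary (hypothetical layout for the JSON file).
serialized = json.dumps(ids, indent=2)
print(serialized)
```

Passing a previously saved dictionary as existing keeps earlier identifiers unchanged when new mappings are added.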
To be consistent with OMOP tools, specifically Atlas, we have also created Atlas-formatted JSON files for each mapping, which are stored in the zipped directory named atlas_json_files_v1.0.0.zip.
File 1: concept_set_container
File 2: concept_set_expression_items
File 3: concept_set_version
Generated Output:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Baseline characteristics of all patients in N3C receiving 2 doses of mRNA vaccine between January 1, 2021, and September 21, 2021.
Objectives: Although the World Health Organization (WHO) Clinical Progression Scale for COVID-19 is useful in prospective clinical trials, it cannot be effectively used with retrospective Electronic Health Record (EHR) datasets. Modifying the existing WHO Clinical Progression Scale, we developed an ordinal severity scale (OS) and assessed its usefulness in the analyses of COVID-19 patient outcomes using retrospective EHR data. Results: The data set used in this analysis consists of 2,880,456 patients. PCA of the day-to-day variation in OS levels over the totality of the 28-day period revealed contrasting patterns of variation in disease severity within the first and second 14 days and illustrated the importance of evaluation over the full 28-day period. Discussion: An OS with well-defined, robust features, based on discrete EHR data elements, is useful for assessments of COVID-19 patient outcomes, providing insights on progression of COVID-19 disease severity over time. Conclusion: The ...
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Supplementary data for N3C MACE study
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Representative examples of false negatives for positive mentions of “fever” in N3C COVID corpus, “diarrhea” in UMN PASC corpus and “chest pain” in N3C corpus as returned by BioMedICUS and both LLMs along with explanations.
https://www.nist.gov/open/copyright-fair-use-and-licensing-statements-srd-data-software-and-technical-series-publications#SRD
This page, "anti-N3C(O)N", is part of the NIST Chemistry WebBook. This site and its contents are part of the NIST Standard Reference Data Program.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The main entity of this document is a taxonomy with accession number 1776758
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to xn--n3c.com (Domain). Get insights into ownership history and changes over time.
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to xn--reisercktritts-versicherung-n3c.com (Domain). Get insights into ownership history and changes over time.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Background: Patient symptoms, crucial for disease progression and diagnosis, are often captured in unstructured clinical notes. Large language models (LLMs) offer potential advantages in extracting patient symptoms compared to traditional rule-based information extraction (IE) systems. Methods: This study compared fine-tuned LLMs (LLaMA2-13B and LLaMA3-8B) against BioMedICUS, a rule-based IE system, for extracting symptoms related to acute and post-acute sequelae of SARS-CoV-2 from clinical notes. The study utilized three corpora: UMN-COVID, UMN-PASC, and N3C-COVID. Prevalence, keyword and fairness analyses were conducted to assess symptom distribution and model equity across demographics. Results: BioMedICUS outperformed fine-tuned LLMs in most cases. On the UMN PASC dataset, BioMedICUS achieved a macro-averaged F1-score of 0.70 for positive mention detection, compared to 0.66 for LLaMA2-13B and 0.62 for LLaMA3-8B. For the N3C COVID dataset, BioMedICUS scored 0.75, while LLaMA2-13B and LLaMA3-8B scored 0.53 and 0.68, respectively for positive mention detection. However, LLMs performed better in specific instances, such as detecting positive mentions of change in sleep in the UMN PASC dataset, where LLaMA2-13B (0.79) and LLaMA3-8B (0.65) outperformed BioMedICUS (0.60). For fairness analysis, BioMedICUS generally showed stronger performance across patient demographics. Keyword analysis using ANOVA on symptom distributions across all three corpora showed that both corpus (df = 2, p
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to xn--warnemnde-zimmervermittlung-n3c.info (Domain). Get insights into ownership history and changes over time.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Macro-averaged metrics with 95% confidence intervals for evaluation of BioMedICUS, LLaMA2-13B, and LLaMA3-8B extraction performance in positive and negative symptom mentions for N3C COVID.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
SARS-CoV-2 infection has been associated with increased autoimmune disease risk. However, past studies have not agreed on the most prevalent autoimmune diseases after infection. Furthermore, the relationship between infection severity and new autoimmune disease risk has not been well examined. We used RECOVER’s electronic health record (EHR) networks, N3C, PCORnet, and PEDSnet, to estimate types and frequency of autoimmune diseases arising after SARS-CoV-2 infection and assessed how infection severity related to autoimmune disease risk. We identified patients of any age with SARS-CoV-2 infection between April 1, 2020 and April 1, 2021, and assigned them to a World Health Organization COVID-19 severity category for adults or the PEDSnet acute COVID-19 illness severity classification system for children (30 days after SARS-CoV-2 infection index date and occurring ≥1 day apart. We calculated overall and infection severity-stratified incidence rates per 1000 person-years for all autoimmune diseases. With least severe COVID-19 severity as reference, survival analyses examined incident autoimmune disease risk. The most common new-onset autoimmune diseases in all networks were thyroid disease, psoriasis/psoriatic arthritis, and inflammatory bowel disease. Among adults, inflammatory arthritis was the most common, and Sjögren’s disease also had high incidence. Incident type 1 diabetes and hematological autoimmune diseases were specifically found in children. Across networks, after adjustment, patients with highest COVID-19 severity had highest risk for new autoimmune disease vs. those with least severe disease [N3C: adjusted hazard ratio (aHR) 1.47 (95% CI 1.33–1.66); PCORnet: aHR 1.14 (95% CI 1.02–1.26); PEDSnet: aHR 3.14 (95% CI 2.42–4.07)]. Overall, severe acute COVID-19 was most strongly associated with autoimmune disease risk in three EHR networks.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Amid the ongoing global repercussions of SARS-CoV-2, it is crucial to comprehend its potential long-term psychiatric effects. Several recent studies have suggested a link between COVID-19 and subsequent mental health disorders. Our investigation joins this exploration, concentrating on Schizophrenia Spectrum and Psychotic Disorders (SSPD). Unlike other studies, we took acute respiratory distress syndrome (ARDS) and COVID-19 lab-negative cohorts as control groups to accurately gauge the impact of COVID-19 on SSPD. Data from 19,344,698 patients, sourced from the N3C Data Enclave platform, were methodically filtered to create propensity matched cohorts: ARDS (n = 222,337), COVID-19 positive (n = 219,264), and COVID-19 negative (n = 213,183). We systematically analyzed the hazard rate of new-onset SSPD across three distinct time intervals: 0-21 days, 22-90 days, and beyond 90 days post-infection. COVID-19 positive patients consistently exhibited a heightened hazard ratio (HR) across all intervals [0-21 days (HR: 4.6; CI: 3.7-5.7), 22-90 days (HR: 2.9; CI: 2.3-3.8), beyond 90 days (HR: 1.7; CI: 1.5-1.)]. These hazards are notably higher than those for both ARDS and COVID-19 lab-negative patients. Validations using various tests, including the Cochran Mantel Haenszel Test, Wald Test, and Log-rank Test, confirmed these associations. Intriguingly, our data indicated that younger individuals face a heightened risk of SSPD after contracting COVID-19, a trend not observed in the ARDS and COVID-19 negative groups. These results, aligned with the known neurotropism of SARS-CoV-2 and earlier studies, accentuate the need for vigilant psychiatric assessment and support in the era of Long-COVID, especially among younger populations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Concepts categorization by accuracy level.
https://www.icpsr.umich.edu/web/ICPSR/studies/39023/terms
This study explores whether perinatal telehealth uptake has mitigated the pandemic's effects on disparities in maternal care access, quality, and outcomes by race, ethnicity, and rural or urban residence. Research to date has approached this question in several ways. First, researchers have utilized census data to assess whether community-wide broadband infrastructure exists to support the use of telehealth services in areas with high travel times to maternal care units. Findings suggest that socioeconomically disadvantaged communities face significant barriers to maternity care access, both with substantial travel burdens and inadequate digital access to facilitate telehealth services. Second, to examine maternal care quality, researchers have employed South Carolina hospital-based claims data and vital statistics to identify racial, ethnic, and urban/rural disparities in rates of cesarean delivery before and during the COVID-19 pandemic period. Results indicate that cesarean rates differed by rural vs. urban facility locations and racial and ethnic groups but observed disparities were not significantly exacerbated by the pandemic. Third, using South Carolina hospital-based claims data and COVID-19 testing data, researchers found significant racial, ethnic, and rural disparities in postpartum readmissions involving mental health and substance use disorders from childbirth discharge through one year postpartum during the COVID-19 pandemic. Finally, drawing on data from the National COVID Cohort Collaborative (N3C), research has shown that hybrid care increased substantially during the COVID-19 public health emergency, but pregnant people living in rural areas had lower levels of hybrid care than urban people, and individuals who belonged to racial and ethnic minority groups were more likely to have hybrid care than White individuals. 
Future research will investigate the impact of the COVID-19 pandemic and perinatal telehealth uptake on additional maternity care and birth outcomes by race, ethnicity, and urbanicity. The study also aims to assess how state-level telehealth policies relate to perinatal telehealth uptake by race, ethnicity, and urbanicity, and to develop a model to predict long-term changes in maternal care access, quality, outcomes, and expenditures, with and without state telehealth policies. The ICPSR provides variable-level metadata for the data associated with this study. The actual data may only be available from the Principal Investigator directly. The variable descriptions available through ICPSR also include information regarding the source of each variable listed, as does the Data Source field of these metadata.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Confusion matrix of the accuracy rating for the performance of the GA algorithm.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Macro-averaged metrics for evaluation of BioMedICUS, LLaMA2-13B, and LLaMA3-8B equity for race and gender in positive (+) and negative (-) symptom mentions for UMN PASC.