Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: This dataset is no longer being updated due to the end of the COVID-19 Public Health Emergency.
The California Department of Public Health (CDPH) is identifying the prevalence of circulating SARS-CoV-2 variants by analyzing CDPH Genomic Surveillance Data and CalREDIE, CDPH's communicable disease reporting and surveillance system. Viruses mutate into new strains or variants over time. Some variants emerge and then disappear. Other variants become common and circulate for a long time. Several specialized laboratories statewide sequence the genomes of a fraction of all positive COVID-19 tests to determine which variants are circulating. Sequencing and reporting of variant results take several days after a test is identified as positive for COVID-19. Not all viruses from positive COVID-19 tests are sequenced. Knowing which variants are circulating in California informs public health and clinical action.
Note: There is a natural reporting lag in these data because of the time required to complete whole genome sequencing; therefore, a 14-day lag is applied to these datasets to allow for data completeness. Data from the most recent dates should be used with caution.
For more information, please see: https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/COVID-19/COVID-Variants.aspx
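The 14-day lag described in the note above can be illustrated with a minimal sketch. It assumes each record carries a `date` field; the field names, the `split_by_lag` helper, and the sample rows are hypothetical and are not taken from the CDPH dataset schema.

```python
from datetime import date, timedelta

LAG_DAYS = 14  # lag stated in the dataset note


def split_by_lag(rows, as_of, lag_days=LAG_DAYS):
    """Split rows into (complete, provisional) around the lag cutoff.

    Rows dated on or before `as_of - lag_days` are treated as complete;
    later rows are treated as provisional.
    """
    cutoff = as_of - timedelta(days=lag_days)
    complete = [r for r in rows if r["date"] <= cutoff]
    provisional = [r for r in rows if r["date"] > cutoff]
    return complete, provisional


# Made-up rows, for illustration only (not real surveillance values).
rows = [
    {"date": date(2023, 3, 1), "variant_name": "Example A", "percentage": 60.0},
    {"date": date(2023, 3, 20), "variant_name": "Example A", "percentage": 65.0},
]
complete, provisional = split_by_lag(rows, as_of=date(2023, 3, 25))
```

With an as-of date of 2023-03-25 the cutoff is 2023-03-11, so the first row counts as complete and the second as provisional.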
https://creativecommons.org/publicdomain/zero/1.0/
The California Department of Public Health (CDPH) is identifying the prevalence of circulating SARS-CoV-2 variants by analyzing CDPH Genomic Surveillance Data and CalREDIE, CDPH's communicable disease reporting and surveillance system. Viruses mutate into new strains or variants over time. Some variants emerge and then disappear. Other variants become common and circulate for a long time. Several specialized laboratories statewide sequence the genomes of a fraction of all positive COVID-19 tests to determine which variants are circulating. Sequencing and reporting of variant results take several days after a test is identified as positive for COVID-19. Not all viruses from positive COVID-19 tests are sequenced. Knowing which variants are circulating in California informs public health and clinical action.
This dataset includes three tables with the model-based projections and estimates as shown on CalCAT in 2025 (http://calcat.cdph.ca.gov) for California state, regions, and counties.
(1) COVID-19 Nowcasts includes the R-effective estimates for COVID-19 from the different models available for the past 80 days from the archive date and the median ensemble thereof.
(2) CalCAT Forecasts includes hospital census and admissions forecasts for COVID-19 and Influenza, and the corresponding ensemble metrics for a 4-week horizon from the archive date.
(3) Variant Proportion Nowcasts contains the Integrated Genomic Epidemiology Dataset (IGED)-based and Terra-based estimates of COVID-19 variants circulating over the past 3 months as well as model-based predictions for the proportions of the variants of concern for dates leading up to the archive date. Prediction intervals are included when available.
This dataset provides CalCAT users with programmatic access to the downloadable datasets on CalCAT.
This dataset also includes a zipped file with the historical archives of the COVID-19 Nowcasts, CalCAT Forecasts and Variant Proportion Nowcasts through 2023.
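As an illustration of the "median ensemble" mentioned for the COVID-19 Nowcasts table, the sketch below takes the per-date median of R-effective estimates across models. The tuple layout, the model names, and the values are hypothetical; CalCAT's actual ensemble procedure (for example, how regions or missing models are handled) is not described in the source and may differ.

```python
from collections import defaultdict
from statistics import median


def median_ensemble(estimates):
    """Return {date: median R-effective across models}.

    `estimates` is an iterable of (date_str, model_name, r_eff) tuples.
    """
    by_date = defaultdict(list)
    for day, _model, r_eff in estimates:
        by_date[day].append(r_eff)
    return {day: median(values) for day, values in sorted(by_date.items())}


# Made-up estimates from three hypothetical models.
estimates = [
    ("2025-01-01", "model_a", 0.90),
    ("2025-01-01", "model_b", 1.10),
    ("2025-01-01", "model_c", 1.00),
    ("2025-01-02", "model_a", 0.95),
    ("2025-01-02", "model_b", 1.05),
]
ensemble = median_ensemble(estimates)
```

When a date has an even number of model estimates, `statistics.median` returns the mean of the two middle values.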
https://www.immport.org/agreement
To characterize the genomic variation within a circulating variant and identify potential mutations associated with breakthrough infection among persons with Delta variant SARS-CoV-2 infection.
https://www.immport.org/agreement
To identify risk factors for severe clinical outcomes among persons with SARS-CoV-2 infection and persons with varying vaccination status for COVID-19 during periods of Omicron versus Delta variant circulation
https://spdx.org/licenses/CC0-1.0.html
The project is a collaborative effort of investigators from the University of California, Berkeley's Innovative Genomics Institute (IGI) and School of Public Health (SPH); Kaiser Permanente Northern California (KPNC); and the California Department of Public Health (CDPH), with administrative and programmatic support provided by Heluna Health. Over the project period, the collaborating investigators will analyze approximately 35,000 genomes of SARS-CoV-2 specimens obtained from KPNC members and sequenced by the CDPH through its COVIDNet activities. By combining results from the genomic analysis of low-frequency alleles with clinical and epidemiologic data available in patient records, including demographic variables, COVID-19 vaccination status (dates of vaccination; number of doses; manufacturer), COVID-19 disease severity, and underlying medical conditions, we assess which shared genomic variations are associated with a greater risk of symptomatic infection and severe clinical outcomes; COVID-19 vaccine effectiveness; and transmission of SARS-CoV-2 in the household. The project and its results can serve as a model for community-based monitoring of the evolution and spread of SARS-CoV-2 and for use of the data to inform decisions about the formulation and use of COVID-19 vaccines, including booster doses and next-generation vaccines.

Methods

Sample collection

Our samples are from Kaiser Northern California patients testing positive for SARS-CoV-2 from June 1, 2021, through the present. The RNA is sent to the California Department of Public Health (CDPH) lab to be sequenced by COVIDNet, a consortium of primarily UC system labs helping CDPH with the overflow and backlog of samples. Once the genomes have been sequenced, the lineage information and unique deidentified PAUI number are returned to Kaiser, where this information is recorded. Metadata for this list of PAUIs is sent weekly to UC Berkeley. The KPNC sequencing data are returned to us through a third party that processes all CDPH genomes; they are stored on a server at UC Berkeley and matched with metadata using PAUIs.

Sequence analysis

The raw sequencing data are processed through a SARS-CoV-2 analysis pipeline that has been modified for this work as follows. Adapter removal and trimming are performed using bbduk. The reads are then aligned to the Wuhan reference genome using minimap2, followed by primer trimming using iVar. We next create a pileup file using samtools and use it as input to create a consensus file. This consensus file is created with iVar using a minimum depth of 10 reads and majority rule for base calling. We next use iVar to call variants from the pileup file, setting the threshold for calling a mutation to 0.01. This calls mutations at any locus where at least one percent of the reads are non-reference. This very low threshold allows us to capture all variation seen in the sequencing data. The list of variants is then annotated with the gene and amino acid change (if there is one), whether the mutation is considered defining in any SARS-CoV-2 variant, and whether that mutation is seen in only one variant. This dataset includes the FASTA consensus sequences and mutation calls for each genome.
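The two thresholds in the sequence analysis description (minimum depth of 10 with majority rule for the consensus, and a 0.01 frequency threshold for mutation calls) can be illustrated with a toy sketch operating on per-position base counts. This is not the iVar implementation: the function names, the use of "N" for low-depth positions, the tie handling, and the example counts are all assumptions made for illustration.

```python
MIN_DEPTH = 10   # minimum read depth for a consensus call (from the text)
MIN_FREQ = 0.01  # minimum allele frequency for a mutation call (from the text)


def consensus_base(counts, min_depth=MIN_DEPTH):
    """Return the most frequent base, or "N" if depth is below min_depth.

    `counts` maps base -> read count at one position. Masking with "N"
    is an assumption of this sketch; ties go to the first base seen.
    """
    depth = sum(counts.values())
    if depth < min_depth:
        return "N"
    return max(counts, key=counts.get)


def call_variants(pos, ref_base, counts, min_freq=MIN_FREQ):
    """Return (pos, ref, alt, freq) for each non-reference base whose
    share of reads at this position is at least min_freq."""
    depth = sum(counts.values())
    if depth == 0:
        return []
    calls = []
    for base, n in sorted(counts.items()):
        freq = n / depth
        if base != ref_base and freq >= min_freq:
            calls.append((pos, ref_base, base, freq))
    return calls


# Made-up counts at a hypothetical position 100 with reference base "A".
counts = {"A": 196, "G": 3, "T": 1}
cons = consensus_base(counts)            # majority base at depth 200
calls = call_variants(100, "A", counts)  # G at 1.5% passes; T at 0.5% does not
low = consensus_base({"A": 5, "C": 4})   # depth 9 is below the minimum
```

The low threshold is what lets minority alleles such as the 1.5% "G" above appear in the mutation calls even though the consensus base at that position remains the reference.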