50 datasets found

[Archived] COVID-19 Deaths by Population Characteristics Over Time
healthdata.gov
data.sfgov.org
+1more
application/rdfxml +5
Updated Apr 8, 2025
Cite
data.sfgov.org (2025). [Archived] COVID-19 Deaths by Population Characteristics Over Time [Dataset]. https://healthdata.gov/dataset/-Archived-COVID-19-Deaths-by-Population-Characteri/hs5f-amst
Explore at:
Available download formats: csv, json, xml, application/rssxml, tsv, application/rdfxml
Dataset updated
Apr 8, 2025
Dataset provided by
data.sfgov.org
Description
As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.

A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists (https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf). Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on each population characteristic type are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date.

New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.
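
As a minimal sketch of how the two measures relate, the Python snippet below derives cumulative deaths as a running total of new deaths within each characteristic group. The row layout (date, characteristic type, characteristic group, new deaths) and the sample values are assumptions made for illustration; they are not the dataset's confirmed schema or real counts.

```python
from collections import defaultdict

# Hypothetical rows: (date, characteristic_type, characteristic_group, new_deaths).
# The values are invented for illustration only.
rows = [
    ("2023-01-01", "Age Group", "65+", 2),
    ("2023-01-02", "Age Group", "65+", 1),
    ("2023-01-01", "Age Group", "50-64", 0),
    ("2023-01-02", "Age Group", "50-64", 3),
]


def add_cumulative(rows):
    """Return rows extended with a running total per characteristic group."""
    totals = defaultdict(int)
    out = []
    # Sort by date so each running total accumulates in chronological order.
    for date, ctype, group, new in sorted(rows, key=lambda r: r[0]):
        totals[(ctype, group)] += new
        out.append((date, ctype, group, new, totals[(ctype, group)]))
    return out


result = add_cumulative(rows)
for row in result:
    print(row)
```

Because earlier dates can be revised, a running total like this would need to be recomputed from the full history after each refresh rather than appended to.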

This data may not be immediately available for more recent deaths. Data updates as more information becomes available.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
9/11/2023 - on this date, we began using an updated definition of a COVID-19 death to align with the California Department o
Current Population Survey (CPS)
search.dataone.org
dataverse.harvard.edu
Updated Nov 21, 2023
Cite
Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/AK4FDD
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Damico, Anthony
Description
analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
* download the fixed-width file containing household, family, and person records
* import by separating this file into three tables, then merge 'em together at the person-level
* download the fixed-width file containing the person-level replicate weights
* merge the rectangular person-level file with the replicate weights, then store it in a sql database
* create a new variable - one - in the data table

2012 asec - analysis examples.R
* connect to the sql database created by the 'download all microdata' program
* create the complex sample survey object, using the replicate weights
* perform a boatload of analysis examples

replicate census estimates - 2011.R
* connect to the sql database created by the 'download all microdata' program
* create the complex sample survey object, using the replicate weights
* match the sas output shown in the png file below

2011 asec replicate weight sas output.png
statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
* the census bureau's current population survey page
* the bureau of labor statistics' current population survey page
* the current population survey's wikipedia article

notes:

interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
COVID-19 Case Demographics Daily Trend
hub.mph.in.gov
Cite
COVID-19 Case Demographics Daily Trend [Dataset]. https://hub.mph.in.gov/dataset/covid-19-case-demographics-daily-trend
Explore at:
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Note: 11/1/2023: Publication of the COVID data will be delayed because of technical difficulties.
Note: 9/20/2023: With the end of the federal emergency and reporting requirements continuing to evolve, the Indiana Department of Health will no longer publish and refresh the COVID-19 datasets after November 15, 2023 - one final dataset publication will continue to be available.
Note: 5/10/2023: Due to a technical issue updates are delayed for COVID data. New files will be published as soon as they are available.
Note: 3/22/2023: Due to a technical issue updates are delayed for COVID data. New files will be published as soon as they are available.
Note: 3/15/2023 test data will be removed from the COVID dashboards and HUB files in recognition of the fact that widespread use of at-home tests and a decrease in lab testing no longer provides an accurate representation of COVID-19 spread.

Number of Indiana COVID-19 cases and deaths by age group, gender, race and ethnicity by day. All data displayed is preliminary and subject to change as more information is reported to IDOH. Expect historical data to change as data is reported to IDOH.

Historical Changes:
1/11/2023: Due to a technical issue updates are delayed for COVID data. New files will be published as soon as they are available.
1/5/2023: Due to a technical issue the COVID datasets were not updated on 1/4/23. Updates will be published as soon as they are available.
9/29/22: Due to a technical difficulty, the weekly COVID datasets were not generated yesterday. They will be updated with current data today - 9/29 - and may result in a temporary discrepancy with the numbers published on the dashboard until the normal weekly refresh resumes 10/5.
9/27/2022: As of 9/28, the Indiana Department of Health (IDOH) is moving to a weekly COVID update for the dashboard and all associated datasets to continue to provide trend data that is applicable and usable for our partners and the public. This is to maintain alignment across the nation as states move to weekly updates.
2/10/2022: Data was not published on 2/9/2022 due to a technical issue, but updated data was released 2/10/2022.
12/30/21: This dataset has been updated, and should continue to receive daily updates.
12/15/21: The file has been adjusted with data through 12/13, and regular updates will resume to it today.
11/12/2021: Historical re-infections have been added to the case counts for all pertinent COVID datasets back to 9/1/2021 and new re-infections will be added to the total case counts as they are reported in accordance with CDC guidance.
06/23/2021: COVID Hub files will no longer be updated on Saturdays. The normal refresh of these files has been changed to Mon-Fri.
06/10/2021: COVID Hub files will no longer be updated on Sundays. The normal refresh of these files has been changed to Mon-Sat.
6/03/2021: A batch of historical negative and positive test results added 16,492 historical tests administered, 7,082 tested individuals, and 765 historical cases to today's counts. These cases are not included in the new positive counts but have been added to the total positive cases. Today’s total case counts include historical cases received from other states.
2/4/2021: Today’s dataset now includes 1,507 historical deaths identified through an audit of 2020 and 2021 COVID death records and test results.
‘COVID-19 Cases by Population Characteristics Over Time’ analyzed by...
analyst-2.ai
Updated Feb 15, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘COVID-19 Cases by Population Characteristics Over Time’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-covid-19-cases-by-population-characteristics-over-time-097d/6c8f14dd/?iid=004-510&v=presentation
Explore at:
Dataset updated
Feb 15, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘COVID-19 Cases by Population Characteristics Over Time’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/a3291d85-0076-43c5-a59c-df49480cdc6d on 13 February 2022.

--- Dataset description provided by original source is as follows ---

Note: On January 22, 2022, system updates to improve the timeliness and accuracy of San Francisco COVID-19 cases and deaths data were implemented. You might see some fluctuations in historic data as a result of this change. Due to the changes, starting on January 22, 2022, the number of new cases reported daily will be higher than under the old system as cases that would have taken longer to process will be reported earlier.

A. SUMMARY This dataset shows San Francisco COVID-19 cases by population characteristics and by specimen collection date. Cases are included on the date the positive test was collected.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how cases have been distributed among different subgroups. This information can reveal trends and disparities among groups.

Data is lagged by five days, meaning the most recent specimen collection date included is 5 days prior to today. Tests take time to process and report, so more recent data is less reliable.

B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases and deaths are from: * Case interviews * Laboratories * Medical providers

These multiple streams of data are merged, deduplicated, and undergo data verification processes. This data may not be immediately available for recently reported cases because of the time needed to process tests and validate cases. Daily case totals on previous days may increase or decrease. Learn more.

Data are continually updated to maximize completeness of information and reporting on San Francisco residents with COVID-19.

Data notes on each population characteristic type are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

Sexual orientation * Sexual orientation data is collected from individuals who are 18 years old or older. These individuals can choose whether to provide this information during case interviews. Learn more about our data collection guidelines. * The City began asking for this information on April 28, 2020.

Gender * The City collects information on gender identity using these guidelines.

Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

Transmission type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.

Homelessness Persons are identified as homeless based on several data sources:
* self-reported living situation
* the location at the time of testing
* Department of Public Health homelessness and health databases
* Residents in Single-Room Occupancy hotels are not included in these figures.
These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing

--- Original source retains full ownership of the source dataset ---
ARCHIVED: COVID-19 Cases by Population Characteristics Over Time
healthdata.gov
data.sfgov.org
+2more
application/rdfxml +5
Updated Apr 8, 2025
Cite
data.sfgov.org (2025). ARCHIVED: COVID-19 Cases by Population Characteristics Over Time [Dataset]. https://healthdata.gov/dataset/ARCHIVED-COVID-19-Cases-by-Population-Characterist/a68b-pyq7
Explore at:
Available download formats: application/rdfxml, csv, tsv, json, application/rssxml, xml
Dataset updated
Apr 8, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.

B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases are from: * Case interviews * Laboratories * Medical providers

These multiple streams of data are merged, deduplicated, and undergo data verification processes.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

Gender * The City collects information on gender identity using these guidelines.

Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives.  * This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’.

Sexual orientation * The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable. * The City doesn’t collect or report information about sexual orientation for persons under 12 years of age. * Case investigation interviews transitioned to the California Department of Public Health, Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old. Learn more about our data collection guidelines pertaining to sexual orientation: https://www.sfdph.org/dph/files/PoliciesProcedures/COM9_SexualOrientationGuidelines.pdf

Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

Homelessness Persons are identified as homeless based on several data sources: * self-reported living situation * the location at the time of testing * Department of Public Health homelessness and health databases * Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

Single Room Occupancy (SRO) tenancy * SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces. * The details of a person's living arrangements are verified during case interviews.

Transmission Type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.
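
The transmission-type rule described above can be sketched as a small Python function. This is only an illustration of the stated logic: the function name, parameter names, and the exact label strings are assumptions, not values confirmed by the dataset.

```python
def transmission_category(interviewed, asked, contact_with_known_case):
    """Classify a confirmed case following the rule described in the notes.

    interviewed: whether a case interview took place
    asked: whether the close-contact question was asked
    contact_with_known_case: the answer given (ignored if not asked)
    """
    # Cases that were not interviewed, or not asked, are counted as unknown.
    if not interviewed or not asked:
        return "unknown"
    # A "yes" answer is recorded as contact with a known case.
    if contact_with_known_case:
        return "contact with a known case"
    # A "no" answer is recorded as community transmission.
    return "community transmission"


print(transmission_category(True, True, False))
```

Note that "community transmission" here is assigned by elimination (no reported contact), so it depends on what the interviewee knew and recalled.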

C. UPDATE PROCESS This dataset has been archived and will no longer update as of 9/11/2023.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco po
COVID-19 case rate per 100,000 population and percent test positivity in the...
data.ct.gov
catalog.data.gov
application/rdfxml +5
Updated Oct 22, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Department of Public Health (2020). COVID-19 case rate per 100,000 population and percent test positivity in the last 14 days by town - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/COVID-19-case-rate-per-100-000-population-and-perc/hree-nys2
Explore at:
Available download formats: application/rssxml, xml, csv, json, tsv, application/rdfxml
Dataset updated
Oct 22, 2020
Dataset authored and provided by
Department of Public Health
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Description
Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.

The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 .

The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 .

The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.

This dataset includes a count and rate per 100,000 population for COVID-19 cases, a count of COVID-19 molecular diagnostic tests, and a percent positivity rate for tests among people living in community settings for the previous two-week period. Dates are based on date of specimen collection (cases and positivity).

A person is considered a new case only upon their first COVID-19 testing result because a case is defined as an instance or bout of illness. If they are tested again subsequently and are still positive, it still counts toward the test positivity metric but they are not considered another case.

Percent positivity is calculated as the number of positive tests among community residents conducted during the 14 days divided by the total number of positive and negative tests among community residents during the same period. If someone was tested more than once during that 14 day period, then those multiple test results (regardless of whether they were positive or negative) are included in the calculation.
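
The percent positivity calculation described above can be sketched in Python as follows. The function name and the sample counts are assumptions for illustration; the counts are not real data.

```python
def percent_positivity(positive_tests, negative_tests):
    """Positive tests divided by all tests (positive + negative), as a percent.

    Counts are test results, not people: repeat tests by the same person
    within the 14-day window are each included.
    """
    total = positive_tests + negative_tests
    if total == 0:
        # No tests in the window, so the metric is undefined.
        return None
    return 100.0 * positive_tests / total


# Hypothetical 14-day counts among community residents.
print(percent_positivity(30, 570))
```

Returning `None` for an empty window is a design choice of this sketch; the source does not say how that case is handled.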

These case and test counts do not include cases or tests among people residing in congregate settings, such as nursing homes, assisted living facilities, or correctional facilities.

These data are updated weekly and reflect the previous two full Sunday-Saturday (MMWR) weeks (https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf).

DPH note about change from 7-day to 14-day metrics: Prior to 10/15/2020, these metrics were calculated using a 7-day average rather than a 14-day average. The 7-day metrics are no longer being updated as of 10/15/2020 but the archived dataset can be accessed here: https://data.ct.gov/Health-and-Human-Services/COVID-19-case-rate-per-100-000-population-and-perc/s22x-83rd

As you know, we are learning more about COVID-19 all the time, including the best ways to measure COVID-19 activity in our communities. CT DPH has decided to shift to 14-day rates because these are more stable, particularly at the town level, as compared to 7-day rates. In addition, since the school indicators were initially published by DPH last summer, CDC has recommended 14-day rates and other states (e.g., Massachusetts) have started to implement 14-day metrics for monitoring COVID transmission as well.

With respect to geography, we also have learned that many people are looking at the town-level data to inform decision making, despite emphasis on the county-level metrics in the published addenda. This is understandable as there has been variation within counties in COVID-19 activity (for example, rates that are higher in one town than in most other towns in the county).

Additional notes: As of 11/5/2020, CT DPH has added antigen testing for SARS-CoV-2 to reported test counts in this dataset. The tests included in this dataset include both molecular and antigen tests. Molecular tests reported include polymerase chain reaction (PCR) and nucleic acid amplification (NAAT) tests.

The population data used to calculate rates is based on the CT DPH population statistics for 2019, which is available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used.

Data suppression is applied when the rate is <5 cases per 100,000 or if there are <5 cases within the town. Information on why data suppression rules are applied can be found online here: https://www.cdc.gov/cancer/uscs/technical_notes/stat_methods/suppression.htm
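
A minimal Python sketch of a rate per 100,000 population combined with the suppression rule stated above. The function names, the default threshold parameter, and the sample numbers are assumptions; DPH's published rate may be computed differently (for example, averaged per day over the 14-day window), so treat this as an illustration of the rule rather than the department's method.

```python
def case_rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population


def suppressed_rate(cases, population, threshold=5):
    """Return the rate, or None when the suppression rule applies.

    Suppress if there are fewer than `threshold` cases in the town or if
    the resulting rate is below `threshold` per 100,000.
    """
    if cases < threshold:
        return None
    rate = case_rate_per_100k(cases, population)
    if rate < threshold:
        return None
    return rate


# Hypothetical towns: (cases, population).
print(suppressed_rate(50, 25_000))   # reported
print(suppressed_rate(4, 25_000))    # suppressed: fewer than 5 cases
print(suppressed_rate(5, 200_000))   # suppressed: rate below 5 per 100,000
```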
‘Population by Country - 2020’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 13, 2020
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘Population by Country - 2020’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-population-by-country-2020-c8b7/latest
Explore at:
Dataset updated
Feb 13, 2020
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Population by Country - 2020’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/tanuprabhu/population-by-country-2020 on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

I always wanted to access a data set that was related to the world’s population (Country wise). But I could not find a properly documented data set. Rather, I just created one manually.

Content

Now I knew I wanted to create a dataset but I did not know how to do so. So, I started to search for the content (Population of countries) on the internet. Obviously, Wikipedia was my first search. But I don't know why the results were not acceptable. And also there were only I think 190 or more countries. So then I surfed the internet for quite some time until then I stumbled upon a great website. I think you probably have heard about this. The name of the website is Worldometer. This is exactly the website I was looking for. This website had more details than Wikipedia. Also, this website had more rows I mean more countries with their population.

Once I found the data, my next hard task was to download it. Of course, I could not get the raw form of the data, and I did not mail them about it. So I learned a new skill which is very important for a data scientist. I read somewhere that to obtain data from websites you need to use this technique. Any guesses? Keep reading and you will find out in the next paragraph.

[Image: https://fiverr-res.cloudinary.com/images/t_main1,q_auto,f_auto/gigs/119580480/original/68088c5f588ec32a6b3a3a67ec0d1b5a8a70648d/do-web-scraping-and-data-mining-with-python.png]

You are right, it's web scraping. I learned this so that I could convert the data into CSV format. Now I will give you the scraper code that I wrote. I also found a way to directly convert the pandas data frame to a CSV (comma-separated values) file and store it on my computer. Just go through my code and you will know what I'm talking about.

Below is the code that I used to scrape the data from the website

[Image of the scraper code: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3200273%2Fe814c2739b99d221de328c72a0b2571e%2FCapture.PNG?generation=1581314967227445&alt=media]

Acknowledgements

Now I couldn't have got the data without Worldometer. So special thanks to the website. It is because of them I was able to get the data.

Inspiration

As far as I know, I don't have any questions to ask. You guys can let me know by finding your ways to use the data and let me know via kernel if you find something interesting

--- Original source retains full ownership of the source dataset ---
[Dataset] Data for the course "Population Genomics" at Aarhus University
zenodo.org
application/gzip, bin
Updated Jan 8, 2025
Cite
Samuele Soraggi; Kasper Munch (2025). [Dataset] Data for the course "Population Genomics" at Aarhus University [Dataset]. http://doi.org/10.5281/zenodo.7670839
Explore at:
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.7670839
Dataset updated
Jan 8, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Samuele Soraggi; Kasper Munch
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets, conda environments and software for the course "Population Genomics" of Prof Kasper Munch. This course material is maintained by the health data science sandbox. This webpage shows the latest version of the course material.

Data.tar.gz Contains the datasets and executable files for some of the software
You can unpack by simply doing
tar -zxf Data.tar.gz -C ./
This will create a folder called Data with the uncompressed material inside

Course_Env.packed.tar.gz Contains the conda environment used for the course. This needs to be unpacked to adjust all the prefixes (Note this environment is created on Ubuntu 22.10). You do this in the command line by

Create the folder Course_Env: mkdir Course_Env

Untar the file: tar -zxf Course_Env.packed.tar.gz -C Course_Env

Activate the environment: conda activate ./Course_Env

Run the unpacking script (it can take quite some time to get it done): conda-unpack

Course_Env.unpacked.tar.gz is the same environment as above, but it works only if untarred into the folder /usr/Material, so use the packed version above if you are working in another folder. This file is mostly intended for running the course in our own cloud environment.

environment_with_args.yml is the file needed to generate the conda environment. Create and activate the environment with the following commands:

conda env create -f environment_with_args.yml -p ./Course_Env

conda activate ./Course_Env

The data is connected to the following repository: https://github.com/hds-sandbox/Popgen_course_aarhus. The original course material from Prof Kasper Munch is at https://github.com/kaspermunch/PopulationGenomicsCourse.

Description

After the course, participants will have detailed knowledge of the methods and applications required to perform a typical population genomic study.

By the end of the course, participants must be able to:

Identify an experimental platform relevant to a population genomic analysis.

Apply commonly used population genomic methods.

Explain the theory behind common population genomic methods.

Reflect on strengths and limitations of population genomic methods.

Interpret and analyze results of population genomic inference.

Formulate population genetics hypotheses based on data.

The course introduces key concepts in population genomics, from generation of population genetic data sets to the most common population genetic analyses and association studies. The first part of the course focuses on generation of population genetic data sets. The second part introduces the most common population genetic analyses and their theoretical background. Here topics include analysis of demography, population structure, recombination and selection. The last part of the course focuses on applications of population genetic data sets for association studies in relation to human health.

Curriculum

The curriculum for each week is listed below. "Coop" refers to a set of lecture notes by Graham Coop that we will use throughout the course.

Course plan

Course intro and overview:

Coop chapters 1, 2, 3, Paper: Genome Diversity Project

Drift and the coalescent:

Coop chapter 4; Paper: Platypus

Exercise: Read mapping and base calling

Recombination:

Lecture: Review: Recombination in eukaryotes, Review: Recombination rate estimation

Exercise: Phasing and recombination rate

Population structure and incomplete lineage sorting:

Lecture: Coop chapter 6, Review: Incomplete lineage sorting

Exercise: Working with VCF files

Hidden Markov models:

Lecture: Durbin chapter 3, Paper: population structure

Exercise: Inference of population structure and admixture

Ancestral recombination graphs:

Lecture: Paper: Approximating the ARG, Paper: Tree inference

Exercise: ARG dashboard exercises + Inference of trees along sequence

Past population demography:

Lecture: Coop chapter 4, Paper: PSMC, revisit Paper: Tree inference

Exercise: Inferring historical populations

Direct and linked selection:

Lecture: Coop chapters 12, 13, revisit Paper: Tree inference

Admixture:

Lecture: Review: Admixture, Paper: Admixture inference

Exercise: Detecting archaic ancestry in modern humans

Genome-wide association study (GWAS):

Lecture: Coop lecture notes 99-120

Exercise: GWAS quality control

Heritability:

Lecture: Coop Lecture notes Sec. 2.2 (p23-36) + Chap. 7 (p119-142)

Exercise: Association testing

Evolution and disease:

Lecture: Coop Lecture notes Sec. 11.0.1 (p217-221)

Exercise: Estimating heritability
DSS Benefit and Payment Recipient Demographics - quarterly data
data.gov.au
researchdata.edu.au
.xlsx, csv +3
Updated May 30, 2025
Cite
Department of Social Services (2025). DSS Benefit and Payment Recipient Demographics - quarterly data [Dataset]. https://data.gov.au/data/dataset/dss-payment-demographic-data
Explore at:
Available download formats: csv, xlsx (Excel, multiple files)
Dataset updated
May 30, 2025
Dataset authored and provided by
Department of Social Services
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
The DSS Payment Demographic data set is made up of:

Selected DSS payment data by

Geography: state/territory, electorate, postcode, LGA and SA2 (for 2015 onwards)

Demographic: age, sex and Indigenous/non-Indigenous

Duration on Payment (Working Age & Pensions)

Duration on Income Support (Working Age, Carer payment & Disability Support Pension)

Rate (Working Age & Pensions)

Earnings (Working Age & Pensions)

Age Pension assets data

JobSeeker Payment and Youth Allowance (other) Principal Carers

Activity Tested Recipients by Partial Capacity to Work (NSA, PPS & YAO)

Exits within 3, 6 and 12 months (Newstart Allowance/JobSeeker Payment, Parenting Payment, Sickness Allowance & Youth Allowance)

Disability Support Pension by medical condition

Care Receiver by medical conditions

Commonwealth Rent Assistance by Payment type and Income Unit type have been added from March 2017. For further information about Commonwealth Rent Assistance and Income Units see the Data Descriptions and Glossary included in the dataset.

From December 2022, the "DSS Expanded Benefit and Payment Recipient Demographics – quarterly data" publication has introduced expanded reporting populations for income support recipients. As a result, the reporting population for JobSeeker Payment and Special Benefit has changed to include recipients who are current but on zero rate of payment and those who are suspended from payment. The reporting population for ABSTUDY, Austudy, Parenting Payment and Youth Allowance has changed to include those who are suspended from payment. The expanded report will replace the standard report after June 2023.

Additional data for DSS Expanded Benefit and Payment Recipient Demographics – quarterly data includes:

• A new contents page to help users locate information within the spreadsheet

• Additional data for the ‘Suspended’ population in the ‘Payment by Rate’ tab to enable users to calculate figures under the old reporting rules.

• Additional information on the Employment Earning by ‘Income Free Area’ tab.

From December 2022, Services Australia have implemented a change in the Centrelink payment system to recognise gender other than the sex assigned at birth or during infancy, or as a gender which is not exclusively male or female. To protect the privacy of individuals and comply with confidentialisation policy, persons identifying as ‘non-binary’ will initially be grouped with ‘females’ in the period immediately following implementation of this change. The Department will monitor the implications of this change and will publish the ‘non-binary’ gender category as soon as privacy and confidentialisation considerations allow.

Local Government Area has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2022 boundaries from June 2023.

Commonwealth Electorate Division has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2021 boundaries from June 2023.

SA2 has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2021 boundaries from June 2023.

From December 2021, the following are included in the report:

selected payments by work capacity, by various demographic breakdowns

rental type and homeownership

Family Tax Benefit recipients and children by payment type

Commonwealth Rent Assistance by proportion eligible for the maximum rate

an age breakdown for Age Pension recipients

For further information, please see the Glossary.

From June 2021, data on the Paid Parental Leave Scheme is included yearly in June releases. This includes both Parental Leave Pay and Dad and Partner Pay, across multiple breakdowns. Please see Glossary for further information.

From March 2017 the DSS demographic dataset will include top 25 countries of birth. For further information see the glossary.

From March 2016, machine-readable files containing the three geographic breakdowns have also been published for use in National Map. Links to these datasets are below:

Statistical Area 2 - SA2

Commonwealth Electoral Division - CED

Local Government Area - LGA

Pre June 2014 Quarter Data contains:

Selected DSS payment data by

Geography: state/territory; electorate; postcode and LGA

Demographic: age, sex and Indigenous/non-Indigenous

Note: JobSeeker Payment replaced Newstart Allowance and other working age payments from 20 March 2020. For further details see: https://www.dss.gov.au/benefits-payments/jobseeker-payment

For data on DSS payment demographics as at June 2013 or earlier, the department has published data which was produced annually. Data is provided by payment type and contains time series, state, gender, age range, and various other demographics. Links to these publications are below:

Statistical Paper series

Concession card data in the March and June 2020 quarters have been re-stated to address an over-count in reported cardholder numbers.

28/06/2024 – The March 2024 and December 2023 reports were republished with updated data in the ‘Carer Receivers by Med Condition’ section. Updates are exclusive to the ‘Care Receivers of Carer Payment recipients’ table, under ‘Intellectual / Learning’ and ‘Circulatory System’ conditions only.
Annual Population Survey Household, 2004-2021: Secure Access
beta.ukdataservice.ac.uk
Updated 2024
Cite
Social Survey Division Office For National Statistics (2024). Annual Population Survey Household, 2004-2021: Secure Access [Dataset]. http://doi.org/10.5255/ukda-sn-6725-9
Explore at:
Unique identifier
https://doi.org/10.5255/ukda-sn-6725-9
Dataset updated
2024
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
datacite
Authors
Social Survey Division Office For National Statistics
Description
Background
The Annual Population Survey (APS) Household datasets are produced annually and are available from 2004 (Secure Access) and 2006 (End User Licence). They allow production of family and household labour market statistics at local areas and for small sub-groups of the population across the UK. The data comprise key variables from the Labour Force Survey (LFS) (held at the UK Data Archive under GN 33246) and the APS (person) datasets (held at the Data Archive under GN 33357). The former is a quarterly survey of households living at private addresses in the UK. The latter is created by combining individuals in waves one and five from four consecutive LFS quarters with the English, Welsh and Scottish Local Labour Force Surveys (LLFS). The APS Household datasets therefore contain results from four different sources.

The APS Household datasets include all the variables on the LFS and APS person datasets except for the income variables. They also include key family and household level derived variables. These variables allow for an analysis of the combined economic activity status of the family or household. In addition they also include more detailed geographical, industry, occupation, health and age variables.

For information on the main (person) APS datasets, for which EUL and Secure Access versions are available, please see GNs 33357 and 33427, respectively.

New reweighting policy
Following the new reweighting policy, ONS has reviewed the latest population estimates made available during 2019 and has decided not to carry out a 2019 LFS and APS reweighting exercise. Therefore, the next reweighting exercise will take place in 2020. It will incorporate the 2019 Sub-National Population Projection data (published in May 2020) and 2019 Mid-Year Estimates (published in June 2020). It is expected that reweighted Labour Market aggregates and microdata will be published in 2021.

Secure Access APS Household data
Secure Access datasets for the APS Household survey include additional variables not included in the EUL versions (GN 33455). Extra variables that may be found in the Secure Access version but not in the EUL version relate to:
geography (see 'Spatial Units' below)
individual demographics, including age bands, day of birth, sex/marital status and detailed ethnicity
main reason for coming to the UK
number of bedrooms
health problems, work-related health problems, sickness absence from work
reasons why not in work, including health and other reasons, wage received when not in work, time away from job, and whether and when will work in the future
type of benefit claimed
education and training, including
vocational and work-related qualifications and training
class of first degree
qualifications from government schemes
number of O levels/GCSEs, etc held
qualifications held from UK and abroad
qualifications gained from school/home schooling
qualifications below highest level
other qualifications
time spent in taught courses
who paid for training
main place of education/training
length of training course
level of Welsh baccalaureate
worst 30 local authorities based on Indices of Deprivation
casual/holiday work
disability, including learning difficulty/disability
payment of own National Insurance and tax
Prospective users of the Secure Access version of an APS Household dataset will need to fulfil additional requirements, including completion of face-to-face training and agreement to Secure Access' User Agreement, in order to obtain permission to use that version (see 'Access' section below). The EUL version of the data, for which less stringent access conditions apply, may suffice for many users' research requirements. Further details and links to all APS studies can be found via the APS Key Data series webpage.

Documentation and coding frames
The APS is compiled from variables present in the LFS. For variable and value labelling and coding frames that are not included either in the data or in the current APS documentation (e.g. coding frames for education, industrial and geographic variables, which are held in LFS User Guide Vol.5, Classifications), users are advised to consult the latest versions of the LFS User Guides, which are available from the ONS Labour Force Survey - User Guidance webpages.

Weighting 2022
The LFS team have been working on reweighting the datasets to account for newly delivered Real Time Information (RTI) tax information, adjusting Northern Ireland non-responses, and fixing the grossing factors where ONS had combined England and Wales (rather than doing them separately). The first two issues have been resolved, but the grossing factors for England and Wales were not fully revised. This means that some error remains in the calculation of some of the population weights in the APS, and the age breakdown of the population in both England and Wales therefore remains affected to a small extent. The affected APS Household annual dataset is January - December 2020, and this will be revised again in the future.

Latest edition information
For the ninth edition (October 2023), the data file covering January - December 2021 has been revised.
L2 Voter and Demographic Dataset
redivis.com
application/jsonl +7
Updated Apr 9, 2025
Cite
Stanford University Libraries (2025). L2 Voter and Demographic Dataset [Dataset]. http://doi.org/10.57761/5bw8-1v66
Explore at:
Available download formats: sas, arrow, csv, parquet, application/jsonl, spss, avro, stata
Unique identifier
https://doi.org/10.57761/5bw8-1v66
Dataset updated
Apr 9, 2025
Dataset provided by
Redivis Inc.
Authors
Stanford University Libraries
Description
Abstract

The L2 Voter and Demographic Dataset includes demographic and voter history tables for all 50 states and the District of Columbia. The dataset is built from publicly available government records about voter registration and election participation. These records indicate whether a person voted in an election or not, but they do not record whom that person voted for. Voter registration and election participation data are augmented by demographic information from outside data sources.

The L2 Voter and Demographic Dataset is current as of April 7, 2025.

Methodology

To create this file, L2 processes registered voter data on an ongoing basis for all 50 states and the District of Columbia, with refreshes of the underlying state voter data typically at least every six months and refreshes of telephone numbers and National Change of Address processing approximately every 30 to 60 days. These data are standardized and enhanced with proprietary commercial data and modeling codes and consist of approximately 185,000,000 records nationwide.

Usage

For each state, there are two available tables: demographic and voter history. The demographic and voter tables can be joined on the LALVOTERID variable. One can also use the LALVOTERID variable to link the L2 Voter and Demographic Dataset with the L2 Consumer Dataset.

In addition, the LALVOTERID variable can be used to validate the state. For example, let's look at the LALVOTERID = LALCA3169443. The characters in the fourth and fifth positions of this identifier are 'CA' (California). The second way to validate the state is by using the RESIDENCE_ADDRESSES_STATE variable, which should have a value of 'CA' (California).
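The state check described above can be sketched in Python. The function names `state_from_voter_id` and `state_matches` are hypothetical, and the sketch assumes every identifier follows the layout of the example (a three-character prefix followed by a two-letter state code); the dataset documentation should be consulted before relying on this.

```python
def state_from_voter_id(voter_id: str) -> str:
    """Return the two characters in the fourth and fifth positions of a LALVOTERID."""
    if len(voter_id) < 5:
        raise ValueError(f"identifier too short: {voter_id!r}")
    # Python slices are zero-based, so positions 4 and 5 are indices 3 and 4.
    return voter_id[3:5]


def state_matches(voter_id: str, residence_state: str) -> bool:
    """Compare the ID-derived state with a RESIDENCE_ADDRESSES_STATE value."""
    return state_from_voter_id(voter_id) == residence_state.strip().upper()


# Usage with the identifier quoted in the description.
print(state_from_voter_id("LALCA3169443"))
print(state_matches("LALCA3169443", "CA"))
```

Checking both sources against each other is a cheap way to spot rows that landed in the wrong state table.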

The date appended to each table name represents when the data was last updated. These dates will differ state by state because states update their voter files at different cadences.

The demographic files use 698 consistent variables. For more information about these variables, see 2025-01-10-VM2-File-Layout.xlsx.

The voter history files have different variables depending on the state. The 2025-04-07-L2-Voter-Dictionaries.tar.gz file contains .csv data dictionaries for each state's demographic and voter files. While the demographic file data dictionaries should mirror the 2025-01-10-VM2-File-Layout.xlsx file, the voter file data dictionaries will be unique to each state.

2025-01-10-National-File-Notes.pdf contains L2 Voter and Demographic Dataset ("National File") release notes from 2018 to 2025.

2025-04-07-L2-Voter-Fill-Rate.tar.gz contains .tab files tracking the percent of non-null values for any given field.

COVID-19 Case Surveillance Public Use Data
data.cdc.gov
healthdata.gov
+6more
application/rdfxml +5
Updated Jul 9, 2024
Cite
CDC Data, Analytics and Visualization Task Force (2024). COVID-19 Case Surveillance Public Use Data [Dataset]. https://data.cdc.gov/w/vbim-akqf/tdwk-ruhb?cur=Il2CHDHWMfO
Explore at:
Available download formats: csv, application/rssxml, application/rdfxml, tsv, xml, json
Dataset updated
Jul 9, 2024
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Authors
CDC Data, Analytics and Visualization Task Force
License
https://www.usa.gov/government-works
Description
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.

Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.

This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.

CDC has three COVID-19 case surveillance datasets:
COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)
COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)
COVID-19 Case Surveillance Restricted Access Detailed Data: Restricted access, patient-level dataset with clinical and symptom data, demographics, and state and county of residence. Access requires a registration process and a data use agreement. (33 data elements)
The following apply to all three datasets:
Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers.
Some data cells are suppressed to protect individual privacy.
The datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the current datasets. This 14-day lag allows case reporting to be stabilized and ensures that time-dependent outcome data are accurately captured.
Datasets are updated monthly.
Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy.
For more information about data collection and reporting, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/about-us-cases-deaths.html.
For more information about the COVID-19 case surveillance data, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html

Overview

The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.

For more information: NNDSS Supports the COVID-19 Response | CDC.

The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.

COVID-19 Case Reports

COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.

All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.

Data are Considered Provisional

The COVID-19 case surveillance data are dynamic; case reports can be modified at any time by the jurisdictions sharing COVID-19 data with CDC. CDC may update prior cases shared with CDC based on any updated information from jurisdictions. For instance, as new information is gathered about previously reported cases, health departments provide updated data to CDC. As more information and data become available, analyses might find changes in surveillance data and trends during a previously reported time window. Data may also be shared late with CDC due to the volume of COVID-19 cases.
Annual finalized data: To create the final NNDSS data used in the annual tables, CDC works carefully with the reporting jurisdictions to reconcile the data received during the year until each state or territorial epidemiologist confirms that the data from their area are correct.
Access Addressing Gaps in Public Health Reporting of Race and Ethnicity for COVID-19, a report from the Council of State and Territorial Epidemiologists, to better understand the challenges in completing race and ethnicity data for COVID-19 and recommendations for improvement.

Data Limitations

To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.

Data Quality Assurance Procedures

CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
Questions that have been left unanswered (blank) on the case report form are reclassified to a Missing value, if applicable to the question. For example, in the question “Was the individual hospitalized?” where the possible answer choices include “Yes,” “No,” or “Unknown,” the blank value is recoded to Missing because the case report form did not include a response to the question.
Logic checks are performed for date data. If an illogical date has been provided, CDC reviews the data with the reporting jurisdiction. For example, if a symptom onset date in the future is reported to CDC, this value is set to null until the reporting jurisdiction updates the date appropriately.
Additional data quality processing to recode free text data is ongoing. Data on symptoms, race and ethnicity, and healthcare worker status have been prioritized.
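The first two cleaning steps above can be illustrated with a small Python sketch. This is not CDC's code: the function names `recode_blank` and `check_onset_date` and the literal label "Missing" are assumptions chosen to mirror the description.

```python
from datetime import date
from typing import Optional

MISSING = "Missing"  # assumed label for unanswered questions


def recode_blank(answer: Optional[str]) -> str:
    """Recode a blank (unanswered) response to the Missing value."""
    if answer is None or not answer.strip():
        return MISSING
    return answer.strip()


def check_onset_date(onset: Optional[date], today: date) -> Optional[date]:
    """Set an illogical (future) symptom onset date to null."""
    if onset is not None and onset > today:
        return None
    return onset


# Usage: a blank hospitalization answer and a future onset date.
print(recode_blank(""))
print(check_onset_date(date(2030, 1, 1), today=date(2024, 7, 1)))
```

Passing `today` explicitly keeps the date check reproducible when the same file is reprocessed later.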

Data Suppression

To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
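As a rough illustration of low-frequency suppression, the Python sketch below recodes demographic fields to "NA" when a combination of sex, age group, and race/ethnicity occurs fewer than 5 times, while keeping every record. The field names, the function `suppress_rare`, and the choice to recode all three fields together are assumptions for illustration; the actual CDC procedure may differ and also covers indirect identifiers not shown here.

```python
from collections import Counter
from typing import Dict, List

KEYS = ("sex", "age_group", "race_ethnicity")  # hypothetical field names


def suppress_rare(records: List[Dict[str, str]], threshold: int = 5) -> List[Dict[str, str]]:
    """Recode rare demographic combinations to 'NA' without dropping records."""
    counts = Counter(tuple(r[k] for k in KEYS) for r in records)
    result = []
    for r in records:
        combo = tuple(r[k] for k in KEYS)
        if counts[combo] < threshold:
            # Copy the record and overwrite only the demographic fields.
            r = {**r, **{k: "NA" for k in KEYS}}
        result.append(r)
    return result


# Usage: five records share one combination, one record is unique.
common = {"sex": "F", "age_group": "30-39", "race_ethnicity": "A"}
rare = {"sex": "M", "age_group": "80+", "race_ethnicity": "B"}
cleaned = suppress_rare([dict(common) for _ in range(5)] + [rare])
print(len(cleaned), cleaned[-1])
```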

For questions, please contact Ask SRRG (eocevent394@cdc.gov).

Additional COVID-19 Data

COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
L2 Political Academic Voter File, 2020-03-01 Delivery
ultraviolet.library.nyu.edu
Updated Apr 25, 2025
Cite
L2 Data Company (2025). L2 Political Academic Voter File, 2020-03-01 Delivery [Dataset]. http://doi.org/10.58153/g7ang-ptb12
Explore at:
Unique identifier
https://doi.org/10.58153/g7ang-ptb12
Dataset updated
Apr 25, 2025
Dataset provided by
L2 Data Company
Time period covered
Feb 19, 2020 - Jul 30, 2020
Description
NYU Libraries has licensed access to the L2 Political Academic Voter File. The file is a continuously updated dataset consisting of public information for every registered voter in the United States and includes basic socio-demographic indicators (some of which are modeled), consumer preferences, political party affiliation, voting history, and more.

The data consists of .tab files organized into individual state folders (all states and DC). Each state folder contains two files: demographics data and voter history data, with a data dictionary for each dataset. The size of the folders varies by state, and the data for all states adds up to approximately 40 GB. The data is organized into releases, generally two per year (spring and fall), which represent a snapshot of the country's voters at the time of the dataset creation.

NYU has also licensed access to L2 Political's historical backlog of data. This backlog includes versions of the L2 Processed voter file going back to 2008 (for most U.S. states) and unprocessed "raw" state voter rolls, also going back to 2008 for most U.S. states.

This collection is available to NYU faculty and students only, and requires users to first submit a data management plan to account for how access and storage of the data will be handled. Information on how to submit a request to use this data and create a data management plan is available at https://guides.nyu.edu/l2political.
ARCHIVED: COVID-19 Cases by Geography Over Time
data.sfgov.org
application/rdfxml +5
Updated Jul 17, 2020
Cite
Department of Public Health - Population Health Division (2020). ARCHIVED: COVID-19 Cases by Geography Over Time [Dataset]. https://data.sfgov.org/w/d2ef-idww/ikek-yizv?cur=6pe39zMjfCR&from=f5tFBDuJcU8
Explore at:
Available download formats: xml, application/rdfxml, json, tsv, csv, application/rssxml
Dataset updated
Jul 17, 2020
Dataset authored and provided by
Department of Public Health - Population Health Division
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
A. SUMMARY This dataset contains COVID-19 positive confirmed cases aggregated by several different geographic areas and by day. COVID-19 cases are mapped to the residence of the individual and shown on the date the positive test was collected. In addition, 2016-2020 American Community Survey (ACS) population estimates are included to calculate the cumulative rate per 10,000 residents.

Dataset covers cases going back to 3/2/2020, when testing began. This data may not be immediately available for recently reported cases, and data will change as new information becomes available. Data is updated daily.

Geographic areas summarized are: 1. Analysis Neighborhoods 2. Census Tracts 3. Census Zip Code Tabulation Areas

B. HOW THE DATASET IS CREATED Addresses from the COVID-19 case data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area for a given date.

The 2016-2020 American Community Survey (ACS) population estimates provided by the Census are used to create a cumulative rate equal to ([cumulative count up to that date] / [acs_population]) * 10000, representing the total number of cases per 10,000 residents (as of the specified date).
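The rate formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the handling of a missing population estimate are assumptions, not part of the SFDPH pipeline.

```python
from typing import Optional


def cumulative_rate_per_10k(cumulative_count: int, acs_population: int) -> Optional[float]:
    """Cumulative cases per 10,000 residents.

    Algebraically the same as (cumulative_count / acs_population) * 10000;
    multiplying first keeps the arithmetic exact for whole-number inputs.
    """
    if acs_population <= 0:
        # Assumption: without a usable population estimate, no rate is reported.
        return None
    return cumulative_count * 10000 / acs_population


# Example: 250 cumulative cases in an area of 50,000 residents.
print(cumulative_rate_per_10k(250, 50000))
```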

COVID-19 case data undergo quality assurance and other data verification processes and are continually updated to maximize completeness and accuracy of information. This means data may change for previous days as information is updated.

C. UPDATE PROCESS Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 05:00 Pacific Time.

D. HOW TO USE THIS DATASET San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset can be used to track the spread of COVID-19 throughout the city, in a variety of geographic areas. Note that the new cases column in the data represents the number of new cases confirmed in a certain area on the specified day, while the cumulative cases column is the cumulative total of cases in a certain area as of the specified date.

Privacy rules in effect To protect privacy, certain rules are in effect:
1. Any area with a cumulative case count less than 10 is dropped for all days the cumulative count was less than 10. These will be null values.
2. Once an area has a cumulative case count of 10 or greater, that area will have a new row of case data every day following.
3. Cases are dropped altogether for areas where acs_population < 1000.
4. Deaths data are not included in this dataset for privacy reasons. The low COVID-19 death rate in San Francisco, along with other publicly available information on deaths, means that deaths data by geography and day is too granular and potentially risky. Read more in our privacy guidelines.

Rate suppression in effect where counts are lower than 20 Rates are not calculated unless the cumulative case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach, as this is best practice in epidemiology.
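For readers who want to apply the same thresholds to their own aggregates, the privacy and rate-suppression rules described above might be sketched as follows. The function name, the dictionary keys, and the choice to return `None` for a dropped area are hypothetical; only the thresholds (10, 1000, and 20) come from the description.

```python
from typing import Optional

MIN_COUNT = 10            # cumulative counts below this are published as null
MIN_POPULATION = 1000     # areas with a smaller acs_population are dropped
MIN_COUNT_FOR_RATE = 20   # rates are only calculated at or above this count


def publishable_row(cumulative_count: int, acs_population: int) -> Optional[dict]:
    """Return the values that would be safe to publish for one area and day."""
    if acs_population < MIN_POPULATION:
        return None  # area dropped altogether
    if cumulative_count < MIN_COUNT:
        return {"cumulative_cases": None, "rate_per_10k": None}
    rate = None
    if cumulative_count >= MIN_COUNT_FOR_RATE:
        rate = cumulative_count * 10000 / acs_population
    return {"cumulative_cases": cumulative_count, "rate_per_10k": rate}


for count, population in [(5, 5000), (15, 5000), (25, 5000), (500, 900)]:
    print(count, population, publishable_row(count, population))
```

A count of 15 is published without a rate, which is the case the rate-suppression note is meant to cover.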

A note on Census ZIP Code Tabulation Areas (ZCTAs) ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. Read how the Census develops ZCTAs on their website.

Rows included for Citywide case counts Rows are included for the Citywide case counts and incidence rate every day. These Citywide rows can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongoing data quality efforts result in improved mapping on a rolling basis.

Related dataset See the dataset of the most recent cumulative counts for all geographic areas here: https://data.sfgov.org/COVID-19/COVID-19-Cases-and-Deaths-Summarized-by-Geography/tpyr-dvnc

E. CHANGE LOG
9/11/2023 - data on COVID-19 cases by geography over time are no longer being updated. This data is currently through 9/6/2023 and will not include any new data after this date.
4/6/2023 - the State implemented system updates to improve the integrity of historical data.
2/21/2023 - system updates to improve reliability and accuracy of cases data were implemented.
1/31/2023 - updated “acs_population” column to reflect the 2020 Census Bureau American Community Survey (ACS) San Francisco Population estimates.
1/31/2023 - implemented system updates to streamline and improve our geo-coded data, resulting in small shifts in our case data by geography.
1/31/2023 - renamed column “last_updated_at” to “data_as_of”.
1/31/2023 - removed the “multipolygon” column. To access the multipolygon geometry column for each geography unit, refer to COVID-19 Cases and Deaths Summarized by Geography.
1/22/2022 - system updates to improve timeliness and accuracy of cases and deaths data were implemented.
4/16/2021 - dataset updated to refresh with a five-day data lag.
Johns Hopkins COVID-19 Case Tracker
data.world
csv, zip
Updated Jul 2, 2025
The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 2, 2025
Authors
The Associated Press
Description
Updates

Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

CDC Weekly case and death counts (national and state level)

CDC County level cases and deaths

HHS New hospital admissions

CDC NowCast COVID variant proportions (national and regional level)

April 9, 2020

The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.

April 20, 2020

Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.

April 29, 2020

The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.

September 1st, 2020

Johns Hopkins is now providing counts for the five New York City counties individually.

February 12, 2021

The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."

Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.

February 16, 2021

Johns Hopkins has reconciled Ohio's historical deaths data with the state.

Overview

The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

The AP is updating this dataset hourly at 45 minutes past the hour.

To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

Queries

Use AP's queries to filter the data or to join it to other datasets we've made available to help cover the coronavirus pandemic.

Filter cases by state here

Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac

Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true

Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.

Pull the 100 counties with the highest per-capita confirmed cases here

Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
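The hotspot queries above rely on a 7-day rolling average of new cases per capita. AP's own queries live on data.world and are not reproduced here; the following is a minimal Python sketch of the same idea, with a hypothetical function name and the rate expressed per 100,000 people to match the dataset's rate columns.

```python
from typing import List, Optional


def rolling_avg_per_100k(new_cases: List[int], population: int,
                         window: int = 7) -> List[Optional[float]]:
    """Rolling mean of daily new cases, scaled to a rate per 100,000 people.

    Days with fewer than `window` observations behind them get None.
    """
    rates: List[Optional[float]] = []
    for i in range(len(new_cases)):
        if i + 1 < window:
            rates.append(None)
            continue
        window_mean = sum(new_cases[i + 1 - window:i + 1]) / window
        rates.append(window_mean * 100_000 / population)
    return rates


# Eight days of made-up counts for a county of 100,000 people.
print(rolling_avg_per_100k([7, 7, 7, 7, 7, 7, 7, 14], 100_000))
```

As the caveat about very small counties suggests, a per-capita ranking built this way is best read alongside the raw counts.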

Interactive

The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

https://datawrapper.dwcdn.net/nRyaf/15/

Interactive Embed Code

<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>

Caveats

This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.

In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.

In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county".

This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.

Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.

The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
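To see why revised cumulative snapshots cause inconsistencies, it can help to difference them into daily counts. The sketch below is an illustration only: how AP derives its new_deaths and new-case columns is not stated here, and both function names are hypothetical.

```python
from typing import List


def daily_new_counts(cumulative: List[int]) -> List[int]:
    """Difference consecutive cumulative snapshots into daily new counts."""
    new_counts = []
    previous = 0
    for total in cumulative:
        new_counts.append(total - previous)
        previous = total
    return new_counts


def revised_days(cumulative: List[int]) -> List[int]:
    """Indexes of days where the cumulative total fell, hinting at a source revision."""
    return [i for i, n in enumerate(daily_new_counts(cumulative)) if n < 0]


snapshots = [10, 15, 15, 12, 20]  # made-up cumulative totals
print(daily_new_counts(snapshots))
print(revised_days(snapshots))
```

A negative daily value does not mean cases were "un-reported" on that day; it means the source lowered its cumulative total, so the day is worth flagging rather than plotting as-is.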

Attribution

This data should be credited to the Johns Hopkins University COVID-19 tracking project.
ACS 5 Year Data by Ward
catalog.data.gov
data.cityofchicago.org
+1more
Updated Jun 7, 2025
data.cityofchicago.org (2025). ACS 5 Year Data by Ward [Dataset]. https://catalog.data.gov/dataset/acs-5-year-data-by-ward
Explore at:
Dataset updated
Jun 7, 2025
Dataset provided by
data.cityofchicago.org
Description
Selected variables from the most recent 5-year American Community Survey (ACS) (released 2023), aggregated by Ward. Additional years will be added as they become available. The underlying algorithm to create the dataset calculates the percent of a census tract that falls within the boundaries of a given ward. Given that census tracts and ward boundaries are not aligned, these figures should be considered an estimate.

Total Population in this Dataset: 2,649,803
Total Population of Chicago reported by ACS 2023: 2,664,452
% Difference: -0.55%

There are different approaches in common use for displaying Hispanic or Latino population counts. In this dataset, following the approach taken by the Census Bureau, a person who identifies as Hispanic or Latino will also be counted in the race category with which they identify. However, again following the Census Bureau data, there is also a column for White Not Hispanic or Latino.

The City of Chicago is actively soliciting community input on how best to represent race, ethnicity, and related concepts in its data and policy. Every dataset, including this one, has a "Contact dataset owner" link in the Actions menu. You can use it to offer any input you wish to share or to indicate if you would be interested in participating in live discussions the City may host.

Code can be found here: https://github.com/Chicago/5-Year-ACS-Survey-Data
Ward Shapefile: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Wards-2023-Map/cdf7-bgn3
Census Area Python Package Documentation: https://census-area.readthedocs.io/en/latest/index.html
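The apportionment idea described above (weighting each tract by the share of it that falls inside a ward) can be sketched in plain Python. This is not the City's code, which is in the linked repository; the function names, the input shapes, and the toy tract and ward IDs are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def apportion_to_wards(tract_population: Dict[str, float],
                       overlap_share: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Allocate tract populations to wards by the fraction of each tract inside each ward.

    overlap_share maps (tract_id, ward_id) to a fraction between 0 and 1.
    """
    ward_totals: Dict[str, float] = {}
    for (tract_id, ward_id), share in overlap_share.items():
        ward_totals[ward_id] = ward_totals.get(ward_id, 0.0) + tract_population[tract_id] * share
    return ward_totals


def percent_difference(estimate: float, reference: float) -> float:
    """Signed difference of the estimate from the reference, in percent."""
    return (estimate - reference) / reference * 100


# Toy example: tract A lies wholly in ward 1; tract B is split 25% / 75%.
wards = apportion_to_wards(
    {"A": 1000, "B": 2000},
    {("A", "1"): 1.0, ("B", "1"): 0.25, ("B", "2"): 0.75},
)
print(wards)

# The totals quoted in the description reproduce the stated difference.
print(round(percent_difference(2_649_803, 2_664_452), 2))
```

Area weighting assumes residents are spread evenly within a tract, which is one reason the ward figures are estimates rather than counts.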
ARCHIVED: COVID-19 Cases and Deaths Summarized by Geography
catalog.data.gov
Updated Mar 29, 2025
data.sfgov.org (2025). ARCHIVED: COVID-19 Cases and Deaths Summarized by Geography [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-summarized-by-geography
Explore at:
Dataset updated
Mar 29, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY Medical provider confirmed COVID-19 cases and confirmed COVID-19 related deaths in San Francisco, CA aggregated by several different geographic areas and normalized by 2016-2020 American Community Survey (ACS) 5-year estimates for population data to calculate rate per 10,000 residents.

On September 12, 2021, a new case definition of COVID-19 was introduced that includes criteria for enumerating new infections after previous probable or confirmed infections (also known as reinfections). A reinfection is defined as a confirmed positive PCR lab test more than 90 days after a positive PCR or antigen test. The first reinfection case was identified on December 7, 2021.

Cases and deaths are both mapped to the residence of the individual, not to where they were infected or died. For example, if one was infected in San Francisco at work but lives in the East Bay, that case is not counted as an SF case; if one dies in Zuckerberg San Francisco General but is from another county, that death is also not counted in this dataset.

Dataset is cumulative and covers cases going back to 3/2/2020 when testing began.

Geographic areas summarized are: 1. Analysis Neighborhoods 2. Census Tracts 3. Census Zip Code Tabulation Areas

B. HOW THE DATASET IS CREATED Addresses from medical data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area. The 2016-2020 American Community Survey (ACS) population estimates provided by the Census are used to create a rate equal to ([count] / [acs_population]) * 10000, representing the number of cases per 10,000 residents.

C. UPDATE PROCESS Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 7:30 Pacific Time.

D. HOW TO USE THIS DATASET San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

Privacy rules in effect To protect privacy, certain rules are in effect:
1. Case counts greater than 0 and less than 10 are dropped - these will be null (blank) values.
2. Death counts greater than 0 and less than 10 are dropped - these will be null (blank) values.
3. Cases and deaths are dropped altogether for areas where acs_population < 1000.

Rate suppression in effect where counts are lower than 20 Rates are not calculated unless the case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach, as this is best practice in epidemiology.

A note on Census ZIP Code Tabulation Areas (ZCTAs) ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. Read how the Census develops ZCTAs on their website.

Row included for Citywide case counts, incidence rate, and deaths A single row is included that has the Citywide case counts and incidence rate. This can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongo
Census of Population and Housing, 2000 [United States]: Summary File 2,...
search.gesis.org
Updated Feb 16, 2021
United States Department of Commerce. Bureau of the Census (2021). Census of Population and Housing, 2000 [United States]: Summary File 2, Advance National - Archival Version [Dataset]. http://doi.org/10.3886/ICPSR13288
Explore at:
Unique identifier
https://doi.org/10.3886/ICPSR13288
Dataset updated
Feb 16, 2021
Dataset provided by
ICPSR - Interuniversity Consortium for Political and Social Research
GESIS search
Authors
United States Department of Commerce. Bureau of the Census
License
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de446233
Area covered
United States
Description
Abstract (en): Summary File 2 contains 100-percent United States decennial Census data, which is the information compiled from the questions asked of all people and about every housing unit. Population items include sex, age, race, Hispanic or Latino origin, household relationship, and group quarters occupancy. Housing items include occupancy status, vacancy status, and tenure (owner-occupied or renter-occupied). The 100-percent data are presented in 36 population tables ("PCT") and 11 housing tables ("HCT") down to the census tract level. Each table is iterated for 250 population groups: the total population, 132 race groups, 78 American Indian and Alaska Native tribe categories (reflecting 39 individual tribes), and 39 Hispanic or Latino groups. The presentation of tables for any of the 250 population groups is subject to a population threshold of 100 or more people -- that is, if there were fewer than 100 people in a specific population group in a specific geographic area, their population and housing characteristics data are not available for that geographic area.

ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Created variable labels and/or value labels.

All persons in housing units in United States in 2000.

2013-05-24 Multiple Census data file segments were repackaged for distribution into a single zip archive per dataset. No changes were made to the data or documentation.
2006-01-12 All files were removed from dataset 256 and flagged as study-level files, so that they will accompany all downloads.
2006-01-12 All files were removed from dataset 255 and flagged as study-level files, so that they will accompany all downloads.
2006-01-12 All files were removed from dataset 254 and flagged as study-level files, so that they will accompany all downloads.
2006-01-12 All files were removed from dataset 253 and flagged as study-level files, so that they will accompany all downloads.
2006-01-12 All files were removed from dataset 252 and flagged as study-level files, so that they will accompany all downloads.

The data are provided in four segments (files) per iteration. These segments are PCT1-PCT4, PCT5-PCT19, PCT20-PCT36, and HCT1-HCT11. The iterations are Parts 1-250, the Geographic Header file is Part 251. The Geographic Header file is in fixed-format ASCII and the Table files are in comma-delimited ASCII format. The Geographic Header file has 85 variables, Segment 01 has 224 variables, Segment 02 has 240 variables, Segment 03 has 179 variables, and Segment 04 has 141 variables. When all the segments are merged there are 849 variables.
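The per-file variable counts sum to 869, not 849, so the merged total implies that some columns are shared across files. The description does not say which; the sketch below simply assumes that each of the four table segments repeats five linking columns already present in the Geographic Header, an assumption that happens to reproduce the stated figure.

```python
from typing import List

GEO_HEADER_VARIABLES = 85
SEGMENT_VARIABLES = [224, 240, 179, 141]  # Segments 01-04, as listed above
SHARED_KEY_COLUMNS = 5  # assumption: linking columns repeated in every segment


def merged_variable_count(header: int, segments: List[int], shared_keys: int) -> int:
    """Variables left after merging, counting each shared key column only once."""
    return header + sum(count - shared_keys for count in segments)


print(GEO_HEADER_VARIABLES + sum(SEGMENT_VARIABLES))  # simple sum of all files
print(merged_variable_count(GEO_HEADER_VARIABLES, SEGMENT_VARIABLES, SHARED_KEY_COLUMNS))
```

Checking the codebook for the actual key columns is still necessary before joining the files.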
Data from: Populations Past Data: Demographic and Socio-economic Data for...
datacatalogue.cessda.eu
Updated May 27, 2025
Reid, A; Jaadla, H; Garrett, E; Schurer, K (2025). Populations Past Data: Demographic and Socio-economic Data for Registration Sub-districts of England and Wales, 1851-1911, and Registration Districts of Scotland, 1851-1901 [Dataset]. http://doi.org/10.5255/UKDA-SN-857758
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-857758
Dataset updated
May 27, 2025
Dataset provided by
University of Edinburgh
University of Cambridge
Authors
Reid, A; Jaadla, H; Garrett, E; Schurer, K
Area covered
England, Scotland
Variables measured
Geographic Unit, Time unit
Measurement technique
These data were derived from existing data sources - see answer to Data sourcing, processing and preparation for more details. The data covers the entire nations of England and Wales (1851-1871) and Scotland (1851-1901).
Description
This dataset contains a variety of demographic measures (related to fertility, marriage, mortality and migration), plus a range of socio-economic indicators (related to households, age structure, and social class) for the 2000+ Registration Sub Districts (RSDs) in England and Wales for each census year between 1851 and 1911, and for the 600+ Registration Districts of Scotland 1851-1901. The measures have mainly been derived from the computerised individual level census enumerators' books (and household schedules for 1911) enhanced under the I-CeM project. I-CeM does not currently include data for England and Wales 1871; although the project has been able to access a version of the data for that year, it does not contain the information necessary to calculate many of the variables presented here. Scotland 1911 is also not available. Users should therefore be aware that 1871 does not contain data for many of the variables. Additional data has been derived from the tables summarising numbers of births and deaths by year and areas, which were published by the Registrar General of England and Wales in his quarterly, annual and decennial reports of births, deaths and marriages. Data from the decennial reports was obtained from Woods (SN 3552) and we transcribed data from the quarterly and annual reports ourselves. Counts of births and deaths for Scottish Registration Districts were obtained from the Digitising Scotland project at the University of Edinburgh. The dataset builds on SN 8613 and SN 853547, which provide data for a more limited set of variables and for England and Wales only (the same dataset also has two UKDS SN numbers as it was re-routed by UKDS during the deposit process).
This project will present the first historic population geography of Great Britain during the late nineteenth century. This was a period of unprecedented demographic change, when both mortality and fertility started the dramatic secular declines of the first demographic transition. National trends are well established: mortality decline started in childhood and early adulthood, with infant mortality lagging behind, particularly in urban-industrial areas. The fall in fertility was led by the middle classes but quickly spread throughout society. Urban growth was fuelled by movement from the countryside to the city, but there was also considerable migration overseas, particularly from Scotland, although to some extent outmigration was offset by immigration. There was local and regional variation in these patterns, and a contrast between the demographic experiences of Scotland and of England and Wales. Marriage was later in Scotland but fertility within marriage higher, and the improvement in Scottish mortality was slower than that south of the border. However, while there has been research on local and regional patterns within each country, these have mainly been pursued separately, and it is therefore unclear whether there were real national differences or whether there were local demographic continuities across borders, and if so whether they followed economic, occupational, cultural or even linguistic lines. Understanding population processes involves a holistic appreciation of the interaction between the basic demographic components of fertility, mortality, nuptiality and migration, and how they come together, interacting with economic and cultural processes, to create a specific demographic system via the spread of people and ideas. 
This project is the first to consider a historical population geography of the whole of Great Britain across the first demographic transition, drawing together measures of nuptiality, fertility, mortality and migration for small geographic areas and unpacking how they interacted to produce the more readily available broad-brush national patterns for Scotland and for England and Wales.

We will build on our immensely successful project on the fertility of Victorian England and Wales, which used complete count census data for England and Wales to calculate more detailed fertility measures than ever previously possible for some 2000 small geographic areas and 8 social groups, allowing the investigation of intra-urban as well as urban-rural differences in fertility. The new measures allowed us to examine age patterns of fertility across the two countries for the first time. We were also able to calculate contextual variables from the census data which allowed us to undertake spatial analysis of the influences on fertility over time. As well as academic papers, our previous project presented summary data at a fine spatial resolution in an interactive online atlas, populationspast.org, a major new resource which is already being widely used as a teaching tool in both schools and universities.

In this new project we will calculate comparable measures of fertility and contextual variables using the full count census data for Scotland, 1851 to 1901 inclusive, to complement those for England and Wales....
Fire statistics data tables
gov.uk
s3.amazonaws.com
Updated Apr 17, 2025
Ministry of Housing, Communities and Local Government (2025). Fire statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire-statistics-data-tables
Explore at:
Dataset updated
Apr 17, 2025
Dataset provided by
GOV.UK
Authors
Ministry of Housing, Communities and Local Government
Description

On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.

This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.

MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England but some (0101, 0103, 0201, 0501, 1401) tables are for Great Britain split by nation. In the past the Department for Communities and Local Government (who previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (https://www.nifrs.org/home/about-us/publications/).

If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@homeoffice.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.

Related content

Fire statistics guidance
Fire statistics incident level datasets

Incidents attended

FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 126 KB): https://assets.publishing.service.gov.uk/media/67fe79e3393a986ec5cf8dbe/FIRE0101.xlsx
Previous FIRE0101 tables

FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 1.56 MB): https://assets.publishing.service.gov.uk/media/67fe79fbed87b81608546745/FIRE0102.xlsx
Previous FIRE0102 tables

FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 156 KB): https://assets.publishing.service.gov.uk/media/67fe7a20694d57c6b1cf8db0/FIRE0103.xlsx
Previous FIRE0103 tables

FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 331 KB): https://assets.publishing.service.gov.uk/media/67fe7a40ed87b81608546746/FIRE0104.xlsx
Previous FIRE0104 tables

Dwelling fires attended

FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet): https://assets.publishing.service.gov.uk/media/67fe7a5f393a986ec5cf8dc0/FIRE0201.xlsx

data.sfgov.org (2025). [Archived] COVID-19 Deaths by Population Characteristics Over Time [Dataset]. https://healthdata.gov/dataset/-Archived-COVID-19-Deaths-by-Population-Characteri/hs5f-amst

[Archived] COVID-19 Deaths by Population Characteristics Over Time

Explore at:

Available download formats: csv, json, xml, application/rssxml, tsv, application/rdfxml

Dataset updated

Apr 8, 2025

Dataset provided by

data.sfgov.org

Description

As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.

A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists (https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf). Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.
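The suppression rule above can be sketched in a few lines of Python. This is only one reading of the rule, assuming it means that no data is shown for dates on which the running citywide total is still below five. The function name `visible_dates`, the input shape, and the sample counts are all hypothetical and are not taken from the dataset.

```python
from itertools import accumulate


def visible_dates(citywide_new_deaths, threshold=5):
    """Return the dates on which data would be shown.

    citywide_new_deaths: list of (date, new_deaths) tuples in date order.
    A date is visible once the running citywide total reaches the threshold.
    """
    dates = [date for date, _ in citywide_new_deaths]
    running_totals = accumulate(count for _, count in citywide_new_deaths)
    return [date for date, total in zip(dates, running_totals) if total >= threshold]


# Made-up counts for illustration only (not real data).
sample = [
    ("2020-03-01", 1),
    ("2020-03-02", 2),
    ("2020-03-03", 2),
    ("2020-03-04", 0),
]
print(visible_dates(sample))
```

With these made-up counts the running totals are 1, 3, 5 and 5, so only the last two dates pass the threshold.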

Data notes on each population characteristic type are listed below.

Race/ethnicity
* We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender
* The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date.

New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.
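As a minimal sketch of how the filtering and the two measures fit together, the Python below filters rows to one characteristic type and derives cumulative deaths as a running total of new deaths within each characteristic group. The key names (`characteristic_type`, `characteristic_group`, `new_deaths`), the function `cumulative_by_group`, and the sample rows are assumptions for illustration; check the column names in the downloaded file before adapting it.

```python
from collections import defaultdict
from itertools import accumulate

# Made-up rows for illustration only; key names are assumed, not the dataset's schema.
rows = [
    {"date": "2021-01-02", "characteristic_type": "Age Group", "characteristic_group": "60-69", "new_deaths": 2},
    {"date": "2021-01-01", "characteristic_type": "Age Group", "characteristic_group": "60-69", "new_deaths": 1},
    {"date": "2021-01-01", "characteristic_type": "Age Group", "characteristic_group": "70-79", "new_deaths": 3},
    {"date": "2021-01-03", "characteristic_type": "Age Group", "characteristic_group": "60-69", "new_deaths": 0},
    {"date": "2021-01-01", "characteristic_type": "Gender", "characteristic_group": "Female", "new_deaths": 4},
]


def cumulative_by_group(rows, characteristic_type):
    """Map each group to a list of (date, new_deaths, cumulative_deaths)."""
    by_group = defaultdict(list)
    for row in rows:
        # Keep only the topic area of interest, e.g. "Age Group".
        if row["characteristic_type"] == characteristic_type:
            by_group[row["characteristic_group"]].append((row["date"], row["new_deaths"]))

    result = {}
    for group, series in by_group.items():
        series.sort()  # ISO-formatted dates sort chronologically as strings
        totals = accumulate(count for _, count in series)
        result[group] = [(date, count, total) for (date, count), total in zip(series, totals)]
    return result


age = cumulative_by_group(rows, "Age Group")
for group, series in sorted(age.items()):
    print(group, series)
```

Because counts for earlier dates may be revised, a running total computed this way can differ from a previously downloaded cumulative column.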

This data may not be immediately available for more recent deaths. Data updates as more information becomes available.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG

9/11/2023 - on this date, we began using an updated definition of a COVID-19 death to align with the California Department o

