100+ datasets found

Coronavirus (Covid-19) Data in the United States
github.com
openicpsr.org
+2more
csv
Cite
New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://github.com/nytimes/covid-19-data
Explore at:
Available download formats: csv
Dataset provided by
New York Times
License
https://github.com/nytimes/covid-19-data/blob/master/LICENSE
Description
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
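Because the files hold cumulative counts, daily new cases have to be derived by differencing consecutive days within each area. The following is a minimal sketch of that step, assuming each record carries a date, a state name and a cumulative case count; the tuple layout and the sample values are illustrative assumptions, not taken from the repository.

```python
from collections import defaultdict


def daily_new_cases(rows):
    """Turn cumulative case counts into per-day increases for each state.

    `rows` is an iterable of (iso_date, state, cumulative_cases) tuples.
    This layout is an assumption made for illustration.
    """
    by_state = defaultdict(list)
    for iso_date, state, cumulative in rows:
        by_state[state].append((iso_date, cumulative))

    result = {}
    for state, series in by_state.items():
        series.sort()  # ISO dates sort chronologically as strings
        previous = 0
        increases = []
        for iso_date, cumulative in series:
            increases.append((iso_date, cumulative - previous))
            previous = cumulative
        result[state] = increases
    return result


# Made-up sample values, not real counts.
sample = [
    ("2020-01-21", "Washington", 1),
    ("2020-01-22", "Washington", 1),
    ("2020-01-23", "Washington", 3),
]
daily = daily_new_cases(sample)
```

If a source revises a cumulative total downward, the difference for that day is negative; how to treat such days is left to the analyst.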
BlogFeedback Data Set
kaggle.com
zip
Updated Jul 15, 2022
Cite
Julio Tentor (2022). BlogFeedback Data Set [Dataset]. https://www.kaggle.com/datasets/jtentor/blogfeedback-data-set
Explore at:
Available download formats: zip (2550651 bytes)
Dataset updated
Jul 15, 2022
Authors
Julio Tentor
Description
Source:

Krisztian Buza Budapest University of Technology and Economics buza '@' cs.bme.hu http://www.cs.bme.hu/~buza

You can download a zip file from https://archive.ics.uci.edu/ml/datasets/BlogFeedback

Data Set Information:

This data originates from blog posts. The raw HTML-documents of the blog posts were crawled and processed.

The prediction task associated with the data is the prediction of the number of comments in the upcoming 24 hours.

In order to simulate this situation, we choose a basetime (in the past) and select the blog posts that were published at most 72 hours before the selected base date/time. Then, we calculate all the features of the selected blog posts from the information that was available at the basetime, therefore each instance corresponds to a blog post. The target is the number of comments that the blog post received in the next 24 hours relative to the base time.

In the train data, the base times were in the years 2010 and 2011. In the test data the base times were in February and March 2012.

This simulates the real-world situation in which training data from the past is available to predict events in the future.

The train data was generated from different base times that may temporally overlap.

Therefore, if you simply split the train into disjoint partitions, the underlying time intervals may overlap.

Therefore, you should use the provided, temporally disjoint train and test splits in order to ensure that the evaluation is fair.

Attribute Information:

1...50: Average, standard deviation, min, max and median of the Attributes 51...60 for the source of the current blog post. With source we mean the blog on which the post appeared. For example, myblog.blog.org would be the source of the post myblog.blog.org/post_2010_09_10

51: Total number of comments before basetime

52: Number of comments in the last 24 hours before the base time

53: Let T1 denote the datetime 48 hours before basetime, and let T2 denote the datetime 24 hours before basetime. This attribute is the number of comments in the time period between T1 and T2

54: Number of comments in the first 24 hours after the publication of the blog post, but before basetime

55: The difference of Attribute 52 and Attribute 53

56...60: The same features as the attributes 51...55, but features 56...60 refer to the number of links (trackbacks), while features 51...55 refer to the number of comments.

61: The length of time between the publication of the blog post and base time

62: The length of the blog post

63...262: The 200 bag of words features for 200 frequent words of the text of the blog post

263...269: Binary indicator features (0 or 1) for the weekday (Monday...Sunday) of the basetime

270...276: Binary indicator features (0 or 1) for the weekday (Monday...Sunday) of the date of publication of the blog post

277: Number of parent pages: we consider a blog post P as a parent of blog post B, if B is a reply (trackback) to blog post P.

278...280: Minimum, maximum, average number of comments that the parents received

281: The target: the number of comments in the next 24 hours (relative to base time)
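The attribute numbers above are 1-based and inclusive, so they need shifting when a row is loaded into a 0-based sequence. The sketch below splits a 281-value row into the groups listed in the description. The group names and the `split_row` helper are hypothetical labels chosen for illustration; only the numeric ranges come from the description.

```python
# 1-based, inclusive attribute ranges as listed in the description.
# The group names are hypothetical labels, not part of the dataset.
ATTRIBUTE_RANGES = {
    "source_statistics": (1, 50),
    "comment_counts": (51, 55),
    "link_counts": (56, 60),
    "post_age": (61, 61),
    "post_length": (62, 62),
    "bag_of_words": (63, 262),
    "basetime_weekday": (263, 269),
    "publication_weekday": (270, 276),
    "parent_count": (277, 277),
    "parent_comment_stats": (278, 280),
    "target": (281, 281),
}


def split_row(row):
    """Split one 281-value row into named groups using 0-based slicing."""
    if len(row) != 281:
        raise ValueError(f"expected 281 values, got {len(row)}")
    # Attribute k (1-based) sits at index k - 1, so the inclusive range
    # (start, end) becomes the slice [start - 1:end].
    return {
        name: row[start - 1:end]
        for name, (start, end) in ATTRIBUTE_RANGES.items()
    }


# Dummy row whose values equal their attribute numbers, to make the
# mapping easy to inspect.
example_row = list(range(1, 282))
groups = split_row(example_row)
```

With the dummy row, `groups["target"]` holds only the last value, and the bag-of-words group has 200 entries.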

Relevant Papers:

Buza, K. (2014). Feedback Prediction for Blogs. In Data Analysis, Machine Learning and Knowledge Discovery (pp. 145-152). Springer International Publishing (http://cs.bme.hu/~buza/pdfs/gfkl2012_blogs.pdf).
Number of data compromises and impacted individuals in U.S. 2005-2024
statista.com
Updated Jul 14, 2025
Cite
Statista (2025). Number of data compromises and impacted individuals in U.S. 2005-2024 [Dataset]. https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed/
Explore at:
Dataset updated
Jul 14, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
In 2024, the number of data compromises in the United States stood at 3,158 cases. Meanwhile, over 1.35 billion individuals were affected in the same year by data compromises, including data breaches, leakage, and exposure. While these are three different events, they have one thing in common: in all three, sensitive data is accessed by an unauthorized threat actor.

Industries most vulnerable to data breaches

Some industry sectors usually see more significant cases of private data violations than others. This is determined by the type and volume of the personal information that organizations in these sectors store. In 2024, financial services, healthcare, and professional services were the three industry sectors that recorded the most data breaches. Overall, the number of healthcare data breaches in some industry sectors in the United States has gradually increased within the past few years. However, some sectors saw a decrease.

Largest data exposures worldwide

In 2020, an adult streaming website, CAM4, experienced a leakage of nearly 11 billion records. This is by far the most extensive reported data leakage. This case, though, is unique because cyber security researchers found the vulnerability before the cyber criminals did. The second-largest data breach is the Yahoo data breach, dating back to 2013. The company first reported about one billion exposed records, then later, in 2017, updated the number of leaked records to three billion. In March 2018, the third-biggest data breach happened, involving India’s national identification database Aadhaar. As a result of this incident, over 1.1 billion records were exposed.
USA Name Data
kaggle.com
zip
Updated Feb 12, 2019
Cite
Data.gov (2019). USA Name Data [Dataset]. https://www.kaggle.com/datasets/datagov/usa-names
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset provided by
Data.gov (https://data.gov/)
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
Context

Cultural diversity in the U.S. has led to great variations in names and naming traditions and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States

Content

This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.

All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.

Fork this kernel to get started with this dataset.

Acknowledgements

https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names

https://cloud.google.com/bigquery/public-data/usa-names

Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @dcp from Unsplash.

Inspiration

What are the most common names?

What are the most common female names?

Are there more female or male names?

Female names by a wide margin?
COVID-19 - Vaccinations by Region, Age, and Race-Ethnicity - Historical
healthdata.gov
data.cityofchicago.org
+2more
application/rdfxml +5
Updated Apr 8, 2025
+ more versions
Cite
data.cityofchicago.org (2025). COVID-19 - Vaccinations by Region, Age, and Race-Ethnicity - Historical [Dataset]. https://healthdata.gov/dataset/COVID-19-Vaccinations-by-Region-Age-and-Race-Ethni/gdfz-hxz9
Explore at:
Available download formats: application/rssxml, csv, json, application/rdfxml, tsv, xml
Dataset updated
Apr 8, 2025
Dataset provided by
data.cityofchicago.org
Description
NOTE: This dataset has been retired and marked as historical-only. The recommended dataset to use in its place is https://data.cityofchicago.org/Health-Human-Services/COVID-19-Vaccination-Coverage-Region-HCEZ-/5sc6-ey97.

COVID-19 vaccinations administered to Chicago residents by Healthy Chicago Equity Zones (HCEZ) based on the reported address, race-ethnicity, and age group of the person vaccinated, as provided by the medical provider in the Illinois Comprehensive Automated Immunization Registry Exchange (I-CARE).

Healthy Chicago Equity Zones is an initiative of the Chicago Department of Public Health to organize and support hyperlocal, community-led efforts that promote health and racial equity. Chicago is divided into six HCEZs. Combinations of Chicago’s 77 community areas make up each HCEZ, based on geography. For more information about HCEZs including which community areas are in each zone see: https://data.cityofchicago.org/Health-Human-Services/Healthy-Chicago-Equity-Zones/nk2j-663f

Vaccination Status Definitions:

· People with at least one vaccine dose: Number of people who have received at least one dose of any COVID-19 vaccine, including the single-dose Johnson & Johnson COVID-19 vaccine.

· People with a completed vaccine series: Number of people who have completed a primary COVID-19 vaccine series. Requirements vary depending on age and type of primary vaccine series received.

· People with a bivalent dose: Number of people who received a bivalent (updated) dose of vaccine. Updated, bivalent doses became available in Fall 2022 and were created with the original strain of COVID-19 and newer Omicron variant strains.

Weekly cumulative totals by vaccination status are shown for each combination of race-ethnicity and age group within an HCEZ. Note that each HCEZ has a row where HCEZ is “Citywide” and each HCEZ has a row where age is "All" so care should be taken when summing rows.
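The caution about summing rows can be made concrete: aggregate rows ("Citywide" zone, "All" ages) repeat counts already present in the detailed rows, so a naive sum double counts. The sketch below filters them out first. The field names `hcez`, `age_group` and `count`, and the sample values, are assumptions for illustration and may not match the published column names.

```python
def total_excluding_aggregates(rows):
    """Sum counts over detailed rows only, skipping aggregate rows.

    Rows where the zone is "Citywide" or the age group is "All" are
    treated as aggregates of other rows. Field names are hypothetical.
    """
    return sum(
        row["count"]
        for row in rows
        if row["hcez"] != "Citywide" and row["age_group"] != "All"
    )


# Made-up sample with one zone and two age groups plus aggregate rows.
sample_rows = [
    {"hcez": "West", "age_group": "18-29", "count": 10},
    {"hcez": "West", "age_group": "30-39", "count": 20},
    {"hcez": "West", "age_group": "All", "count": 30},
    {"hcez": "Citywide", "age_group": "18-29", "count": 10},
    {"hcez": "Citywide", "age_group": "30-39", "count": 20},
    {"hcez": "Citywide", "age_group": "All", "count": 30},
]

naive_total = sum(row["count"] for row in sample_rows)
filtered_total = total_excluding_aggregates(sample_rows)
```

In this sample the naive sum is four times the filtered one, because every person appears once in a detailed row and again in three aggregate rows.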

Vaccinations are counted based on the date on which they were administered. Weekly cumulative totals are reported from the week ending Saturday, December 19, 2020 onward (after December 15, when vaccines were first administered in Chicago) through the Saturday prior to the dataset being updated.

Population counts are from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-year estimates.

Coverage percentages are calculated based on the cumulative number of people in each population subgroup (age group by race-ethnicity within an HCEZ) who have each vaccination status as of the date, divided by the estimated number of people in that subgroup.

Actual counts may exceed population estimates and lead to >100% coverage, especially in small race-ethnicity subgroups of each age group within an HCEZ. All coverage percentages are capped at 99%.
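The coverage calculation and the cap described above can be sketched as follows. The function name and parameters are hypothetical; only the division by the population estimate and the 99% cap come from the description.

```python
def coverage_percent(people_count, population_estimate, cap=99.0):
    """Return coverage as a percentage, capped at `cap`.

    Coverage is the cumulative number of people with a given vaccination
    status divided by the estimated subgroup population. Counts can
    exceed the estimate, so the result is capped.
    """
    if population_estimate <= 0:
        raise ValueError("population estimate must be positive")
    return min(100.0 * people_count / population_estimate, cap)


typical = coverage_percent(50, 200)
over_estimate = coverage_percent(120, 100)
```

A capped value therefore only indicates that coverage is at least 99% of the estimate, not the exact ratio.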

All data are provisional and subject to change. Information is updated as additional details are received and it is, in fact, very common for recent dates to be incomplete and to be updated as time goes on. At any given time, this dataset reflects data currently known to CDPH.

Numbers in this dataset may differ from other public sources due to when data are reported and how City of Chicago boundaries are defined.

CDPH uses the most complete data available to estimate COVID-19 vaccination coverage among Chicagoans, but there are several limitations that impact its estimates. Data reported in I-CARE only includes doses administered in Illinois and some doses administered outside of Illinois reported historically by Illinois providers. Doses administered by the federal Bureau of Prisons and Department of Defense are also not currently reported in I-CARE. The Veterans Health Administration began reporting doses in I-CARE beginning September 2022. Due to people receiving vaccinations that are not recorded in I-CARE that can be linked to their record, such as someone receiving a vaccine dose in another state, the number of people with a completed series or a booster dose is underesti
U.S. Facebook data requests from government agencies 2013-2023
statista.com
de.statista.com
+1more
Cite
Stacy Jo Dixon, U.S. Facebook data requests from government agencies 2013-2023 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Explore at:
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
Facebook received 73,390 user data requests from federal agencies and courts in the United States during the second half of 2023. The social network produced some user data in 88.84 percent of requests from U.S. federal authorities. The United States accounts for the largest share of Facebook user data requests worldwide.
Data from: San Francisco Open Data
kaggle.com
zip
Updated Mar 20, 2019
Cite
DataSF (2019). San Francisco Open Data [Dataset]. https://www.kaggle.com/datasets/datasf/san-francisco
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
DataSF
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
San Francisco
Description
Context

DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.

https://datasf.org/about/

Content

This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']

This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services.

This data includes fire unit responses to calls from April 2000 to present and is updated daily. Data contains the call number, incident number, address, unit identifier, call type, and disposition. Relevant time intervals are also included. Because this dataset is based on responses, and most calls involved multiple fire units, there are multiple records for each call number. Addresses are associated with a block number, intersection or call box.

This data includes incidents from the San Francisco Police Department (SFPD) Crime Incident Reporting system, from January 2003 until the present (2 weeks ago from current date). The dataset is updated daily. Please note: the SFPD has implemented a new system for tracking crime. This dataset is still sourced from the old system, which is in the process of being retired (a multi-year process).

This data includes a list of San Francisco Department of Public Works maintained street trees including: planting date, species, and location. Data includes 1955 to present.

This dataset is deprecated and not being updated.

Fork this kernel to get started with this dataset.

Acknowledgements

http://datasf.org/

https://cloud.google.com/bigquery/public-data/sfo-311

https://cloud.google.com/bigquery/public-data/sffd-service-calls

https://cloud.google.com/bigquery/public-data/sfpd-reports

https://cloud.google.com/bigquery/public-data/sfo-trees

Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @meric from Unsplash.

Inspiration

Which neighborhoods have the highest proportion of offensive graffiti?

Which complaint is most likely to be made using Twitter and in which neighborhood?

What are the most complained about Muni stops in San Francisco?

What are the top 10 incident types that the San Francisco Fire Department responds to?

How many medical incidents and structure fires are there in each neighborhood?

What’s the average response time for each type of dispatched vehicle?

Which category of police incidents have historically been the most common in San Francisco?

What were the most common police incidents in the category of LARCENY/THEFT in 2016?

Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?

What is the average tree diameter?

What is the highest number of a particular species of tree planted in a single year?

Which San Francisco locations feature the largest number of trees?

Empathy dataset

zenodo.org

bin, csv, html

Updated Dec 18, 2024

Cite

Zenodo (2024). Empathy dataset [Dataset]. http://doi.org/10.5281/zenodo.7683907

Explore at:

Available download formats: bin, html, csv

Unique identifier

https://doi.org/10.5281/zenodo.7683907

Dataset updated

Dec 18, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

The database for this study (Briganti et al. 2018; the same for the Braun study analysis) was composed of 1973 French-speaking students in several universities or schools for higher education in the following fields: engineering (31%), medicine (18%), nursing school (16%), economic sciences (15%), physiotherapy (4%), psychology (11%), law school (4%) and dietetics (1%). The subjects were 17 to 25 years old (M = 19.6 years, SD = 1.6 years); 57% were female and 43% were male. Even though the full dataset was composed of 1973 participants, only 1270 answered the full questionnaire: missing data are handled using pairwise complete observations in estimating a Gaussian Graphical Model, meaning that all available information from every subject is used.

The feature set is composed of 28 items meant to assess the four following components: fantasy, perspective taking, empathic concern and personal distress. In the questionnaire, the items are mixed; reversed items (items 3, 4, 7, 12, 13, 14, 15, 18, 19) are present. Items are scored from 0 to 4, where “0” means “Doesn’t describe me very well” and “4” means “Describes me very well”; reverse-scoring is calculated afterwards. The questionnaires were anonymized. The reanalysis of the database in this retrospective study was approved by the ethical committee of the Erasmus Hospital.
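The reverse-scoring step mentioned above can be sketched as follows. The description only says that reverse-scoring is calculated afterwards; the rule used here (maximum score minus raw score, so 0 becomes 4 and 4 becomes 0) is the usual convention for a 0 to 4 scale and is an assumption, as are the function name and the dict layout.

```python
# Reversed item numbers as listed in the description.
REVERSED_ITEMS = {3, 4, 7, 12, 13, 14, 15, 18, 19}
MAX_SCORE = 4  # items are scored from 0 to 4


def reverse_score(responses):
    """Apply reverse-scoring to one participant's answers.

    `responses` maps item number (1..28) to a raw score in 0..4, or to
    None for a missing answer. Missing answers are passed through.
    """
    scored = {}
    for item, raw in responses.items():
        if raw is None:
            scored[item] = None
            continue
        if not 0 <= raw <= MAX_SCORE:
            raise ValueError(f"item {item}: score {raw} outside 0..{MAX_SCORE}")
        # Assumed rule: reversed items are mirrored around the scale midpoint.
        scored[item] = MAX_SCORE - raw if item in REVERSED_ITEMS else raw
    return scored


# Made-up answers for four items, one of them missing.
scored_example = reverse_score({1: 3, 3: 0, 4: 4, 7: None})
```

Passing missing answers through unchanged keeps the result compatible with the pairwise-complete handling described for this dataset.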

Size: A dataset of size 1973*28

Number of features: 28

Ground truth: No

Type of Graph: Mixed graph

The following gives the description of the variables:

Feature	FeatureLabel	Domain	Item meaning from Davis 1980
001	1FS	Green	I daydream and fantasize, with some regularity, about things that might happen to me.
002	2EC	Purple	I often have tender, concerned feelings for people less fortunate than me.
003	3PT_R	Yellow	I sometimes find it difficult to see things from the “other guy’s” point of view.
004	4EC_R	Purple	Sometimes I don’t feel very sorry for other people when they are having problems.
005	5FS	Green	I really get involved with the feelings of the characters in a novel.
006	6PD	Red	In emergency situations, I feel apprehensive and ill-at-ease.
007	7FS_R	Green	I am usually objective when I watch a movie or play, and I don’t often get completely caught up in it. (Reversed)
008	8PT	Yellow	I try to look at everybody’s side of a disagreement before I make a decision.
009	9EC	Purple	When I see someone being taken advantage of, I feel kind of protective towards them.
010	10PD	Red	I sometimes feel helpless when I am in the middle of a very emotional situation.
011	11PT	Yellow	I sometimes try to understand my friends better by imagining how things look from their perspective.
012	12FS_R	Green	Becoming extremely involved in a good book or movie is somewhat rare for me. (Reversed)
013	13PD_R	Red	When I see someone get hurt, I tend to remain calm. (Reversed)
014	14EC_R	Purple	Other people’s misfortunes do not usually disturb me a great deal. (Reversed)
015	15PT_R	Yellow	If I’m sure I’m right about something, I don’t waste much time listening to other people’s arguments. (Reversed)
016	16FS	Green	After seeing a play or movie, I have felt as though I were one of the characters.
017	17PD	Red	Being in a tense emotional situation scares me.
018	18EC_R	Purple	When I see someone being treated unfairly, I sometimes don’t feel very much pity for them. (Reversed)
019	19PD_R	Red	I am usually pretty effective in dealing with emergencies. (Reversed)
020	20FS	Green	I am often quite touched by things that I see happen.
021	21PT	Yellow	I believe that there are two sides to every question and try to look at them both.
022	22EC	Purple	I would describe myself as a pretty soft-hearted person.
023	23FS	Green	When I watch a good movie, I can very easily put myself in the place of a leading character.
024	24PD	Red	I tend to lose control during emergencies.
025	25PT	Yellow	When I’m upset at someone, I usually try to “put myself in his shoes” for a while.
026	26FS	Green	When I am reading an interesting story or novel, I imagine how I would feel if the events in the story were happening to me.
027	27PD	Red	When I see someone who badly needs help in an emergency, I go to pieces.
028	28PT	Yellow	Before criticizing somebody, I try to imagine how I would feel if I were in their place

More information about the dataset is contained in empathy_description.html file.

Amount of data created, consumed, and stored 2010-2023, with forecasts to...
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). Amount of data created, consumed, and stored 2010-2023, with forecasts to 2028 [Dataset]. https://www.statista.com/statistics/871513/worldwide-data-created/
Explore at:
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
May 2024
Area covered
Worldwide
Description
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.

Storage capacity also growing

Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status and...
healthdata.gov
odgavaprod.ogopendata.com
+2more
application/rdfxml +5
Updated Jun 16, 2023
+ more versions
Cite
data.cdc.gov (2023). Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status and Booster Dose [Dataset]. https://healthdata.gov/w/pifi-rn2z/default?cur=dU-uRhCR4oE
Explore at:
Available download formats: xml, csv, application/rdfxml, tsv, json, application/rssxml
Dataset updated
Jun 16, 2023
Dataset provided by
data.cdc.gov
Description
Data for CDC’s COVID Data Tracker site on Rates of COVID-19 Cases and Deaths by Vaccination Status.

Dataset and data visualization details: These data were posted on October 21, 2022, archived on November 18, 2022, and revised on February 22, 2023. These data reflect cases among persons with a positive specimen collection date through September 24, 2022, and deaths among persons with a positive specimen collection date through September 3, 2022.

Vaccination status: A person vaccinated with a primary series had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably completing the primary series of an FDA-authorized or approved COVID-19 vaccine. An unvaccinated person had SARS-CoV-2 RNA or antigen detected on a respiratory specimen and has not been verified to have received COVID-19 vaccine. Excluded were partially vaccinated people who received at least one FDA-authorized vaccine dose but did not complete a primary series ≥14 days before collection of a specimen where SARS-CoV-2 RNA or antigen was detected.

Additional or booster dose: A person vaccinated with a primary series and an additional or booster dose had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after receipt of an additional or booster dose of any COVID-19 vaccine on or after August 13, 2021. For people ages 18 years and older, data are graphed starting the week including September 24, 2021, when a COVID-19 booster dose was first recommended by CDC for adults 65+ years old and people in certain populations and high risk occupational and institutional settings. For people ages 12-17 years, data are graphed starting the week of December 26, 2021, 2 weeks after the first recommendation for a booster dose for adolescents ages 16-17 years. For people ages 5-11 years, data are included starting the week of June 5, 2022, 2 weeks after the first recommendation for a booster dose for children aged 5-11 years. For people ages 50 years and older, data on second booster doses are graphed starting the week including March 29, 2022, when the recommendation was made for second boosters. Vertical lines represent dates when changes occurred in U.S. policy for COVID-19 vaccination (details provided above). Reporting is by primary series vaccine type rather than additional or booster dose vaccine type. The booster dose vaccine type may be different than the primary series vaccine type.

Because data on the immune status of cases and associated deaths are unavailable, an additional dose in an immunocompromised person cannot be distinguished from a booster dose. This is a relevant consideration because vaccines can be less effective in this group.

Deaths: A COVID-19–associated death occurred in a person with a documented COVID-19 diagnosis who died; health department staff reviewed to make a determination using vital records, public health investigation, or other data sources. Rates of COVID-19 deaths by vaccination status are reported based on when the patient was tested for COVID-19, not the date they died. Deaths usually occur up to 30 days after COVID-19 diagnosis.

Participating jurisdictions: Currently, these 31 health departments that regularly link their case surveillance to immunization information system data are included in these incidence rate estimates: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Idaho, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Nebraska, New Jersey, New Mexico, New York, New York City (New York), North Carolina, Philadelphia (Pennsylvania), Rhode Island, South Dakota, Tennessee, Texas, Utah, Washington, and West Virginia; 30 jurisdictions also report deaths among vaccinated and unvaccinated people. These jurisdictions represent 72% of the total U.S. population and all ten of the Health and Human Services Regions. Data on cases
COVID-19 Vaccine Progress Dashboard Data by ZIP Code
data.chhs.ca.gov
healthdata.gov
+1more
csv, xlsx, zip
Updated Sep 10, 2025
+ more versions
Cite
California Department of Public Health (2025). COVID-19 Vaccine Progress Dashboard Data by ZIP Code [Dataset]. https://data.chhs.ca.gov/dataset/covid-19-vaccine-progress-dashboard-data-by-zip-code
Explore at:
Available download formats: csv(21567128), csv(5478164), xlsx(7800), csv(27663424), csv(9320174), xlsx(10933), zip
Dataset updated
Sep 10, 2025
Dataset authored and provided by
California Department of Public Health (https://www.cdph.ca.gov/)
Description
Note: In these datasets, a person is defined as up to date if they have received at least one dose of an updated COVID-19 vaccine. The Centers for Disease Control and Prevention (CDC) recommends that certain groups, including adults ages 65 years and older, receive additional doses.

Starting on July 13, 2022, the denominator for calculating vaccine coverage has been changed from age 5+ to all ages to reflect new vaccine eligibility criteria. Previously the denominator was changed from age 16+ to age 12+ on May 18, 2021, then changed from age 12+ to age 5+ on November 10, 2021, to reflect previous changes in vaccine eligibility criteria. The previous datasets based on age 12+ and age 5+ denominators have been uploaded as archived tables.
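The sequence of denominator changes above amounts to a lookup by date. The sketch below encodes it, assuming each change takes effect on the stated date itself and that the 16+ denominator applied to all earlier dates. The function and constant names are hypothetical.

```python
from datetime import date

# Effective dates of the denominator changes given in the description,
# in chronological order.
DENOMINATOR_CHANGES = [
    (date(2021, 5, 18), "12+"),
    (date(2021, 11, 10), "5+"),
    (date(2022, 7, 13), "all ages"),
]


def denominator_for(day):
    """Return the age group used as the coverage denominator on `day`.

    Assumes the 16+ denominator applied before the first listed change.
    """
    label = "16+"
    for effective, new_label in DENOMINATOR_CHANGES:
        if day >= effective:
            label = new_label
    return label


before_first_change = denominator_for(date(2021, 5, 17))
after_last_change = denominator_for(date(2022, 7, 13))
```

This only describes which denominator was in use on a given date; it does not say whether earlier published values were recalculated.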

Starting June 30, 2021, the dataset has been reconfigured so that all updates are appended to one dataset to make it easier for API and other interfaces. In addition, historical data has been extended back to January 5, 2021.

This dataset shows full, partial, and at least 1 dose coverage rates by zip code tabulation area (ZCTA) for the state of California. Data sources include the California Immunization Registry and the American Community Survey’s 2015-2019 5-Year data.

This is the data table for the LHJ Vaccine Equity Performance dashboard. However, this data table also includes ZCTAs that do not have a VEM score.

This dataset also includes Vaccine Equity Metric score quartiles (when applicable), which combine the Public Health Alliance of Southern California’s Healthy Places Index (HPI) measure with CDPH-derived scores to estimate factors that impact health, like income, education, and access to health care. ZCTAs range from less healthy community conditions in Quartile 1 to more healthy community conditions in Quartile 4.

The Vaccine Equity Metric is for weekly vaccination allocation and reporting purposes only. CDPH-derived quartiles should not be considered as indicative of the HPI score for these zip codes. CDPH-derived quartiles were assigned to zip codes excluded from the HPI score produced by the Public Health Alliance of Southern California due to concerns with statistical reliability and validity in populations smaller than 1,500 or where more than 50% of the population resides in a group setting.

These data do not include doses administered by the following federal agencies who received vaccine allocated directly from CDC: Indian Health Service, Veterans Health Administration, Department of Defense, and the Federal Bureau of Prisons.

For some ZCTAs, vaccination coverage may exceed 100%. This may happen when many people from outside the county come to that ZCTA to get their vaccine and providers report the county of administration as the county of residence, and/or when the Department of Finance (DOF) population estimates for that ZCTA are too low. Please note that population numbers provided by DOF are projections and so may not be accurate, especially given unprecedented shifts in population as a result of the pandemic.
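
As a rough illustration of how coverage above 100% can arise, the sketch below divides a count of vaccinated residents by a population estimate and flags areas where the ratio exceeds 1. The area codes, counts, and function names are hypothetical and are not fields from the CDPH table.

```python
from typing import List, Optional, Tuple

def coverage_rate(vaccinated: int, population: int) -> Optional[float]:
    """Return vaccinated / population, or None when no usable denominator exists."""
    if population <= 0:
        return None
    return vaccinated / population

# Hypothetical rows: (area code, people with at least one dose, estimated population)
rows = [
    ("00001", 8200, 10000),
    ("00002", 1560, 1500),  # numerator exceeds a population estimate that is too low
    ("00003", 40, 0),       # missing population estimate
]

def flag_over_100(rows) -> List[Tuple[str, float]]:
    """List areas whose computed coverage exceeds 100%, as (area, percent)."""
    flagged = []
    for area, vaccinated, population in rows:
        rate = coverage_rate(vaccinated, population)
        if rate is not None and rate > 1.0:
            flagged.append((area, round(rate * 100, 1)))
    return flagged

print(flag_over_100(rows))
```

Flagging such rows, rather than capping them at 100%, keeps the underlying numerator and denominator visible to anyone auditing the estimate.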
Immigration system statistics data tables
gov.uk
Updated Aug 21, 2025
Home Office (2025). Immigration system statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables
Dataset updated
Aug 21, 2025
Dataset provided by
GOV.UK
Authors
Home Office
Description
List of the data tables as part of the Immigration system statistics Home Office release. Summary and detailed data tables covering the immigration system, including out-of-country and in-country visas, asylum, detention, and returns.

If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

Accessible file formats

The Microsoft Excel .xlsx files may not be suitable for users of assistive technology.
If you use assistive technology (such as a screen reader) and need a version of these documents in a more accessible format, please email MigrationStatsEnquiries@homeoffice.gov.uk
Please tell us what format you need. It will help us if you say what assistive technology you use.

Related content

Immigration system statistics, year ending June 2025
Immigration system statistics quarterly release
Immigration system statistics user guide
Publishing detailed data tables in migration statistics
Policy and legislative changes affecting migration to the UK: timeline
Immigration statistics data archives

Passenger arrivals

Passenger arrivals summary tables, year ending June 2025 (ODS, 31.3 KB): https://assets.publishing.service.gov.uk/media/689efececc5ef8b4c5fc448c/passenger-arrivals-summary-jun-2025-tables.ods

‘Passengers refused entry at the border summary tables’ and ‘Passengers refused entry at the border detailed datasets’ have been discontinued. The latest published versions of these tables are from February 2025 and are available in the ‘Passenger refusals – release discontinued’ section. A similar data series, ‘Refused entry at port and subsequently departed’, is available within the Returns detailed and summary tables.

Electronic travel authorisation

Electronic travel authorisation detailed datasets, year ending June 2025 (MS Excel Spreadsheet, 57.1 KB): https://assets.publishing.service.gov.uk/media/689efd8307f2cc15c93572d8/electronic-travel-authorisation-datasets-jun-2025.xlsx
ETA_D01: Applications for electronic travel authorisations, by nationality
ETA_D02: Outcomes of applications for electronic travel authorisations, by nationality

Entry clearance visas granted outside the UK

Entry clearance visas summary tables, year ending June 2025 (ODS, 56.1 KB): https://assets.publishing.service.gov.uk/media/68b08043b430435c669c17a2/visas-summary-jun-2025-tables.ods

Entry clearance visa applications and outcomes detailed datasets, year ending June 2025 (MS Excel Spreadsheet, 29.6 MB): https://assets.publishing.service.gov.uk/media/689efda51fedc616bb133a38/entry-clearance-visa-outcomes-datasets-jun-2025.xlsx
Vis_D01: Entry clearance visa applications, by nationality and visa type
Vis_D02: Outcomes of entry clearance visa applications, by nationality, visa type, and outcome

Additional data relating to in country and overseas Visa applications can be fo
Replication Data for: Quantifying Data Capital in Social Media Clout
search.dataone.org
dataverse.harvard.edu
Updated Nov 12, 2023
Tang, Chunlei (2023). Replication Data for: Quantifying Data Capital in Social Media Clout [Dataset]. http://doi.org/10.7910/DVN/MLTKPU
Unique identifier
https://doi.org/10.7910/DVN/MLTKPU
Dataset updated
Nov 12, 2023
Dataset provided by
Harvard Dataverse
Authors
Tang, Chunlei
Description
The data is from two venture capital groups’ Facebook™ pages between October 1, 2016, and September 30, 2018. One is a private group with 18,946 members that was formed on December 3, 2006 and has 25 moderators. The other is a public group with 11,999 members that was formed on May 10, 2008 and has 3 moderators. There was some overlap in membership: 3,952 people participated in both groups. In the private and public groups, 13,384 and 10,876 people, respectively, have initial human relations via invitations. The average invitation count in the private group was 17.0 per member, with a maximum of 5,124 and a minimum of 2 (see Appendix Table 1). The average invitation count in the public group was 6.6 per member, with a maximum of 512 and a minimum of 2. We excluded the moderator with 512 invitations, as this was an outlier. After crawling and scraping the original posts, we obtained two datasets. One consists of 1,419 posts by 600 unique authors in the private group, and the other has 1,409 posts by 502 authors in the public group. The share of members who authored posts is 3.2% (600/18,946) in the private group and 4.2% (502/11,999) in the public group. A total of 110 people published posts in both groups.
People with diabetes who have received nine care processes (CCGOIS 2.4) -...
ckan.publishing.service.gov.uk
Updated Aug 1, 2017
+ more versions
ckan.publishing.service.gov.uk (2017). People with diabetes who have received nine care processes (CCGOIS 2.4) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/people-with-diabetes-who-have-received-nine-care-processes-ccgois-2-4
Dataset updated
Aug 1, 2017
Dataset provided by
CKAN (https://ckan.org/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
The percentage of people with diabetes who have received nine care processes. Current version updated: Mar-17 Next version due: Mar-18
Crowd Counting Dataset
kaggle.com
Updated Feb 16, 2024
+ more versions
Training Data (2024). Crowd Counting Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/crowd-counting-dataset/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 16, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Training Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0/)
License information was derived automatically
Description
Crowd Counting Dataset

The dataset includes images featuring crowds of people ranging from 0 to 5000 individuals. The dataset includes a diverse range of scenes and scenarios, capturing crowds in various settings. Each image in the dataset is accompanied by a corresponding JSON file containing detailed labeling information for each person in the crowd for crowd count and classification.

Image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F4b51a212e59f575bd6978f215a32aca0%2FFrame%2064.png?generation=1701336719197861&alt=media

Types of crowds in the dataset: 0-1000, 1000-2000, 2000-3000, 3000-4000 and 4000-5000

Image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F72e0fed3ad13826d6545ff75a79ed9db%2FFrame%2065.png?generation=1701337622225724&alt=media

This dataset provides a valuable resource for researchers and developers working on crowd counting technology, enabling them to train and evaluate their algorithms with a wide range of crowd sizes and scenarios. It can also be used for benchmarking and comparison of different crowd counting algorithms, as well as for real-world applications such as public safety and security, urban planning, and retail analytics.

The full version of the dataset includes 647 labeled images of crowds. Leave a request on TrainingData to buy the dataset.

Statistics for the dataset (number of images by the crowd's size and image width):

Image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F2e9f36820e62a2ef62586fc8e84387e2%2FFrame%2063.png?generation=1701336725293625&alt=media

OTHER BIOMETRIC DATASETS:

Anti Spoofing Real Dataset

Antispoofing Replay Dataset

Selfies, ID Images dataset (5591 sets of 15 files)

Selfies and video dataset (4 052 sets)

Dataset of bald people, 5000 images

Get the Dataset

This is just an example of the data

Leave a request on https://trainingdata.pro/datasets to learn about the price and buy the dataset

Content

images - includes original images of crowds, placed in subfolders according to crowd size,

labels - includes json-files with labeling and visualised labeling for the images in the previous folder,

csv file - includes information for each image in the dataset

File with the extension .csv

id: id of the image,

image: link to access the original image,

label: link to access the json-file with labeling,

type: type of the crowd on the photo
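
The column layout above can be read with Python's standard csv module. The following is a minimal sketch, assuming a comma-delimited file with a header row; the sample rows and paths are invented for illustration and do not come from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the described layout (id, image, label, type).
SAMPLE = """id,image,label,type
1,images/0-1000/1.jpg,labels/1.json,0-1000
2,images/0-1000/2.jpg,labels/2.json,0-1000
3,images/4000-5000/3.jpg,labels/3.json,4000-5000
"""

def count_by_type(csv_text: str) -> Counter:
    """Count images per crowd-size type using the 'type' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["type"] for row in reader)

counts = count_by_type(SAMPLE)
print(counts)
```

For a file on disk, the same reader can be built from `open(path, newline="")` in place of the in-memory string.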

TrainingData provides high-quality data annotation tailored to your needs

keywords: crowd counting, crowd density estimation, people counting, crowd analysis, image annotation, computer vision, deep learning, object detection, object counting, image classification, dense regression, crowd behavior analysis, crowd tracking, head detection, crowd segmentation, crowd motion analysis, image processing, machine learning, artificial intelligence, ai, human detection, crowd sensing, image dataset, public safety, crowd management, urban planning, event planning, traffic management
COVID-19 Vaccine Progress Dashboard Data
data.chhs.ca.gov
healthdata.gov
+4more
csv, xlsx, zip
Updated Sep 17, 2025
+ more versions
California Department of Public Health (2025). COVID-19 Vaccine Progress Dashboard Data [Dataset]. https://data.chhs.ca.gov/dataset/vaccine-progress-dashboard
Explore at:
Available download formats: csv(638738), csv(724860), csv(83128924), xlsx(11249), csv(675610), csv(18403068), csv(12877811), csv(188895), csv(111682), xlsx(7708), csv(82754), xlsx(11870), csv(110928434), csv(7777694), xlsx(11534), xlsx(11731), csv(148732), zip, csv(303068812), csv(6772350), csv(2447143)
Dataset updated
Sep 17, 2025
Dataset authored and provided by
California Department of Public Health (https://www.cdph.ca.gov/)
Description
Note: In these datasets, a person is defined as up to date if they have received at least one dose of an updated COVID-19 vaccine. The Centers for Disease Control and Prevention (CDC) recommends that certain groups, including adults ages 65 years and older, receive additional doses.

On 6/16/2023 CDPH replaced the booster measures with a new “Up to Date” measure based on CDC’s new recommendations, replacing the primary series, boosted, and bivalent booster metrics. The definition of “primary series complete” has not changed and is based on previous recommendations that CDC has since simplified. A person cannot complete their primary series with a single dose of an updated vaccine. Whereas the booster measures were calculated using the eligible population as the denominator, the new up to date measure uses the total estimated population. Please note that the rates for some groups may change since the up to date measure is calculated differently than the previous booster and bivalent measures.

This data is from the same source as the Vaccine Progress Dashboard at https://covid19.ca.gov/vaccination-progress-data/ which summarizes vaccination data at the county level by county of residence. Where county of residence was not reported in a vaccination record, the county of provider that vaccinated the resident is included. This applies to less than 1% of vaccination records. The sum of county-level vaccinations does not equal statewide total vaccinations due to out-of-state residents vaccinated in California.

These data do not include doses administered by the following federal agencies who received vaccine allocated directly from CDC: Indian Health Service, Veterans Health Administration, Department of Defense, and the Federal Bureau of Prisons.

Totals for the Vaccine Progress Dashboard and this dataset may not match, as the Dashboard totals doses by Report Date and this dataset totals doses by Administration Date. Dose numbers may also change for a particular Administration Date as data is updated.

Previous updates:

On March 3, 2023, with the release of HPI 3.0 in 2022, the previous equity scores have been updated to reflect more recent community survey information. This change represents an improvement to the way CDPH monitors health equity by using the latest and most accurate community data available. The HPI uses a collection of data sources and indicators to calculate a measure of community conditions ranging from the most to the least healthy based on economic, housing, and environmental measures.

Starting on July 13, 2022, the denominator for calculating vaccine coverage has been changed from age 5+ to all ages to reflect new vaccine eligibility criteria. Previously the denominator was changed from age 16+ to age 12+ on May 18, 2021, then changed from age 12+ to age 5+ on November 10, 2021, to reflect previous changes in vaccine eligibility criteria. The previous datasets based on age 16+ and age 5+ denominators have been uploaded as archived tables.

Starting on May 29, 2021, the methodology for calculating on-hand inventory in the shipped/delivered/on-hand dataset has changed. Please see the accompanying data dictionary for details. In addition, this dataset is now available down to the ZIP code level.
Nepal Number Dataset
listtodata.com
.csv, .xls, .txt
Updated Jul 17, 2025
List to Data (2025). Nepal Number Dataset [Dataset]. https://listtodata.com/nepal-dataset
Explore at:
Available download formats: .csv, .xls, .txt
Dataset updated
Jul 17, 2025
Authors
List to Data
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
Jan 1, 2025 - Dec 31, 2025
Area covered
Nepal
Variables measured
phone numbers, email address, full name, address, city, state, gender, age, income, IP address
Description
Nepal Number Dataset is an index of Nepal contact numbers that are 100% accurate and valid. We always double-check to make sure that this record is correct. So, when you use this number library, you can trust that Nepal contact numbers work. And if you ever get an incorrect number, you get a replacement guarantee. This means that if a phone number doesn’t work, they will give you a new number at no extra cost. Moreover, the Nepal Number Dataset is very reliable. Use it with confidence, knowing that you are following the right steps for a smooth, successful outreach effort. The people on the list have agreed to share their mobile numbers. So, you are not breaking any rules when you use this database. And, getting the customer’s consent makes contacting them more welcoming and effective. Nepal phone data is detailed information about Nepal contact numbers. Trusted sources collect this phone data to ensure its reliability. The sources from which this library comes may include websites, government records, and phone service providers. We verify each source, and you can check the URLs where we got the data. This ensures that the mobile data is accurate and reliable. Also, Nepal phone data providers offer 24/7 support. Also, Nepal phone data follows an opt-in policy. This means that people can share their numbers. This is good because it ensures that people know they are using their information. You won’t get in trouble for using contact details without permission. List to Data helps you to find Nepal contact data for your business. Nepal phone number list is a collection of phone numbers of people living in Nepal. You can sort these contact numbers by gender, age, and relationship status. This means that you can only see the amount that matches your needs. For example, if you want to contact young and single people, you can do so. Also, this contact list follows GDPR rules. Also, the Nepal phone number list helps you remove invalid data. 
Sometimes, contact numbers may change or stop working. This list checks this and removes those numbers, so you don’t waste time calling people who don’t answer. Using the Nepal phone number list, you reach the right people. Therefore, you get accurate, current information.
Lebanon Number Dataset
listtodata.com
.csv, .xls, .txt
Updated Jul 17, 2025
List to Data (2025). Lebanon Number Dataset [Dataset]. https://listtodata.com/lebanon-dataset
Explore at:
Available download formats: .csv, .xls, .txt
Dataset updated
Jul 17, 2025
Authors
List to Data
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
Jan 1, 2025 - Dec 31, 2025
Area covered
Lebanon
Variables measured
phone numbers, email address, full name, address, city, state, gender, age, income, IP address
Description
Lebanon Number Data is a list of phone numbers that you can filter in many ways. You can filter by gender, age, or even relationship status. To contact young people, filter the list to show only numbers from that age group. This helps you connect with the right individual quickly. The list also follows GDPR rules, which means it protects people’s privacy. Furthermore, we regularly update the Lebanon Number Data for clarity. It removes invalid numbers, saving time by avoiding outdated contact details. This feature keeps the list fresh and up-to-date and makes your work more efficient. With Lebanon contact data, you can trust accurate, up-to-date details and filter them to meet your needs. Lebanon phone data is a collection of phone numbers that is 100% correct and valid. The companies that provide this data check every number carefully to make sure it works. So, when you use this cellphone data, you don’t have to worry about the wrong numbers. If, for some reason, a number doesn’t work, you get a replacement guarantee. This means, if a number is invalid, they will give you a new dialing number at no extra cost. Moreover, Lebanon phone data comes with all phone number subscribers’ permission. This means the people who own the numbers have agreed to share their information. It’s very important to have this permission because it keeps you out of legal trouble. Using this database without the customer’s permission can be problematic, but this data is safe and secure. Lebanon phone number list is a collection of phone numbers of people living in Lebanon. This list is very helpful for businesses that need to contact people in Lebanon. The information comes from reliable sources, such as government records, websites, and phone service providers. You can even check the URLs where the data came from. This ensures that the phone numbers are accurate. Also, if you need help, 24/7 support is available. Also, the Lebanon phone number list follows the opt-in rule. 
Number owners know that others use their info, making it safe to use the data. You won’t face any trouble, and it respects people’s privacy. Using the Lebanon contact number list from our List to Data website, you can confidently connect with the right people.
Johns Hopkins COVID-19 Case Tracker
data.world
csv, zip
Updated Sep 15, 2025
The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
Explore at:
Available download formats: zip, csv
Dataset updated
Sep 15, 2025
Authors
The Associated Press
Time period covered
Jan 22, 2020 - Mar 9, 2023
Area covered
Description
Updates

Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

CDC Weekly case and death counts (national and state level)

CDC County level cases and deaths

HHS New hospital admissions

CDC NowCast COVID variant proportions (national and regional level)

April 9, 2020

The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.

April 20, 2020

Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.

April 29, 2020

The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
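
A per-100,000 rate of the kind described here is a simple scaling of a raw count by population. The sketch below is one way to compute it; the function name and sample figures are illustrative assumptions, not the AP's code or data.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Scale a raw count to a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return count * 100_000 / population

# Hypothetical county: 50 new cases in a population of 250,000
print(rate_per_100k(50, 250_000))
```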

September 1st, 2020

Johns Hopkins is now providing counts for the five New York City counties individually.

February 12, 2021

The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."

Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.

February 16, 2021

- Johns Hopkins has reconciled Ohio's historical deaths data with the state.

Overview

The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

The AP is updating this dataset hourly at 45 minutes past the hour.

To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

Queries

Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

Filter cases by state here

Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac

Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true

Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.

Pull the 100 counties with the highest per-capita confirmed cases here

Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
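
The hotspot queries above rest on a 7-day rolling average of new cases per capita. The following is a minimal sketch of that calculation in plain Python, assuming a trailing window and a per-100,000 scale; the linked queries may define the window or scale differently, and the sample series is invented.

```python
from collections import deque
from typing import List, Optional

def rolling_mean(values: List[float], window: int = 7) -> List[Optional[float]]:
    """Trailing mean over `window` days; None until a full window is available."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out

def rolling_rate_per_100k(new_cases: List[float], population: int,
                          window: int = 7) -> List[Optional[float]]:
    """Rolling mean of daily new cases, expressed per 100,000 residents."""
    return [None if m is None else m * 100_000 / population
            for m in rolling_mean(new_cases, window)]

# Hypothetical daily new-case counts for an area of 100,000 people
new_cases = [10, 20, 30, 40, 50, 60, 70, 80]
print(rolling_rate_per_100k(new_cases, 100_000))
```

Because small populations inflate per-capita figures, as the note on very small counties warns, it helps to keep the raw rolling mean alongside the rate when ranking.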

Interactive

The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

@(https://datawrapper.dwcdn.net/nRyaf/15/)

Interactive Embed Code

<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>

Caveats

This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.

In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.

In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county"

This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.

Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.

The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

Johns Hopkins timeseries data

- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.

- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
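
Because the timeseries stores cumulative snapshots, daily figures have to be derived by differencing, and a downward revision by a source shows up as a negative day. The sketch below illustrates this with an invented series; it is not how Johns Hopkins or the AP necessarily derive their daily columns.

```python
from typing import List, Optional

def daily_new(cumulative: List[int]) -> List[Optional[int]]:
    """Day-over-day differences of a cumulative series; the first day has no prior value."""
    return [None] + [b - a for a, b in zip(cumulative, cumulative[1:])]

# Hypothetical cumulative counts; the fourth value reflects a downward revision
cumulative = [100, 120, 150, 140, 160]
new = daily_new(cumulative)

# Indexes of days whose derived count is negative, which usually signals a revision
revised_days = [i for i, d in enumerate(new) if d is not None and d < 0]
print(new, revised_days)
```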

Attribution

This data should be credited to Johns Hopkins University COVID-19 tracking project
Healthy People 2020 Final Progress by Population Group Chart and Table
catalog.data.gov
odgavaprod.ogopendata.com
+3more
Updated Apr 23, 2025
+ more versions
Centers for Disease Control and Prevention (2025). Healthy People 2020 Final Progress by Population Group Chart and Table [Dataset]. https://catalog.data.gov/dataset/healthy-people-2020-final-progress-by-population-group-chart-and-table-617d0
Dataset updated
Apr 23, 2025
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Description
[1] The Progress by Population Group analysis is a component of the Healthy People 2020 (HP2020) Final Review. The analysis included subsets of the 1,111 measurable HP2020 objectives that have data available for any of six broad population characteristics: sex, race and ethnicity, educational attainment, family income, disability status, and geographic location. Progress toward meeting HP2020 targets is presented for up to 24 population groups within these characteristics, based on objective data aggregated across HP2020 topic areas. The Progress by Population Group data are also available at the individual objective level in the downloadable data set.

[2] The final value was generally based on data available on the HP2020 website as of January 2020. For objectives that are continuing into HP2030, more recent data will be included on the HP2030 website as it becomes available: https://health.gov/healthypeople.

[3] For more information on the HP2020 methodology for measuring progress toward target attainment and the elimination of health disparities, see: Healthy People Statistical Notes, no 27; available from: https://www.cdc.gov/nchs/data/statnt/statnt27.pdf.

[4] Status for objectives included in the HP2020 Progress by Population Group analysis was determined using the baseline, final, and target value. The progress status categories used in HP2020 were:

a. Target met or exceeded. One of the following applies: (i) At baseline, the target was not met or exceeded, and the most recent value was equal to or exceeded the target (the percentage of targeted change achieved was equal to or greater than 100%); (ii) The baseline and most recent values were equal to or exceeded the target (the percentage of targeted change achieved was not assessed).

b. Improved. One of the following applies: (i) Movement was toward the target, standard errors were available, and the percentage of targeted change achieved was statistically significant; (ii) Movement was toward the target, standard errors were not available, and the objective had achieved 10% or more of the targeted change.

c. Little or no detectable change. One of the following applies: (i) Movement was toward the target, standard errors were available, and the percentage of targeted change achieved was not statistically significant; (ii) Movement was toward the target, standard errors were not available, and the objective had achieved less than 10% of the targeted change; (iii) Movement was away from the baseline and target, standard errors were available, and the percent change relative to the baseline was not statistically significant; (iv) Movement was away from the baseline and target, standard errors were not available, and the objective had moved less than 10% relative to the baseline; (v) No change was observed between the baseline and the final data point.

d. Got worse. One of the following applies: (i) Movement was away from the baseline and target, standard errors were available, and the percent change relative to the baseline was statistically significant; (ii) Movement was away from the baseline and target, standard errors were not available, and the objective had moved 10% or more relative to the baseline.

NOTE: Measurable objectives had baseline data.

SOURCE: National Center for Health Statistics, Healthy People 2020 Progress by Population Group database.
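
The status categories in note [4] can be sketched for the case where standard errors are not available, so that only the 10% thresholds apply. This sketch assumes the percentage of targeted change achieved is (final − baseline) / (target − baseline) × 100, which matches the wording but is not spelled out in the description. The function name and the `higher_is_better` flag are hypothetical, and the statistical-significance branches are omitted.

```python
def classify_progress(baseline: float, final: float, target: float,
                      higher_is_better: bool = True) -> str:
    """Assign a progress status using the 10% rules (no standard errors available)."""
    sign = 1 if higher_is_better else -1
    # Target met or exceeded: the final value is at or beyond the target.
    if sign * (final - target) >= 0:
        return "Target met or exceeded"
    moved = sign * (final - baseline)  # positive means movement toward the target
    if moved == 0:
        return "Little or no detectable change"
    if moved > 0:
        # Assumed formula for the percentage of targeted change achieved.
        achieved = (final - baseline) / (target - baseline) * 100
        return "Improved" if achieved >= 10 else "Little or no detectable change"
    # Movement away from the baseline and target: percent change relative to baseline.
    if baseline == 0:
        raise ValueError("percent change relative to a zero baseline is undefined")
    pct_change = abs(final - baseline) / abs(baseline) * 100
    return "Got worse" if pct_change >= 10 else "Little or no detectable change"

# Hypothetical objective: baseline 40, target 50, final 42 (higher is better)
print(classify_progress(40, 42, 50))
```

The explicit direction flag matters because a baseline can already sit beyond the target, in which case the sign of target minus baseline no longer indicates the desired direction.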



