100+ datasets found

ARCHIVED: Mpox Vaccinations Given to SF Residents by Demographics
data.sfgov.org
healthdata.gov
+2 more
application/rdfxml +5
Updated Jan 1, 2023
Cite
(2023). ARCHIVED: Mpox Vaccinations Given to SF Residents by Demographics [Dataset]. https://data.sfgov.org/Health-and-Social-Services/ARCHIVED-Mpox-Vaccinations-Given-to-SF-Residents-b/fk8q-nu3s
Explore at:
Available download formats: csv, json, application/rdfxml, application/rssxml, tsv, xml
Dataset updated
Jan 1, 2023
Area covered
San Francisco
Description
In early February 2024, we will be retiring the Mpox Vaccinations Given to SF Residents by Demographics dataset. This dataset will be archived and will no longer be updated. A historic record of this data will remain available.

A. SUMMARY This dataset represents doses of mpox vaccine (JYNNEOS) administered in California to residents of San Francisco ages 18 years or older. This dataset only includes doses of the JYNNEOS vaccine given on or after 5/1/2022. All vaccines given to people who live in San Francisco are included, no matter where the vaccination took place. The data are broken down by multiple demographic stratifications.

B. HOW THE DATASET IS CREATED Information on doses administered to those who live in San Francisco is from the California Immunization Registry (CAIR2), run by the California Department of Public Health (CDPH). Information on individuals’ city of residence, age, race, ethnicity, and sex is recorded in CAIR2 and is self-reported at the time of vaccine administration. Because CAIR2 does not include information on sexual orientation, we pull information from the San Francisco Department of Public Health’s Epic Electronic Health Record (EHR). The populations represented in our Epic data and the CAIR2 data are different. Epic data only include vaccinations administered at SFDPH-managed sites to SF residents.

Data notes for population characteristic types are listed below.

Age * Data only include individuals who are 18 years of age or older.

Race/ethnicity * The response option "Other Race" is categorized by the data source system, and the response option "Unknown" refers to a lack of data.

Sex * The response option "Other" is categorized by the source system, and the response option "Unknown" refers to a lack of data.

Sexual orientation * The response option “Unknown/Declined” refers to a lack of data or individuals who reported multiple different sexual orientations during their most recent interaction with SFDPH.

For convenience, we provide the 2020 5-year American Community Survey population estimates.

C. UPDATE PROCESS Updated daily via automated process.

D. HOW TO USE THIS DATASET This dataset includes many different types of demographic groups. Filter the “demographic_group” column to explore a topic area. Then, the “demographic_subgroup” column shows each group or category within that topic area and the total count of doses administered to that population subgroup.
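The filtering workflow described above can be sketched in Python using only the standard library. This is a minimal illustration, not the publisher's tooling: the sample rows are hypothetical, and `total_doses` is an assumed name for the dose-count column. Only `demographic_group` and `demographic_subgroup` come from the description.

```python
import csv
import io

# Hypothetical sample rows; the real export has more columns and rows.
# "total_doses" is an assumed name for the dose-count column.
SAMPLE_CSV = """demographic_group,demographic_subgroup,total_doses
Age,18-24,120
Age,25-34,340
Sex,Female,50
Sex,Male,400
"""


def doses_by_subgroup(csv_text, group):
    """Return {subgroup: doses} for the rows of one demographic group."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["demographic_subgroup"]: int(row["total_doses"])
        for row in reader
        if row["demographic_group"] == group
    }


age_doses = doses_by_subgroup(SAMPLE_CSV, "Age")
print(age_doses)
```

Filtering on one group at a time matters because each group is a separate breakdown of the same doses, so summing across groups would count doses more than once.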

E. CHANGE LOG
UPDATE 1/3/2023: Due to low case numbers, this page will no longer include vaccinations after 12/31/2022.
Dataset for: "The effects of skin tone on photoacoustic imaging and...
repository.cam.ac.uk
bin, txt, zip
Updated Aug 16, 2023
Cite
Else, Thomas; Hacker, Lina; Groehl, Janek; Bunce, Ellie; Tao, Ran; Bohndiek, Sarah (2023). Dataset for: "The effects of skin tone on photoacoustic imaging and oximetry" [Dataset]. http://doi.org/10.17863/CAM.100220
Explore at:
Available download formats: txt (8698 bytes), bin (6406 bytes), zip (13490 bytes), zip (125813 bytes)
Unique identifier
https://doi.org/10.17863/CAM.100220
Dataset updated
Aug 16, 2023
Dataset provided by
University of Cambridge
Apollo
Authors
Else, Thomas; Hacker, Lina; Groehl, Janek; Bunce, Ellie; Tao, Ran; Bohndiek, Sarah
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains the data required for the paper "The effects of skin tone on photoacoustic imaging and oximetry". The figures in the paper can all be reproduced using code available on GitHub: https://github.com/BohndiekLab/melanin-phantom-simulation-paper. Please follow the README available at this link to do so.

The aim of the study is to understand how photoacoustic imaging is affected by skin colour. To do so, we ran photoacoustic simulations of a forearm and a cylindrical blood flow phantom, imaged a blood-flow phantom in a commercial photoacoustic imaging system (iThera inVision, iThera Medical GmbH) and imaged pigmented mice. Further details can be obtained in the associated publication. We particularly looked at how blood oximetry was affected by skin tone.
Pittsburgh American Community Survey Data 2015 - Household Types
data.wprdc.org
catalog.data.gov
+1 more
csv
Updated May 21, 2023
Cite
City of Pittsburgh (2023). Pittsburgh American Community Survey Data 2015 - Household Types [Dataset]. https://data.wprdc.org/dataset/pittsburgh-american-community-survey-data-household-types
Explore at:
Available download formats: csv
Dataset updated
May 21, 2023
Dataset provided by
City of Pittsburgh
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Pittsburgh
Description
The data on relationship to householder were derived from answers to Question 2 in the 2015 American Community Survey (ACS), which was asked of all people in housing units. The question on relationship is essential for classifying the population information on families and other groups. Information about changes in the composition of the American family, from the number of people living alone to the number of children living with only one parent, is essential for planning and carrying out a number of federal programs.

The responses to this question were used to determine the relationships of all persons to the householder, as well as household type (married couple family, nonfamily, etc.). From responses to this question, we were able to determine numbers of related children, own children, unmarried partner households, and multi-generational households. We calculated average household and family size. When relationship was not reported, it was imputed using the age difference between the householder and the person, sex, and marital status.

Household – A household includes all the people who occupy a housing unit. (People not living in households are classified as living in group quarters.) A housing unit is a house, an apartment, a mobile home, a group of rooms, or a single room that is occupied (or if vacant, is intended for occupancy) as separate living quarters. Separate living quarters are those in which the occupants live separately from any other people in the building and which have direct access from the outside of the building or through a common hall. The occupants may be a single family, one person living alone, two or more families living together, or any other group of related or unrelated people who share living arrangements.

Average Household Size – A measure obtained by dividing the number of people in households by the number of households. In cases where people in households are cross-classified by race or Hispanic origin, people in the household are classified by the race or Hispanic origin of the householder rather than the race or Hispanic origin of each individual.

Average household size is rounded to the nearest hundredth.
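The definition above reduces to a single division followed by rounding. A minimal sketch in Python, assuming hypothetical counts that are not taken from the Pittsburgh data; the function name is illustrative only.

```python
def average_household_size(people_in_households, households):
    """Divide people in households by households, rounded to the nearest hundredth."""
    if households <= 0:
        raise ValueError("households must be positive")
    return round(people_in_households / households, 2)


# Hypothetical counts, not taken from the Pittsburgh data.
# People living in group quarters are excluded from the numerator.
size = average_household_size(305_000, 136_000)
print(size)
```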

Comparability – The relationship categories for the most part can be compared to previous ACS years and to similar data collected in the decennial census, CPS, and SIPP. With the change in 2008 from “In-law” to the two categories of “Parent-in-law” and “Son-in-law or daughter-in-law,” caution should be exercised when comparing data on in-laws from previous years. “In-law” encompassed any type of in-law such as sister-in-law. Combining “Parent-in-law” and “son-in-law or daughter-in-law” does not represent all “in-laws” in 2008.

The same can be said of comparing the three categories of “biological” “step,” and “adopted” child in 2008 to “Child” in previous years. Before 2008, respondents may have considered anyone under 18 as “child” and chosen that category. The ACS includes “foster child” as a category. However, the 2010 Census did not contain this category, and “foster children” were included in the “Other nonrelative” category. Therefore, comparison of “foster child” cannot be made to the 2010 Census. Beginning in 2013, the “spouse” category includes same-sex spouses.

Hate Crimes

kaggle.com

Updated Jul 7, 2024

Cite

Melissa Monfared (2024). Hate Crimes [Dataset]. https://www.kaggle.com/datasets/melissamonfared/hate-crimes

Explore at:

Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Jul 7, 2024

Dataset provided by

Kaggle

Authors

Melissa Monfared

Description

Overview:

This dataset contains detailed information on cases where a hate or bias crime has been reported to the Bloomington Police Department. Hate crimes are criminal offenses motivated by bias against race, religion, ethnicity, sexual orientation, gender identity, or other protected characteristics. This dataset provides insights into the nature and demographics of hate crimes in Bloomington, aiding in understanding and addressing these incidents.

Dataset Details:

The dataset includes the following columns:

Column Name	Description	API Field Name	Data Type
case_number	Case Number	case_number	Text
date	Date	date	Floating Timestamp
weekday	Day of Week	day_of_week	Text
victims	Total Number of Victims	victims	Number
victim_race	Victim Race	victim_race	Text
victim_gender	Victim Gender	victim_gender	Text
victim_type	Victim Type	victim_type	Text
offenders	Total Number of Offenders	offenders	Number
offender_race	Offender Race	offender_race	Text
offender_gender	Offender Gender	offender_gender	Text
offense	Offense / Crime	offense	Text
location_type	Offense / Crime Location Type	location_type	Text
motivation	Offense/Crime Bias Motivation	motivation	Text

Key Features:

Comprehensive Crime Data: Provides detailed information on hate crimes, including demographics of victims and offenders, types of offenses, and bias motivations.
Temporal Analysis: Includes timestamps for each incident, allowing for analysis of trends over time.
Demographic Insights: Offers data on race and gender of both victims and offenders, helping to identify patterns and target interventions.
Location Information: Contains details about the type of location where the offense occurred, useful for spatial analysis and preventive measures.
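As one illustration of the temporal analysis mentioned above, incidents can be counted per year and bias motivation. This is a minimal sketch under stated assumptions: the sample rows and motivation labels are hypothetical placeholders, and the ISO 8601 date format is an assumption about how the `date` column is exported. The column names `case_number`, `date`, and `motivation` come from the table above.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical rows; the motivation labels are placeholders, and the
# ISO 8601 timestamp format is an assumption about the export.
SAMPLE_CSV = """case_number,date,motivation
A-001,2021-03-14T00:00:00,Bias type A
A-002,2021-07-02T00:00:00,Bias type B
A-003,2022-01-20T00:00:00,Bias type A
"""


def incidents_per_year(csv_text):
    """Count incidents keyed by (year, motivation)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = datetime.fromisoformat(row["date"]).year
        counts[(year, row["motivation"])] += 1
    return counts


counts = incidents_per_year(SAMPLE_CSV)
for (year, motivation), n in sorted(counts.items()):
    print(year, motivation, n)
```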

Usage:

This dataset can be used for:

Crime Analysis: Analyzing trends and patterns in hate crimes to inform law enforcement strategies and policies.
Community Safety: Identifying high-risk areas and times to improve community policing and preventive measures.
Research and Advocacy: Supporting academic research and advocacy efforts focused on combating hate crimes and promoting social justice.
Policy Development: Assisting policymakers in developing targeted initiatives to reduce hate crimes and support affected communities.

Data Maintenance:

Last Updated: July 7, 2024
Source: Bloomington Police Department Data Portal
Revisions: The dataset is annually updated to ensure the inclusion of the latest incidents and to maintain data accuracy. Historical data is preserved to support long-term analyses.

Additional Notes

Data Accuracy: The Bloomington Police Department strives for accuracy in open data; however, errors may occur due to the nature of data collection from multiple sources.
Data Interpretation: Users should be aware that the dataset may change over time as new information becomes available or corrections are made.
Race and District Codes: The dataset uses specific codes for race and reading districts, which are detailed in the accompanying documentation to ensure proper interpretation.
License: Open Data Commons Public Domain Dedication and License

Ethnic Organizations Online (EO2) Dataset
datacatalogue.cessda.eu
search.gesis.org
Updated Feb 29, 2024
Cite
Gremler, Frederik; Weidmann, Nils (2024). Ethnic Organizations Online (EO2) Dataset [Dataset]. http://doi.org/10.7802/2612
Explore at:
Unique identifier
https://doi.org/10.7802/2612
Dataset updated
Feb 29, 2024
Dataset provided by
Universität Konstanz
Authors
Gremler, Frederik; Weidmann, Nils
Description
With the increasing relevance of ethnic groups as political actors, the literature has attempted to identify and study the ethnic organizations representing these groups. How do these organizations use digital communication channels to reach their domestic and international audiences? To enable research on these questions, we have developed the Ethnic Organizations Online (EO2) dataset, a new data collection focusing on the online channels that ethnic organizations use. The dataset includes four types of channels: Twitter, Facebook, Instagram, and regular websites. It relies on the Ethnic Power Relations -- Organizations database, and is therefore compatible with an entire family of datasets on ethnic politics. Featuring more than 2,000 online channels used by 265 groups, it allows researchers to study a wide variety of questions related to digital ethnic mobilization.

This repository contains the dataset, codebook, and further information on working with the dataset. A paper titled "Ethnic Politics via Digital Means: Introducing the Ethnic Organizations Online (EO2) Dataset" is forthcoming in Journal of Peace Research.
Brooklyn, New York Population Breakdown By Race (Excluding Ethnicity)...
neilsberg.com
csv, json
Updated Feb 21, 2025
+ more versions
Cite
Neilsberg Research (2025). Brooklyn, New York Population Breakdown By Race (Excluding Ethnicity) Dataset: Population Counts and Percentages for 7 Racial Categories as Identified by the US Census Bureau // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/brooklyn-ny-population-by-race/
Explore at:
Available download formats: json, csv
Dataset updated
Feb 21, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Brooklyn, New York
Variables measured
Asian Population, Black Population, White Population, Some other race Population, Two or more races Population, American Indian and Alaska Native Population, Asian Population as Percent of Total Population, Black Population as Percent of Total Population, White Population as Percent of Total Population, Native Hawaiian and Other Pacific Islander Population, and 4 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the racial categories identified by the US Census Bureau. The population estimates used in this dataset pertain exclusively to the identified racial categories and do not rely on any ethnicity classification. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the population of Brooklyn borough by race. It includes the population of Brooklyn borough across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be utilized to understand the population distribution of Brooklyn borough across relevant racial categories.

Key observations

The percent distribution of Brooklyn borough population by race (across all racial categories recognized by the U.S. Census Bureau): 39.26% are white, 28.99% are Black or African American, 0.59% are American Indian and Alaska Native, 12.04% are Asian, 0.05% are Native Hawaiian and other Pacific Islander, 10.23% are some other race and 8.84% are multiracial.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race: This column displays the racial categories (excluding ethnicity) for the Brooklyn borough

Population: The population of the racial category (excluding ethnicity) in the Brooklyn borough is shown in this column.

% of Total Population: This column displays the percentage distribution of each race as a proportion of Brooklyn borough total population. Please note that the sum of all percentages may not equal 100% due to rounding of values.
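The rounding caveat is easy to see in a small calculation. A minimal sketch in Python, assuming three hypothetical categories with equal counts; neither the category names nor the counts come from the dataset.

```python
def percent_of_total(populations):
    """Return each category's share of the total, rounded to two decimals."""
    total = sum(populations.values())
    return {
        category: round(100 * count / total, 2)
        for category, count in populations.items()
    }


# Hypothetical counts chosen to make the rounding effect visible.
sample = {"Category A": 1, "Category B": 1, "Category C": 1}
shares = percent_of_total(sample)
print(shares)

# Each share rounds to 33.33, so the rounded shares fall just short of 100.
print(sum(shares.values()))
```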

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Brooklyn borough Population by Race & Ethnicity. You can refer the same here
Many, LA annual income distribution by work experience and gender dataset:...
neilsberg.com
csv, json
Updated Feb 27, 2025
+ more versions
Cite
Neilsberg Research (2025). Many, LA annual income distribution by work experience and gender dataset: Number of individuals ages 15+ with income, 2023 // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/bab5065d-f4ce-11ef-8577-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Feb 27, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Louisiana, Many
Variables measured
Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time, Number of males working full time for a given income bracket, Number of males working part time for a given income bracket, Number of females working full time for a given income bracket, Number of females working part time for a given income bracket
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To portray the number of individuals for both the genders (Male and Female), within each income bracket we conducted an initial analysis and categorization of the American Community Survey data. Households are categorized, and median incomes are reported based on the self-identified gender of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type, full-time (FT) and part-time (PT), offering insight into the income landscape within Many. The dataset can be used to examine gender-based income distribution within the Many population, aiding in data analysis and decision-making.

Key observations

Employment patterns: Within Many, among individuals aged 15 years and older with income, there were 598 men and 927 women in the workforce. Among them, 232 men were engaged in full-time, year-round employment, while 366 women were in full-time, year-round roles.

Annual income under $24,999: Of the male population working full-time, 22.41% fell within the income range of under $24,999, while 42.35% of the female population working full-time was represented in the same income bracket.

Annual income above $100,000: 16.38% of men in full-time roles earned incomes exceeding $100,000, while no women in full-time positions earned within this income bracket.

Refer to the research insights for more key observations on more income brackets (Annual income under $24,999, Annual income between $25,000 and $49,999, Annual income between $50,000 and $74,999, Annual income between $75,000 and $99,999, and Annual income above $100,000) and employment types (full-time year-round and part-time).
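The aggregated percentages in the key observations follow from summing bracket counts and dividing by the group total. A minimal sketch in Python under stated assumptions: the brackets here are coarser than the dataset's 20, and the per-bracket split is invented. Only the total of 232 full-time men is taken from the text above, and the split was chosen to be consistent with the quoted percentages.

```python
# Hypothetical full-time male counts keyed by each bracket's upper bound in
# dollars; None marks the open-ended "$100,000 or more" bracket.
# The per-bracket split is invented; only the total of 232 comes from the text.
counts_by_upper_bound = {
    14_999: 20,
    24_999: 32,
    49_999: 80,
    99_999: 62,
    None: 38,
}


def share_at_or_below(counts, limit):
    """Percent of the group in brackets whose upper bound is at or below limit."""
    total = sum(counts.values())
    below = sum(
        n for upper, n in counts.items() if upper is not None and upper <= limit
    )
    return round(100 * below / total, 2)


share = share_at_or_below(counts_by_upper_bound, 24_999)
print(share)
```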

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Income brackets:

$1 to $2,499 or loss

$2,500 to $4,999

$5,000 to $7,499

$7,500 to $9,999

$10,000 to $12,499

$12,500 to $14,999

$15,000 to $17,499

$17,500 to $19,999

$20,000 to $22,499

$22,500 to $24,999

$25,000 to $29,999

$30,000 to $34,999

$35,000 to $39,999

$40,000 to $44,999

$45,000 to $49,999

$50,000 to $54,999

$55,000 to $64,999

$65,000 to $74,999

$75,000 to $99,999

$100,000 or more

Variables / Data Columns

Income Bracket: This column lists 20 income brackets ranging from $1 to $100,000+.

Full-Time Males: The count of males employed full-time year-round and earning within a specified income bracket

Part-Time Males: The count of males employed part-time and earning within a specified income bracket

Full-Time Females: The count of females employed full-time year-round and earning within a specified income bracket

Part-Time Females: The count of females employed part-time and earning within a specified income bracket

Employment type classifications include:

Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.

Part-time: A part-time worker is a person who worked less than 35 hours per week during the previous calendar year.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Many median household income by race. You can refer the same here
Horse Racing Photo Dataset Dataset
paperswithcode.com
Updated Feb 12, 2025
+ more versions
Cite
(2025). Horse Racing Photo Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/horse-racing-photo-dataset
Explore at:
Dataset updated
Feb 12, 2025
Description
This dataset provides a detailed collection of horse race photo finishes from PMU events. It is ideal for machine learning and computer vision research, particularly in image recognition and sports analytics.

Image Resolutions: Each image is captured and made available in three sizes:

Small (200×96 pixels): Optimized for quick, low-resolution analysis.

Medium (450×217 pixels): Balances detail and file size for moderate analysis.

Large (478×230 pixels): High-resolution images for precise, in-depth research.

CSV Metadata:

Accompanying the images, a CSV file contains critical metadata about each race:

Walk Type: Differentiates between Trot and Gallop, fundamental gait types in horse racing, essential for analyzing movement patterns.

Speciality & Discipline: These sub-categories give further details on the race type, providing researchers with more context for analysis.

Rope Direction: The position of the track’s barrier is noted as either on the left or right side of the horses, influencing lane dynamics and photo finish placement.

Weather Conditions: Detailed weather codes allow insights into the environmental conditions affecting race visibility and performance:

P1 – P17: Codes range from sunny and cloudy to adverse weather like thunderstorms and snow, offering a broad spectrum for training models on different environmental factors.

Additional Environmental Factors:

Luminosity: Whether the race occurred during day or night is an important factor, providing training data for models that operate under various lighting conditions, enhancing the dataset’s versatility.

Potential Applications:

This dataset is suitable for numerous machine learning applications:

Photo Finish Analysis: Machine learning models can be trained to detect race winners based on these images.

Environmental Impact Studies: The weather data enables research into how different weather conditions affect race outcomes.

Gait Classification: Using the trot and gallop metadata, researchers can develop algorithms that classify different horse movements automatically.

Why This Dataset Stands Out:

Versatility: With varied image sizes, weather conditions, and gait types, this dataset supports a broad range of research.

Rich Metadata: The dataset provides thorough race information that offers a deeper context for understanding the nuances of horse racing.

This dataset provides an excellent resource for building models related to sports analytics, environmental conditions, and gait classification. The variety in race conditions and photo finish details ensures its suitability for complex machine learning projects.

This dataset is sourced from Kaggle.
COVID-19 Deaths by Population Characteristics
catalog.data.gov
data.sfgov.org
+2 more
Updated Jun 29, 2025
Cite
data.sfgov.org (2025). COVID-19 Deaths by Population Characteristics [Dataset]. https://catalog.data.gov/dataset/covid-19-deaths-by-population-characteristics
Explore at:
Dataset updated
Jun 29, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals may increase or decrease.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists. Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one population characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on select population characteristic types are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a dataset based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2018-2022 5-year American Community Survey (ACS).

This dataset includes several characteristic types. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cumulative deaths.

Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
[Archived] COVID-19 Deaths by Population Characteristics Over Time
healthdata.gov
data.sfgov.org
+1 more
application/rdfxml +5
Updated Apr 8, 2025
Cite
data.sfgov.org (2025). [Archived] COVID-19 Deaths by Population Characteristics Over Time [Dataset]. https://healthdata.gov/dataset/-Archived-COVID-19-Deaths-by-Population-Characteri/hs5f-amst
Explore at:
Available download formats: csv, json, xml, application/rssxml, tsv, application/rdfxml
Dataset updated
Apr 8, 2025
Dataset provided by
data.sfgov.org
Description
As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.

A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists (https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf). Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on each population characteristic type are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date.

New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.
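
As a minimal sketch of how new and cumulative deaths relate, the Python below filters rows to one characteristic type and builds a running total per characteristic group. The row layout and every count are illustrative assumptions, not values taken from the dataset.

```python
from collections import defaultdict

# Hypothetical rows: (date, characteristic_type, characteristic_group, new_deaths).
# The counts are made up purely for illustration.
rows = [
    ("2021-01-01", "Age Group", "65+", 2),
    ("2021-01-01", "Gender", "Female", 1),
    ("2021-01-02", "Age Group", "65+", 1),
    ("2021-01-02", "Age Group", "50-64", 1),
    ("2021-01-03", "Age Group", "65+", 0),
]

def cumulative_by_group(rows, characteristic_type):
    """Return {group: [(date, new, cumulative), ...]} for one characteristic type."""
    running = defaultdict(int)
    result = defaultdict(list)
    # ISO dates sort chronologically, so sorting the tuples orders them by date.
    for date, ctype, group, new in sorted(rows):
        if ctype != characteristic_type:
            continue  # keep only the requested topic area
        running[group] += new
        result[group].append((date, new, running[group]))
    return dict(result)

age_totals = cumulative_by_group(rows, "Age Group")
```

Filtering first matters because the dataset summarizes by one characteristic at a time; adding counts across characteristic types would count the same deaths more than once.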

This data may not be immediately available for more recent deaths. Data updates as more information becomes available.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
9/11/2023 - on this date, we began using an updated definition of a COVID-19 death to align with the California Department o
ARCHIVED: COVID-19 Cases by Population Characteristics Over Time
data.sfgov.org
healthdata.gov
+2more
application/rdfxml +5
Updated Sep 11, 2023
Cite
(2023). ARCHIVED: COVID-19 Cases by Population Characteristics Over Time [Dataset]. https://data.sfgov.org/Health-and-Social-Services/ARCHIVED-COVID-19-Cases-by-Population-Characterist/j7i3-u9ke
Explore at:
Available download formats: xml, csv, json, application/rdfxml, tsv, application/rssxml
Dataset updated
Sep 11, 2023
License
ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
License information was derived automatically
Description
A. SUMMARY This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.

B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases are from:
* Case interviews
* Laboratories
* Medical providers

These multiple streams of data are merged, deduplicated, and undergo data verification processes.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

Gender * The City collects information on gender identity using these guidelines.

Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives.  * This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’.

Sexual orientation * The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable. * The City doesn’t collect or report information about sexual orientation for persons under 12 years of age. * Case investigation interviews transitioned to the California Department of Public Health, Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old. Learn more about our data collection guidelines pertaining to sexual orientation: https://www.sfdph.org/dph/files/PoliciesProcedures/COM9_SexualOrientationGuidelines.pdf

Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

Homelessness Persons are identified as homeless based on several data sources: * self-reported living situation * the location at the time of testing * Department of Public Health homelessness and health databases * Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

Single Room Occupancy (SRO) tenancy * SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces. * The details of a person's living arrangements are verified during case interviews.

Transmission Type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.
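
The transmission rule above can be sketched as a small decision function. This is only an illustration of the stated logic; the function name and its boolean parameters are hypothetical and do not reflect the City's actual data pipeline.

```python
def transmission_category(interviewed, asked_about_contact, reported_contact):
    """Map interview outcomes to the transmission category described in the text.

    All three parameters are hypothetical booleans introduced for this sketch.
    """
    # Cases never interviewed, or not asked the question, are counted as unknown.
    if not interviewed or not asked_about_contact:
        return "Unknown"
    # A "yes" answer means close contact with a known COVID-19 case.
    if reported_contact:
        return "Contact with a known case"
    # A "no" answer is recorded as community transmission.
    return "Community transmission"
```

Checking for the missing interview first keeps a default `False` answer from being misread as community transmission.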

C. UPDATE PROCESS This dataset has been archived and will no longer update as of 9/11/2023.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cases on each date.

New cases are the count of cases within that characteristic group where the positive tests were collected on that specific specimen collection date. Cumulative cases are the running total of all San Francisco cases in that characteristic group up to the specimen collection date listed.

This data may not be immediately available for recently reported cases. Data updates as more information becomes available.

To explore data on the total number of cases, use the ARCHIVED: COVID-19 Cases Over Time dataset.

E. CHANGE LOG
9/11/2023 - data on COVID-19 cases by population characteristics over time are no longer being updated. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.
6/6/2023 - data on cases by transmission type have been removed. See section ARCHIVED DATA for more detail.
5/16/2023 - data on cases by sexual orientation, comorbidities, homelessness, and single room occupancy have been removed. See section ARCHIVED DATA for more detail.
4/6/2023 - the State implemented system updates to improve the integrity of historical data.
2/21/2023 - system updates to improve reliability and accuracy of cases data were implemented.
1/31/2023 - updated “population_estimate” column to reflect the 2020 Census Bureau American Community Survey (ACS) San Francisco Population estimates.
1/5/2023 - data on SNF cases removed. See section ARCHIVED DATA for more detail.
3/23/2022 - ‘Native American’ changed to ‘American Indian or Alaska Native’ to align with the census.
1/22/2022 - system updates to improve timeliness and accuracy of cases and deaths data were implemented.
7/15/2022 - reinfections added to cases dataset. See section SUMMARY for more information on how reinfections are identified.
Violence Reduction - Victim Demographics - Aggregated
data.cityofchicago.org
s.cnmilf.com
+1more
application/rdfxml +5
Updated Jul 13, 2025
Cite
City of Chicago (2025). Violence Reduction - Victim Demographics - Aggregated [Dataset]. https://data.cityofchicago.org/Public-Safety/Violence-Reduction-Victim-Demographics-Aggregated/gj7a-742p
Explore at:
Available download formats: application/rssxml, csv, json, application/rdfxml, xml, tsv
Dataset updated
Jul 13, 2025
Dataset authored and provided by
City of Chicago
Description
This dataset contains aggregate data on violent index victimizations at the quarter level of each year (i.e., January – March, April – June, July – September, October – December), from 2001 to the present (1991 to present for Homicides), with a focus on those related to gun violence. Index crimes are 10 crime types selected by the FBI (codes 1-4) for special focus due to their seriousness and frequency. This dataset includes only those index crimes that involve bodily harm or the threat of bodily harm and are reported to the Chicago Police Department (CPD). Each row is aggregated up to victimization type, age group, sex, race, and whether the victimization was domestic-related. Aggregating at the quarter level provides large enough blocks of incidents to protect anonymity while allowing the end user to observe inter-year and intra-year variation. Any row where there were fewer than three incidents during a given quarter has been deleted to help prevent re-identification of victims. For example, if there were three domestic criminal sexual assaults during January to March 2020, all victims associated with those incidents have been removed from this dataset. Human trafficking victimizations have been aggregated separately due to the extremely small number of victimizations.
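
A minimal sketch of the small-cell suppression described above, in Python. The row keys and counts are hypothetical. The threshold is left as a parameter because the description says "fewer than three" while its own example removes a group of exactly three, so the exact cut-off used by the publisher is an open assumption here.

```python
# Hypothetical aggregated rows; keys and counts are illustrative only.
aggregated = [
    {"quarter": "2020-Q1", "victimization": "ROBBERY", "domestic": False, "count": 41},
    {"quarter": "2020-Q1", "victimization": "CRIMINAL SEXUAL ASSAULT", "domestic": True, "count": 2},
    {"quarter": "2020-Q2", "victimization": "BATTERY", "domestic": True, "count": 3},
]

def suppress_small_cells(rows, min_count=3):
    """Drop rows whose incident count is below min_count to limit re-identification."""
    return [row for row in rows if row["count"] >= min_count]

published = suppress_small_cells(aggregated)
```

With the default threshold, the row with two incidents is dropped and the row with three is kept; passing `min_count=4` would also remove the latter.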

This dataset includes a "GUNSHOT_INJURY_I" column to indicate whether the victimization involved a shooting, showing either Yes ("Y"), No ("N"), or Unknown ("UNKNOWN"). For homicides, injury descriptions are available dating back to 1991, so the "shooting" column will read either "Y" or "N" to indicate whether the homicide was a fatal shooting or not. For non-fatal shootings, data is only available as of 2010. As a result, for any non-fatal shootings that occurred from 2010 to the present, the shooting column will read as “Y.” Non-fatal shooting victims will not be included in this dataset prior to 2010; they will be included in the authorized dataset, but with "UNKNOWN" in the shooting column.

The dataset is refreshed daily, but excludes the most recent complete day to allow CPD time to gather the best available information. Each time the dataset is refreshed, records can change as CPD learns more about each victimization, especially those victimizations that are most recent. The data on the Mayor's Office Violence Reduction Dashboard is updated daily with an approximately 48-hour lag. As cases are passed from the initial reporting officer to the investigating detectives, some recorded data about incidents and victimizations may change once additional information arises. Regularly updated datasets on the City's public portal may change to reflect new or corrected information.

How does this dataset classify victims?

The methodology by which this dataset classifies victims of violent crime differs by victimization type:

Homicide and non-fatal shooting victims: A victimization is considered a homicide victimization or non-fatal shooting victimization depending on its presence in CPD's homicide victims data table or its shooting victims data table. A victimization is considered a homicide only if it is present in CPD's homicide data table, while a victimization is considered a non-fatal shooting only if it is present in CPD's shooting data tables and absent from CPD's homicide data table.

To determine the IUCR code of homicide and non-fatal shooting victimizations, we defer to the incident IUCR code available in CPD's Crimes, 2001-present dataset (available on the City's open data portal). If the IUCR code in CPD's Crimes dataset is inconsistent with the homicide/non-fatal shooting categorization, we defer to CPD's Victims dataset.

For a criminal homicide, the only sensible IUCR codes are 0110 (first-degree murder) or 0130 (second-degree murder). For a non-fatal shooting, a sensible IUCR code must signify a criminal sexual assault, a robbery, or, most commonly, an aggravated battery. In rare instances, the IUCR codes in CPD's Crimes and Victims datasets do not align with the homicide/non-fatal shooting categorization:

In instances where a homicide victimization does not correspond to an IUCR code 0110 or 0130, we set the IUCR code to "01XX" to indicate that the victimization was a homicide but we do not know whether it was a first-degree murder (IUCR code = 0110) or a second-degree murder (IUCR code = 0130).

When a non-fatal shooting victimization does not correspond to an IUCR code that signifies a criminal sexual assault, robbery, or aggravated battery, we enter “UNK” in the IUCR column, “YES” in the GUNSHOT_I column, and “NON-FATAL” in the PRIMARY column to indicate that the victim was non-fatally shot, but the precise IUCR code is unknown.
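
The homicide and non-fatal shooting rules above can be sketched in Python as follows. The function names, the category labels, and the `plausible_shooting_iucr` parameter are hypothetical; in particular, the source does not list which IUCR codes signify criminal sexual assault, robbery, or aggravated battery, so the caller must supply that set.

```python
HOMICIDE_IUCR = {"0110", "0130"}  # first- and second-degree murder, per the text

def classify_victimization(in_homicide_table, in_shooting_table):
    """Homicide if present in the homicide table; non-fatal shooting only if
    present in the shooting table and absent from the homicide table."""
    if in_homicide_table:
        return "HOMICIDE"
    if in_shooting_table:
        return "NON-FATAL SHOOTING"
    return "OTHER"

def resolve_iucr(category, incident_iucr, plausible_shooting_iucr):
    """Apply the fallback codes when the incident IUCR code is not sensible.

    plausible_shooting_iucr is a caller-supplied set of codes (an assumption
    of this sketch; the actual code list is not given in the description).
    """
    if category == "HOMICIDE":
        return incident_iucr if incident_iucr in HOMICIDE_IUCR else "01XX"
    if category == "NON-FATAL SHOOTING":
        return incident_iucr if incident_iucr in plausible_shooting_iucr else "UNK"
    return incident_iucr
```

For example, a victim found in both tables is classified as a homicide, and a homicide whose incident code is neither 0110 nor 0130 receives the placeholder "01XX".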

Other violent crime victims: For other violent crime types, we refer to the IUCR classification that exists in CPD's victim table, with only one exception:

When there is an incident that is associated with no victim with a matching IUCR code, we assume that this is an error. Every crime should have at least 1 victim with a matching IUCR code. In these cases, we change the IUCR code to reflect the incident IUCR code because CPD's incident table is considered to be more reliable than the victim table.

Note: All businesses identified as victims in CPD data have been removed from this dataset.

Note: The definition of “homicide” (shooting or otherwise) does not include justifiable homicide or involuntary manslaughter. This dataset also excludes any cases that CPD considers to be “unfounded” or “noncriminal.”

Note: In some instances, the police department's raw incident-level data and victim-level data that were inputs into this dataset do not align on the type of crime that occurred. In those instances, this dataset attempts to correct mismatches between incident and victim specific crime types. When it is not possible to determine which victims are associated with the most recent crime determination, the dataset will show empty cells in the respective demographic fields (age, sex, race, etc.).

Note: The initial reporting officer usually asks victims to report demographic data. If victims are unable to recall, the reporting officer will use their best judgment. “Unknown” can be reported if it is truly unknown.
Healthcare Worker Migration, New Mexico, 2021
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated May 3, 2023
Cite
New Mexico Community Data Collaborative (2023). Healthcare Worker Migration, New Mexico, 2021 [Dataset]. https://arc-gis-hub-home-arcgishub.hub.arcgis.com/maps/NMCDC::healthcare-worker-migration-new-mexico-2021
Explore at:
Dataset updated
May 3, 2023
Dataset authored and provided by
New Mexico Community Data Collaborative
Description
Dataset, GDB, and Online Map created by Renee Haley, NMCDC, May 2023

DATA ACQUISITION PROCESS

Scope and purpose of project: New Mexico is struggling to maintain its healthcare workforce, particularly in rural areas. This project was undertaken with the intent of looking at flows of healthcare workers into and out of New Mexico at the most granular geographic level possible. This dataset, in combination with others (such as housing cost and availability data), may help us understand where our healthcare workforce is relocating and why.

The most relevant and detailed data on workforce indicators in the United States is housed by the Census Bureau's Longitudinal Employer-Household Dynamics, LEHD, System. Information on this system is available here:

https://lehd.ces.census.gov/

The Job-to-Job flows explorer within this system was used to download the data. Information on the J2J explorer can be found here:

https://j2jexplorer.ces.census.gov/explore.html#1432012

The dataset was built from data queried with the LED Extraction Tool, which allows for the query of more intersectional and detailed data than the explorer. This is a link to the LED extraction tool:

https://ledextract.ces.census.gov/

The geographies used are US Metro areas as determined by the Census, (N=389). The shapefile is named lehd_shp_gb.zip, and can be downloaded under this section of the following webpage: 5.5. Job-to-Job Flow Geographies, 5.5.1. Metropolitan (Complete). A link to the download site is available below:

https://lehd.ces.census.gov/data/schema/j2j_latest/lehd_shapefiles.html

DATA CLEANING PROCESS

This dataset was built from 8 non-intersectional datasets downloaded from the LED Extraction Tool.

Separate datasets were downloaded in order to obtain detailed information on the race, ethnicity, and educational attainment levels of healthcare workers and where they are migrating.

Datasets included information for the four separate quarters of 2021. It was not possible to download annual data, only quarterly. Quarterly data was summed in a later step to derive annual totals for 2021.

4 datasets for healthcare workers moving OUT OF New Mexico, with details on race, ethnicity, and educational attainment, were downloaded. 1 contained information on educational attainment, 2 contained information on 7 racial categories identifying as non-Hispanic, 3 contained information on those same 7 categories also identifying as Hispanic, and 4 contained information for workers identifying as white and Hispanic.

4 datasets for healthcare workers moving INTO New Mexico, with details on race, ethnicity, and educational attainment, were downloaded with the same details outlined above.

Each dataset was cleaned according to Data Template which kept key attributes and discarded excess information. Within each dataset, the J2J Indicators reflecting 6 different types of job migration were totaled in order to simplify analysis, as this information was not needed in detail.
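
As a rough sketch of the totaling steps, the Python below adds the six J2J indicator counts within each quarterly record and then sums the quarters into an annual figure per metro area. The record layout, metro names, and counts are hypothetical; the source performed these steps in MS Excel, so this shows only the arithmetic, not the author's tooling.

```python
from collections import defaultdict

# Indicator names as listed in the variable notes of the description.
J2J_INDICATORS = ("AQHire", "AQHireS", "EE", "EES", "J2J", "J2JS")

# Hypothetical quarterly records with made-up counts.
records = [
    {"metro": "Metro A", "quarter": 1, "AQHire": 5, "AQHireS": 1, "EE": 3, "EES": 0, "J2J": 2, "J2JS": 1},
    {"metro": "Metro A", "quarter": 2, "AQHire": 4, "AQHireS": 0, "EE": 2, "EES": 1, "J2J": 1, "J2JS": 0},
    {"metro": "Metro B", "quarter": 1, "AQHire": 1, "AQHireS": 0, "EE": 1, "EES": 0, "J2J": 0, "J2JS": 0},
]

def total_indicators(record):
    """Collapse the six job-migration indicators into one count."""
    return sum(record[name] for name in J2J_INDICATORS)

def annual_totals(records):
    """Sum the per-quarter totals into one annual total per metro area."""
    totals = defaultdict(int)
    for record in records:
        totals[record["metro"]] += total_indicators(record)
    return dict(totals)

yearly = annual_totals(records)
```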

After cleaning, each set of 4 datasets for workers moving INTO New Mexico was joined. The process was repeated for workers moving OUT OF New Mexico. This resulted in 2 main datasets.

These 2 main datasets still listed all of the variables by each quarter of 2021. Because of this the data was split in JMP, so that attributes of educational attainment, race and ethnicity, of workers migrating by quarter were moved from rows to columns. After this, summary columns for the year of 2021 were derived. This resulted in totals columns for workers identifying as: 6 separate races and all ethnicities, all races and Hispanic, white-Hispanic, and workers of 6 different education levels, reflecting how many workers of each indicator migrated to and from metro areas in New Mexico in 2021.

The data split transposed duplicate rows reflecting differing worker attributes within the same metro area, resulting in one row for each metro area and reflecting the attributes in columns, thus resulting in a mappable dataset.
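
The row-to-column split can be sketched in Python as a simple pivot. The source used JMP for this step; the key names (`metro`, `attribute`, `workers`) and the sample values are assumptions made for illustration.

```python
# Hypothetical long-format rows: one row per metro area and worker attribute.
long_rows = [
    {"metro": "Metro A", "attribute": "E1", "workers": 3},
    {"metro": "Metro A", "attribute": "E2", "workers": 4},
    {"metro": "Metro B", "attribute": "E1", "workers": 1},
]

def pivot_attributes(long_rows):
    """Return one dict per metro area, with each attribute moved into its own column."""
    wide = {}
    for row in long_rows:
        metro_row = wide.setdefault(row["metro"], {"metro": row["metro"]})
        # Add to any existing value so repeated rows for the same attribute are summed.
        metro_row[row["attribute"]] = metro_row.get(row["attribute"], 0) + row["workers"]
    return list(wide.values())

wide_rows = pivot_attributes(long_rows)
```

Attributes that never occur for a metro area are simply absent from its row rather than set to zero, which mirrors the source's concern with keeping null values distinct from zeros.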

The 2 datasets were joined (on Metro Area) resulting in one master file containing information on healthcare workers entering and leaving New Mexico.
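
A minimal sketch of joining the two datasets on metro area, assuming a left join that keeps every outflow row and prefixes columns to keep the two directions apart. The join type, the `out_`/`in_` prefixes, and the sample rows are assumptions; the source states only that the join was on Metro Area.

```python
def join_on_metro(outflow_rows, inflow_rows):
    """Left-join inflow columns onto outflow rows using the 'metro' key."""
    inflow_by_metro = {row["metro"]: row for row in inflow_rows}
    joined = []
    for out_row in outflow_rows:
        merged = {"metro": out_row["metro"]}
        merged.update({f"out_{k}": v for k, v in out_row.items() if k != "metro"})
        # Metro areas without an inflow match get no 'in_' columns (left as missing).
        in_row = inflow_by_metro.get(out_row["metro"], {})
        merged.update({f"in_{k}": v for k, v in in_row.items() if k != "metro"})
        joined.append(merged)
    return joined

# Hypothetical inputs with made-up counts.
outflow = [{"metro": "Metro A", "E1": 3}, {"metro": "Metro B", "E1": 1}]
inflow = [{"metro": "Metro A", "E1": 2}]
master = join_on_metro(outflow, inflow)
```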

Rows (N=389) reflect all of the metro areas across the US, and each state. Rows include the 5 metro areas within New Mexico, and New Mexico State.

Columns (N=99) contain information on worker race, ethnicity and educational attainment, specific to each metro area in New Mexico.

78 of these columns reflect workers of specific attributes moving OUT OF the 5 specific Metro Areas in New Mexico and totals for NM State. This level of detail is intended for analyzing who is leaving what area of New Mexico, where they are going to, and why.

13 columns reflect each worker attribute for healthcare workers moving INTO New Mexico by race, ethnicity and education level. Because all 5 metro areas and New Mexico state are contained in the rows, this information for incoming workers is available by metro area and at the state level. There is less possibility for mapping these attributes, since it was not realistic or possible to create a dataset reflecting all of these variables for every healthcare worker from every metro area in the US also coming into New Mexico (that dataset would have over 1,000 columns and be unmappable). Therefore this dataset is easier to utilize in looking at why workers are leaving the state but also includes detailed information on who is coming in.

The remaining 8 columns contain geographic information.

GIS AND MAPPING PROCESS

The master file was opened in Arc GIS Pro and the shapefile of US Metro Areas was also imported.

The Excel file was joined to the shapefile by Metro Area Name, as the names matched exactly.

The resulting layer was exported as a GDB in order to retain null values which would turn to zeros if exported as a shapefile.

This GDB was uploaded to Arc GIS Online, Aliases were inserted as column header names, and the layer was visualized as desired.

SYSTEMS USED

MS Excel was used for data cleaning, summing NM state totals, and summing quarterly to annual data.

JMP was used to transpose, join, and split data.

ARC GIS Desktop was used to create the shapefile uploaded to NMCDC's online platform.

VARIABLE AND RECODING NOTES

Summary of variables selected for datasets downloaded focused on educational attainment:

J2J Flows by Educational Attainment

Summary of variables selected for datasets downloaded focused on race and ethnicity:

J2J Flows by Race and Ethnicity

Note: Variables in Datasets 1 through 4 were downloaded twice, once for workers coming into New Mexico and once for those leaving NM.

Table columns: Variable; LEHD variable definition; LEHD variable notes; Details or URL for raw data download

Geography Type - State Origin and Destination State

Data downloaded for worker migration into and out of all US States

Geography Type - Metropolitan Areas Origin and Dest Metro Area

Data downloaded for worker migration into and out of all US Metro Areas

NAICS sectors North American Industry Classification System Under Firm Characteristics Only downloaded for Healthcare and Social Assistance Sectors

Other Firm Characteristics No Firm Age / Size Detail Under Firm Characteristics Downloaded data on all firm ages, sizes, and other details.

Worker Characteristics Education, Race, Ethnicity

Non Intersectional data aside from Race / Ethnicity data.

Sex Gender

0 - All Sexes Selected

Age Age

A00 All Ages (14-99)

Education Education Level E0, E1, E2, E3, E4, E5 E0 - All Education Categories, E1 - Less than high school, E2 - High school or equivalent, no college, E3 - Some college or Associate’s degree, E4 - Bachelor's degree or advanced degree, E5 - Educational attainment not available (workers aged 24 or younger)

Dataset 1 All Education Levels, E1, E2, E3, E4, and E5

RACE

A0, A1, A2, A3, A4, A5 OPTIONS: A0 All Races, A1 White Alone, A2 Black or African American Alone, A3 American Indian or Alaska Native Alone, A4 Asian Alone, A5 Native Hawaiian or Other Pacific Islander Alone, A7 Two or More Race Groups

ETHNICITY

A0, A1, A2 OPTIONS: A0 All Ethnicities, A1 Not Hispanic or Latino, A2 Hispanic or Latino

Dataset 2 All Races (A0) and All Ethnicities (A0)

Dataset 3 6 Races (A1 through A5) and All Ethnicities (A0)

Dataset 4 White (A1) and Hispanic or Latino (A2)

Quarter Quarter and Year

Data from all quarters of 2021 to sum into annual numbers; yearly data was not available

Employer type Sector: Private or Governmental

Query included all healthcare sector workflows from all employer types and firm sizes from every quarter of 2021

J2J indicator categories Detailed types of job migration

All options were selected for all datasets and totaled: AQHire, AQHireS, EE, EES, J2J, J2JS. Counts were selected vs. earnings, and data was not seasonally adjusted (unavailable).

NOTES AND RESOURCES

The following resources and documentation were used to navigate the LEHD and J2J Worker Flows system and to answer questions about variables:

https://lehd.ces.census.gov/data/schema/j2j_latest/lehd_public_use_schema.html

https://www.census.gov/history/www/programs/geography/metropolitan_areas.html

https://lehd.ces.census.gov/data/schema/j2j_latest/lehd_csv_naming.html

Statewide (New
Data from: Age-by-Race Specific Crime Rates, 1965-1985: [United States]
catalog.data.gov
icpsr.umich.edu
Updated Mar 12, 2025
Cite
National Institute of Justice (2025). Age-by-Race Specific Crime Rates, 1965-1985: [United States] [Dataset]. https://catalog.data.gov/dataset/age-by-race-specific-crime-rates-1965-1985-united-states-b16aa
Explore at:
Dataset updated
Mar 12, 2025
Dataset provided by
National Institute of Justice (http://nij.ojp.gov/)
Area covered
United States
Description
These data examine the effects on total crime rates of changes in the demographic composition of the population and changes in criminality of specific age and race groups. The collection contains estimates from national data of annual age-by-race specific arrest rates and crime rates for murder, robbery, and burglary over the 21-year period 1965-1985. The data address the following questions: (1) Are the crime rates reported by the Uniform Crime Reports (UCR) data series valid indicators of national crime trends? (2) How much of the change between 1965 and 1985 in total crime rates for murder, robbery, and burglary is attributable to changes in the age and race composition of the population, and how much is accounted for by changes in crime rates within age-by-race specific subgroups? (3) What are the effects of age and race on subgroup crime rates for murder, robbery, and burglary? (4) What is the effect of time period on subgroup crime rates for murder, robbery, and burglary? (5) What is the effect of birth cohort, particularly the effect of the very large (baby-boom) cohorts following World War II, on subgroup crime rates for murder, robbery, and burglary? (6) What is the effect of interactions among age, race, time period, and cohort on subgroup crime rates for murder, robbery, and burglary? (7) How do patterns of age-by-race specific crime rates for murder, robbery, and burglary compare for different demographic subgroups? The variables in this study fall into four categories. The first category includes variables that define the race-age cohort of the unit of observation. The values of these variables are directly available from UCR and include year of observation (from 1965-1985), age group, and race. The second category of variables were computed using UCR data pertaining to the first category of variables. These are period, birth cohort of age group in each year, and average cohort size for each single age within each single group. 
The third category includes variables that describe the annual age-by-race specific arrest rates for the different crime types. These variables were estimated for race, age, group, crime type, and year using data directly available from UCR and population estimates from Census publications. The fourth category includes variables similar to the third group. Data for estimating these variables were derived from available UCR data on the total number of offenses known to the police and total arrests in combination with the age-by-race specific arrest rates for the different crime types.
May township, Washington County, Minnesota annual median income by work...
neilsberg.com
csv, json
Updated Feb 27, 2025
Cite
Neilsberg Research (2025). May township, Washington County, Minnesota annual median income by work experience and sex dataset: Aged 15+, 2010-2023 (in 2023 inflation-adjusted dollars) // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/may-township-washington-county-mn-income-by-gender/
Explore at:
Available download formats: json, csv
Dataset updated
Feb 27, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
May Township, Washington County, Minnesota
Variables measured
Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates. The dataset covers the years 2010 to 2023, representing 14 years of data. To analyze income differences between genders (male and female), we conducted an initial data analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series (R-CPI-U-RS) based on current methodologies. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents median income data over a decade or more for males and females categorized by Total, Full-Time Year-Round (FT), and Part-Time (PT) employment in May township. It showcases annual income, providing insights into gender-specific income distributions and the disparities between full-time and part-time work. The dataset can be utilized to gain insights into gender-based pay disparity trends and explore the variations in income for male and female individuals.

Key observations: Insights from 2023

Based on our analysis of ACS 2019-2023 5-Year Estimates, we present the following observations: - All workers, aged 15 years and older: In May township, the median income for all workers aged 15 years and older, regardless of work hours, was $77,845 for males and $40,227 for females.
These income figures highlight a substantial gender-based income gap in May township. Women, regardless of work hours, earn 52 cents for each dollar earned by men. This significant gender pay gap, approximately 48%, underscores concerning gender-based income inequality in May township.
- Full-time workers, aged 15 years and older: In May township, among full-time, year-round workers aged 15 years and older, males earned a median income of $136,250, while females earned $91,750, leading to a 33% gender pay gap among full-time workers. This illustrates that women earn 67 cents for each dollar earned by men in full-time roles. This level of income gap emphasizes the urgency to address and rectify this ongoing disparity, where women, despite working full-time, face a more significant wage discrepancy compared to men in the same employment roles.
Remarkably, across all roles, including non-full-time employment, women displayed a similar gender pay gap percentage. This indicates a consistent gender pay gap scenario across various employment types in May township, showcasing a consistent income pattern irrespective of employment status.
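
The cents-per-dollar and pay-gap figures quoted above follow directly from the median incomes. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and rounding to whole percentages is an assumption that happens to match the published figures.

```python
def pay_gap(male_median, female_median):
    """Return (cents earned by women per dollar earned by men, pay gap in percent)."""
    ratio = female_median / male_median
    return round(ratio * 100), round((1 - ratio) * 100)

# Median incomes quoted in the key observations.
all_workers = pay_gap(77845, 40227)   # all workers aged 15+
full_time = pay_gap(136250, 91750)    # full-time, year-round workers
```

Here `all_workers` evaluates to (52, 48) and `full_time` to (67, 33), matching the 52 cents / 48% and 67 cents / 33% figures in the text.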

Content

When available, the data consist of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. All incomes have been adjusted for inflation and are presented in 2023 inflation-adjusted dollars.

Gender classifications include:

Male

Female

Employment type classifications include:

Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.

Part-time: A part-time worker is a person who worked fewer than 35 hours per week during the previous calendar year.
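The two employment-type definitions above can be sketched as a small classifier. This is only an illustration of the stated thresholds; the function name and the "other" label are assumptions, used here because a person who worked 35 or more hours per week for fewer than 50 weeks fits neither definition as written.

```python
def employment_type(hours_per_week: float, weeks_worked: int) -> str:
    """Classify a worker using the thresholds stated in the definitions above."""
    if hours_per_week >= 35 and weeks_worked >= 50:
        return "full-time, year-round"
    if hours_per_week < 35:
        return "part-time"
    # Assumption: 35+ hours but fewer than 50 weeks is covered by neither definition.
    return "other"


print(employment_type(40, 52))  # full-time, year-round
print(employment_type(20, 52))  # part-time
print(employment_type(40, 30))  # other
```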

Variables / Data Columns

Year: This column presents the data year. Expected values are 2010 to 2023.

Male Total Income: Annual median income, for males regardless of work hours

Male FT Income: Annual median income, for males working full time, year-round

Male PT Income: Annual median income, for males working part time

Female Total Income: Annual median income, for females regardless of work hours

Female FT Income: Annual median income, for females working full time, year-round

Female PT Income: Annual median income, for females working part time

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for May township median household income by race. You can refer to the same here
Nyc popular baby names
kaggle.com
Updated Jun 20, 2022
Cite
Rahul Sarkar (2022). Nyc popular baby names [Dataset]. https://www.kaggle.com/datasets/rahulsarkar221/nyc-popular-baby-names
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 20, 2022
Dataset provided by
Kaggle
Authors
Rahul Sarkar
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
New York
Description
This dataset contains popular baby names in New York.

Dataset: 1 file (popular-baby-names.csv)

Columns:
- Year of Birth: Year of the baby's birth.
- Gender: Gender of the baby.
- Ethnicity: The ethnicity the baby belongs to.
- Child's First Name: The first name of the child.
- Count: How many babies were given that name.
- Ranking: Ranking of that name.
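A minimal sketch of how these columns might be used, assuming the CSV header names match the column names listed above exactly (the real file may differ) and that counts are summed across ethnicities. The helper names `load_rows` and `top_names` are hypothetical.

```python
import csv
from collections import Counter


def load_rows(path="popular-baby-names.csv"):
    """Read the CSV into a list of dicts keyed by the (assumed) header names."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def top_names(rows, year, gender, n=5):
    """Return the n most common first names for one birth year and gender."""
    totals = Counter()
    for row in rows:
        same_year = row["Year of Birth"].strip() == str(year)
        same_gender = row["Gender"].strip().upper() == gender.upper()
        if same_year and same_gender:
            # Normalize capitalization so "AVA" and "Ava" are counted together.
            name = row["Child's First Name"].strip().title()
            totals[name] += int(row["Count"])
    return totals.most_common(n)
```

With the file downloaded locally, a call such as `top_names(load_rows(), 2015, "FEMALE")` would list the most frequent names for that year; the year and gender values here are placeholders and depend on what the file actually contains.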
Police Race and Identity Based Data - Use of Force - Dataset - CKAN
ckan0.cf.opendata.inter.prod-toronto.ca
Updated Dec 2, 2022
Cite
(2022). Police Race and Identity Based Data - Use of Force - Dataset - CKAN [Dataset]. https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/police-race-and-identity-based-data-use-of-force
Dataset updated
Dec 2, 2022
Description
This dataset contains summary tables of information from the provincial Use of Force Reports and from occurrences that resulted in an enforcement action. The data used to produce these summary tables come from two sources: a) information about enforcement actions, such as calls for service types and occurrence categories, comes from the Service's Records Management System and b) information related to reported use of force, such as highest types of force and perceived weapons, comes from the provincial use of force reports.

The data count unique occurrences which resulted in a police enforcement action or incidents of reported use of force. Hence, there may be more than one person and more than one officer involved in an enforcement action incident or a reported use of force incident. Since the summary tables are of incidents, where there was more than one person, descriptors such as perceived race refer to the composition of the person(s) involved in the enforcement action incident. For example, if the incident involved more than one person, each perceived to be of a different race or gender group, then the incident is categorized as a “multiple race group.”

For the purpose of the race-based data analysis, the data include all incidents which resulted in a police enforcement action and exclude other police interactions with the public, such as taking victim reports, routine traffic or pedestrian stops, or outreach events. Enforcement actions are occurrences where the person(s) involved were arrested resulting in charges (including released at scene) or released without charges; received Provincial Offences Act Part III tickets; summons; cautions; diversions; apprehensions; mental health-related incidents; as well as those identified as “subject” or “suspect” in an incident to which an officer attended.

Reported use of force incidents are those in which a Toronto Police Service officer used force and was required to submit a report under the Police Services Act, 1990. For the purposes of the race-based data analysis, the data exclude reportable incidents in which force was used against animals, team reports, and incidents where an officer unintentionally discharged a Service weapon during training. Each reported use of force incident is counted once, regardless of the number of officers or subjects involved.
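The incident-level categorization described above can be sketched as follows. This is not the Toronto Police Service's actual procedure, only an illustration of the stated rule; the function name, the lower-cased labels, and the "unknown" fallback for incidents with no recorded perception are assumptions.

```python
def incident_race_group(perceived_races: list[str]) -> str:
    """Summarize the perceived race of everyone involved in one incident."""
    distinct = {race.strip().lower() for race in perceived_races if race.strip()}
    if not distinct:
        return "unknown"  # assumption: the source does not describe this case
    if len(distinct) > 1:
        # More than one person, each perceived to be of a different group.
        return "multiple race group"
    return next(iter(distinct))


print(incident_race_group(["Black", "White"]))  # multiple race group
print(incident_race_group(["White", "white"]))  # white
```

Because the summary tables count incidents rather than people, the function returns one label per incident no matter how many people were involved.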
US Cost of Living Dataset (1877 Counties)
kaggle.com
Updated Feb 17, 2024
Cite
asaniczka (2024). US Cost of Living Dataset (1877 Counties) [Dataset]. http://doi.org/10.34740/kaggle/ds/3832881
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34740/kaggle/ds/3832881
Dataset updated
Feb 17, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
asaniczka
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
The US Family Budget Dataset provides insights into the cost of living in different US counties based on the Family Budget Calculator by the Economic Policy Institute (EPI).

This dataset offers community-specific estimates for ten family types, including one or two adults with zero to four children, in all 1877 counties and metro areas across the United States.

Interesting Task Ideas:

See how family budgets compare to the federal poverty line and the Supplemental Poverty Measure in different counties.

Look into the money challenges faced by different types of families using the budgets provided.

Find out which counties have the most affordable places to live, food, transportation, healthcare, childcare, and other things people need.

Explore how the average income of families relates to the overall cost of living in different counties.

Investigate how family size affects the estimated budget and find counties where bigger families have higher costs.

Create visuals showing how the cost of living varies across different states and big cities.

Check whether specific counties are affordable for families of different sizes and types.

Use the dataset to compare living standards and economic security in different US counties.

If you find this dataset valuable, don't forget to hit the upvote button! 😊💝

Check out my other datasets

Employment-to-Population Ratio for USA

Productivity and Hourly Compensation

130K Kindle Books

900K TMDb Movies

USA Unemployment Rates by Demographics & Race

Photo by Alev Takil on Unsplash
Data from: Conference scheduling undermines diversity efforts
data.niaid.nih.gov
datadryad.org
+1more
zip
Updated May 27, 2022
Cite
Nicholas Burnett; Emily King; Mary Salcedo; Richelle Tanner; Kathryn Wilsterman (2022). Conference scheduling undermines diversity efforts [Dataset]. http://doi.org/10.25338/B8C92R
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.25338/B8C92R
Dataset updated
May 27, 2022
Dataset provided by
University of California, Davis
Virginia Tech
University of California, Berkeley
University of Montana
Authors
Nicholas Burnett; Emily King; Mary Salcedo; Richelle Tanner; Kathryn Wilsterman
License
https://spdx.org/licenses/CC0-1.0.html
Description
Scientific conferences incorporate diversity-focused events into their programming to increase their diversity and inclusivity and to improve the conference experience for scientists from underrepresented groups (URGs). While simply adding diversity-focused events to conferences is positive, maximizing their impact requires that conferences organize and schedule these events to minimize well-acknowledged, problematic patterns such as the minority tax. To our knowledge, the programming of diversity-focused events at conferences has not been systematically reviewed to identify the extent of these shortcomings and how they can be addressed. This dataset describes temporal trends in the types of diversity-focused events held at biology conferences, the targeted audiences of those events, and scheduling conflicts that occur with each event.

Methods

Time-series: We gathered publicly available conference programs for the selected biology conferences (Table 1) for the years 2010 through 2019. Not all conferences had programs available for all years, particularly as time from the present increased; thus sample sizes varied across the time series from 17 to 28. Programs were searched for diversity-focused events by both reading through the entire program and conducting keyword searches. The following keywords were used: diversity, gender, female, woman, women, black, race, ethnic*, minorit*, inclusiv*, LGBT*, where asterisks indicate wild-card search terms. For each program, we first scored (yes/no) whether there were any diversity-focused events. We then scored whether each event was “women-focused” - where the event was specific to women; “ethnic/racial minority groups-focused” - where the event was specific to any URG based on ethnicity and/or race; and/or “LGBTQ+-focused” - where the event was specific to any part of the LGBTQ+ community.
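The keyword search with wild-card terms described above could be approximated with a regular expression. This is a sketch, not the authors' tooling: the source does not say what software was used, and whole-word, case-insensitive matching is an assumption, as are the names `keyword_pattern` and `matched_keywords`.

```python
import re

# Keyword list quoted in the methods; a trailing asterisk is a wild card.
KEYWORDS = ["diversity", "gender", "female", "woman", "women", "black",
            "race", "ethnic*", "minorit*", "inclusiv*", "LGBT*"]


def keyword_pattern(keywords):
    """Build one case-insensitive regex that matches any keyword as a whole word."""
    parts = []
    for keyword in keywords:
        if keyword.endswith("*"):
            # Wild card: the stem followed by any further word characters.
            parts.append(re.escape(keyword[:-1]) + r"\w*")
        else:
            parts.append(re.escape(keyword))
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


PATTERN = keyword_pattern(KEYWORDS)


def matched_keywords(text):
    """Return the distinct matched terms in a line of program text, lower-cased."""
    return sorted({match.group(0).lower() for match in PATTERN.finditer(text)})


print(matched_keywords("Inclusive Teaching Workshop"))  # ['inclusive']
print(matched_keywords("Minority Social"))              # ['minority']
```

A keyword hit only flags a candidate; the methods also relied on reading each program in full, so matches would still need manual review.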
Using these scores, we calculated for each calendar year the percent of conferences with (1) any kind of diversity-focused event, (2) women-focused events, (3) ethnic/racial minorities-focused events, and (4) LGBTQ+-focused events.

Table 1. Biology conferences were acquired from a list of societies affiliated with the American Association for the Advancement of Science (https://www.aaas.org/group/60/list-aaas-affiliates). We included a conference if its primary focus was on the biological sciences, regardless of whether the conference was hosted by an academic, professional, or not-for-profit organization. Recent publicly available conference programs were used to examine how conferences incorporated diversity-focused events into their schedules.

Society/Conference | Year analyzed
American Dairy Science Association | 2018
American Ornithological Society | 2018
American Physiological Society | 2018
American Phytopathological Society | 2018
American Society for Horticultural Science | 2018
American Society for Microbiology | 2019
American Society of Agronomy | 2018
American Society of Mammalogists | 2018
American Society of Plant Biologists | 2019
Animal Behavior Society | 2019
Association for the Sciences of Limnology and Oceanography - Ocean Sciences Meeting | 2018
Association of Southeastern Biologists | 2018
Behavior Genetics Association | 2018
Biophysical Society | 2018
Botanical Society of America | 2018
Ecological Society of America | 2019
Entomological Society of America | 2018
International Biometrics Society - Eastern North America | 2018
Microscopy Society of America | 2018
Mycological Society of America | 2017
Phycological Society of America | 2019
Poultry Science Association | 2018
Society for Integrative and Comparative Biology | 2018
Society for Neuroscience | 2018
Society for the Study of Evolution | 2018
Society of American Foresters | 2019
Society of Toxicology | 2018
The Wildlife Society | 2018
Weed Science Society of America | 2018

Survey of event-scheduling and targeted audiences: Using one recent program from each conference (years 2017 through 2019), we searched for diversity-focused events by both reading through the entire program and conducting keyword searches. The keywords used are listed above in the Time Series section. From these searches, we found 87 diversity-focused events from 21 out of the 29 conferences.

Target audience: For each conference, we used the title and any other description of the event to classify the targeted audience as either an underrepresented group (URG) or the broader conference community. For example, events with titles such as “Inclusive Teaching Workshop” were classified as broadly targeted, whereas events with titles such as “Minority Social” were classified as URG-targeted. However, if any event contained the explicit statement that “all are welcome” (or similar), the event was classified as targeted at the broader conference community.

Event format: We also used the titles and other event descriptions to classify the formats of events. Events were classified as socials, workshops, symposia, plenary lectures, forums and town halls, orientations, or poster sessions. The most common events were socials, workshops, and symposia (e.g., “LGBTQ+ Networking Event and Social”, “Workshop for Creating an Inclusive Research Environment”, and “Symposium Honoring the Roles of Women in Microbiology”, respectively).

Breaks or scientific sessions: We used the conference schedule to identify whether each diversity-focused event occurred during a scheduled break versus the main scientific sessions. We defined a break as a period that was either explicitly labeled as a break (e.g., lunch, dinner) or occurred outside the daily start or end of conference-wide scientific events, which included workshops, plenary lectures, poster sessions, and contributed oral presentations.

Number of conflicting events: We used the conference schedule to count the number of events that overlapped with each diversity-focused event for more than 15 minutes. Events were only counted as separate events if they occurred in separate rooms. “Business” events and other closed, invitation-only events were not included in this calculation.

Overlap for an average conference event: Because the baseline number of overlapping events can vary with the size of a conference, we conducted a randomized survey to calculate how many events overlapped with an “average” event at a conference. For each day of a conference, we used a random number generator to identify a single hour with conference activity and counted the number of overlapping events within the first 15 minutes of that hour. The number of events conflicting with an average event was calculated as the total number of overlapping events minus 1. This number was averaged across the different days for each conference. To validate our randomized survey, we also contacted the organizers of each conference to request attendance numbers for the surveyed years; 15 conferences provided this information. Conflict with an average event was strongly correlated with the size of the conference; thus, we concluded that our method of random surveys was a reliable method for quantifying how busy a conference was.
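The conflict count described under "Number of conflicting events" can be sketched in code. The authors counted from conference schedules, and this is only one reading of their rules: the `Event` fields, the `closed` flag, and the choice to count distinct rooms (so that several events in the same room count once) are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    name: str
    room: str
    start: datetime
    end: datetime
    closed: bool = False  # "business" or invitation-only events are excluded


MIN_OVERLAP = timedelta(minutes=15)


def overlap(a: Event, b: Event) -> timedelta:
    """Length of time during which both events are running (zero if none)."""
    return max(min(a.end, b.end) - max(a.start, b.start), timedelta(0))


def count_conflicts(target: Event, schedule: list[Event]) -> int:
    """Count rooms hosting an open event that overlaps the target by more than 15 minutes."""
    rooms = {
        other.room
        for other in schedule
        if not other.closed
        and other.room != target.room
        and overlap(target, other) > MIN_OVERLAP
    }
    return len(rooms)
```

Comparing rooms rather than event names also keeps the target event from being counted against itself when it appears in the schedule.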
A fMRI dataset in response to large number of short natural dynamic facial...
openneuro.org
Updated Oct 10, 2024
Cite
Panpan Chen; Chi Zhang; Bao Li; Li Tong; Linyuan Wang; Shuxiao Ma; Long Cao; Ziya Yu; Bin Yan (2024). A fMRI dataset in response to large number of short natural dynamic facial expression videos [Dataset]. http://doi.org/10.18112/openneuro.ds005047.v1.0.4
Unique identifier
https://doi.org/10.18112/openneuro.ds005047.v1.0.4
Dataset updated
Oct 10, 2024
Dataset provided by
OpenNeuro (https://openneuro.org/)
Authors
Panpan Chen; Chi Zhang; Bao Li; Li Tong; Linyuan Wang; Shuxiao Ma; Long Cao; Ziya Yu; Bin Yan
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Summary

Facial expression is among the most natural ways for human beings to convey emotional information in daily life. Although the neural mechanism of facial expression has been extensively studied using lab-controlled images and a small number of lab-controlled video stimuli, how the human brain processes natural facial expressions still needs to be investigated. To our knowledge, this type of data, specifically on a large number of natural facial expression videos, is currently missing. We describe here the natural Facial Expressions Dataset (NFED), an fMRI dataset including responses to 1,320 short (3-second) natural facial expression video clips. These video clips are annotated with three types of labels: emotion, gender, and ethnicity, along with accompanying metadata. We validate that the dataset has good quality within and across participants and, notably, can capture temporal and spatial stimuli features. NFED provides researchers with fMRI data for understanding the visual processing of a large number of natural facial expression videos.

Data Records

The data, which were structured following the BIDS format [53], were accessible at https://openneuro.org/datasets/ds005047 [54]. The “sub-

Stimulus. Distinct folders store the stimuli for distinct fMRI experiments: "stimuli/face-video", "stimuli/floc", and "stimuli/prf" (Fig. 2b). The category labels and metadata corresponding to video stimuli are stored in the "videos-stimuli_category_metadata.tsv" file. The "videos-stimuli_description.json" file describes category and metadata information of video stimuli (Fig. 2b).

Raw MRI data. Each participant's folder is comprised of 11 session folders: “sub-

Volume data from pre-processing. The pre-processed volume-based fMRI data were in the folder named “pre-processed_volume_data/sub-

Surface data from pre-processing. The pre-processed surface-based data were stored in a file named “volumetosurface/sub-

FreeSurfer recon-all. The results of reconstructing the cortical surface were saved as “recon-all-FreeSurfer/sub-

Surface-based GLM analysis data. We have conducted GLMsingle on the data of the main experiment. There is a file named “sub--

Validation. The code of technical validation was saved in the “derivatives/validation/code” folder. The results of technical validation were saved in the “derivatives/validation/results” folder (Fig. 2h). “README.md” describes the detailed information of code and results.
