100+ datasets found
  1. Infant Mortality, Deaths Per 1,000 Live Births (LGHC Indicator)

    • data.ca.gov
    • data.chhs.ca.gov
    • +3 more
    chart, csv, zip
    Updated Nov 7, 2025
    Cite
    California Department of Public Health (2025). Infant Mortality, Deaths Per 1,000 Live Births (LGHC Indicator) [Dataset]. https://data.ca.gov/dataset/infant-mortality-deaths-per-1000-live-births-lghc-indicator
    Available download formats: zip, csv, chart
    Dataset updated
    Nov 7, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a source dataset for a Let's Get Healthy California indicator at https://letsgethealthy.ca.gov/. Infant mortality is defined as the number of deaths in infants under one year of age per 1,000 live births. Infant mortality is often used as an indicator to measure the health and well-being of a community, because factors affecting the health of entire populations can also impact the mortality rate of infants. Although California’s infant mortality rate is better than the national average, there are significant disparities, with African American babies dying at more than twice the rate of other groups.

    Data are from the Birth Cohort Files. The infant mortality indicator computed from the birth cohort file comprises birth certificate information on all births that occur in a calendar year (denominator) plus death certificate information linked to the birth certificate for those infants who were born in that year but subsequently died within 12 months of birth (numerator). Studies of infant mortality that are based on information from death certificates alone have been found to underestimate infant death rates for infants of all race/ethnic groups and especially for certain race/ethnic groups, due to problems such as confusion about event registration requirements, incomplete data, and transfers of newborns from one facility to another for medical care. Note that there is a separate data table, "Infant Mortality by Race/Ethnicity", which is based on death records only and is more timely but less accurate than the Birth Cohort File.

    A single year is shown to provide state-level data and county totals for the most recent year. Numerator: Infant deaths (under age 1 year). Denominator: Live births occurring to California state residents. Multiple years are aggregated to allow for stratification at the county level.

    For this indicator, race/ethnicity is based on the birth certificate information, which records the race/ethnicity of the mother. The mother can “decline to state”; this is considered to be a valid response. These responses are not displayed on the indicator visualization.
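
As a minimal sketch of the rate definition above (deaths under age one divided by live births, scaled to 1,000), the arithmetic could be written as follows. The function name and the example counts are hypothetical and are not taken from the dataset.

```python
def infant_mortality_rate(infant_deaths: int, live_births: int) -> float:
    """Deaths under age one per 1,000 live births.

    Numerator and denominator follow the birth-cohort definition in the
    description: linked infant deaths over live births in the same cohort.
    """
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000.0 * infant_deaths / live_births


# Hypothetical counts, for illustration only (not real California figures).
rate = infant_mortality_rate(infant_deaths=45, live_births=10_000)
print(f"{rate:.1f} deaths per 1,000 live births")
```

Aggregating several years, as the description mentions for county-level stratification, would amount to summing the numerators and denominators before dividing rather than averaging yearly rates.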

  2. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Dec 1, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Available download formats: zip, csv
    Dataset updated
    Dec 1, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Nov 29, 2025
    Area covered
    Description

    THIS DATASET WAS LAST UPDATED AT 7:11 AM EASTERN ON DEC. 1

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state
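
The saved queries named above are not reproduced in this listing. As a rough stand-in, the same kind of yearly count with an optional state or weapon filter might be sketched in Python as follows. The field names (`date`, `state`, `weapon`) and the toy records are assumptions for illustration and are not the dataset's actual schema or contents.

```python
from collections import Counter

# Toy records for illustration only; field names and values are assumptions.
incidents = [
    {"date": "2018-01-01", "state": "TX", "weapon": "shooting"},
    {"date": "2019-01-01", "state": "CA", "weapon": "shooting"},
    {"date": "2019-02-01", "state": "TX", "weapon": "shooting"},
    {"date": "2019-03-01", "state": "CA", "weapon": "other"},
]


def count_by_year(rows, weapon=None, state=None):
    """Count incidents per calendar year, optionally filtered."""
    counts = Counter()
    for row in rows:
        if weapon is not None and row["weapon"] != weapon:
            continue
        if state is not None and row["state"] != state:
            continue
        counts[int(row["date"][:4])] += 1
    return dict(sorted(counts.items()))


print(count_by_year(incidents))                      # all mass killings
print(count_by_year(incidents, weapon="shooting"))   # mass shootings only
print(count_by_year(incidents, state="TX"))          # one state only
```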

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
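
The core of the definition above (four or more victims killed within 24 hours, not counting offenders or unborn children, with intent, inside the 50 states or Washington D.C.) can be expressed as a small predicate. This is only a sketch under stated assumptions: the `Death` record, its fields, and the function name are hypothetical, and the seven-day spree rule for victim counts is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Death:
    """Hypothetical record of one death linked to an incident."""
    time: datetime
    is_offender: bool = False
    is_unborn: bool = False


def meets_definition(deaths, intentional=True, in_us=True, min_victims=4):
    """True if at least `min_victims` qualifying victims died within 24 hours."""
    if not (intentional and in_us):
        return False
    # Offenders and unborn children are excluded from the victim count.
    times = sorted(d.time for d in deaths if not d.is_offender and not d.is_unborn)
    window = timedelta(hours=24)
    # Slide over sorted times: any run of min_victims deaths spanning <= 24 h qualifies.
    for i in range(len(times) - min_victims + 1):
        if times[i + min_victims - 1] - times[i] <= window:
            return True
    return False
```

For example, three victims plus an offender who died at the scene would not qualify, because the offender is excluded from the count.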

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  3. US Mass Shootings

    • kaggle.com
    zip
    Updated May 25, 2022
    + more versions
    Cite
    Zeeshan-ul-hassan Usmani (2022). US Mass Shootings [Dataset]. https://www.kaggle.com/zusmani/us-mass-shootings-last-50-years
    Available download formats: zip (317763 bytes)
    Dataset updated
    May 25, 2022
    Authors
    Zeeshan-ul-hassan Usmani
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    United States
    Description

    Context

    Mass Shootings in the United States of America (1966-2017). The US has witnessed 398 mass shootings in the last 50 years, resulting in 1,996 deaths and 2,488 injuries. The latest and worst mass shooting, on October 2, 2017, has killed 58 and injured 515 so far. The number of people injured in this attack is greater than the number injured in all mass shootings of 2015 and 2016 combined. Over the last 50 years, there have been on average 7 mass shootings per year, claiming 39 lives and injuring 48 people per year.

    Content

    Geography: United States of America

    Time period: 1966-2017

    Unit of analysis: Mass Shooting Attack

    Dataset: The dataset contains detailed information of 398 mass shootings in the United States of America that killed 1996 and injured 2488 people.

    Variables: The dataset contains Serial No, Title, Location, Date, Summary, Fatalities, Injured, Total Victims, Mental Health Issue, Race, Gender, and Lat-Long information.

    Acknowledgements

    I’ve consulted several public datasets and web pages to compile this data. Some of the major data sources include Wikipedia, Mother Jones, Stanford, USA Today and other web sources.

    Inspiration

    With a broken heart, I would like to ask my fellow Kagglers to use Machine Learning and Data Science to help me explore these ideas:

    • How many people got killed and injured per year?

    • Visualize mass shootings on the U.S map

    • Is there any correlation between shooter and his/her race, gender

    • Any correlation with calendar dates? Do we have more deadly days, weeks or months on average

    • What cities and states are more prone to such attacks

    • Can you find and combine any other external datasets to enrich the analysis, for example, gun ownership by state

    • Any other pattern you see that can help in prediction, crowd safety or in-depth analysis of the event

    • How many shooters have some kind of mental health problem? Can we compare those shooters with the general population with the same condition?

    Mass Shootings Dataset Ver 3

    This is the new Version of Mass Shootings Dataset. I've added eight new variables:

    1. Incident Area (where the incident took place),
    2. Open/Close Location (Inside a building or open space)
    3. Target (possible target audience or company),
    4. Cause (Terrorism, Hate Crime, Fun (for no obvious reason), etc.)
    5. Policeman Killed (how many on duty officers got killed)
    6. Age (age of the shooter)
    7. Employed (Y/N)
    8. Employed at (Employer Name)

    Age, Employed and Employed at (3 variables) contain shooter details

    Mass Shootings Dataset Ver 4

    Quite a few missing values have been filled in

    Mass Shootings Dataset Ver 5

    Three more recent mass shootings have been added including the Texas Church shooting of November 5, 2017

    I hope it will help create more visualization and extract patterns.

    Keep Coding!

  4. Travel Danger

    • data.world
    csv, zip
    Updated Apr 19, 2025
    Cite
    State Department Travel Warnings (2025). Travel Danger [Dataset]. https://data.world/travelwarnings/travel-danger
    Available download formats: zip, csv
    Dataset updated
    Apr 19, 2025
    Authors
    State Department Travel Warnings
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    2008 - 2016
    Description

    This dataset contains data and analysis from the article Do State Department Travel Warnings Reflect Real Danger?

    Key findings

    • On the whole, there is a significant relationship between the number of American deaths abroad per capita and the number of travel warnings a country receives
    • Mexico, Mali, and Israel have been targeted by the most travel warnings in recent years, but Americans are more likely to be killed in Thailand, Pakistan, and the Philippines
    • Several countries with relatively high rates of American death have not been issued a single travel warning in ~7 years, including Belize, Guyana, and Guatemala
    • Several countries with relatively low rates of American death have been issued a relatively high number of travel warnings in ~7 years, including Israel, Turkey, and Saudi Arabia
    • Overall, countries subject to travel warnings do not see notable declines in American visitors in the 6 months after a warning is issued

    Data sources

    Charts / data visualizations

    https://cdn-images-1.medium.com/max/800/1*moPQYbzXW0Jx6AFhY8VKWQ.png

    https://cdn-images-1.medium.com/max/800/1*s1OX6ke8wlHhK4VubpVWcg.png

    https://cdn-images-1.medium.com/max/800/1*JwvpqE4YIuYfx2UEqCp9nA.png

    https://cdn-images-1.medium.com/max/800/1*LHLsJ0IzLsSlNl0UN8XrAw.png

    https://cdn-images-1.medium.com/max/800/1*l0sqn7voWyMCbwoQ2OKGfg.png

  5. Mental Illness Prevalence Across the US

    • kaggle.com
    zip
    Updated Dec 14, 2022
    Cite
    The Devastator (2022). Mental Illness Prevalence Across the US [Dataset]. https://www.kaggle.com/datasets/thedevastator/investigating-serious-mental-illness-prevalence
    Available download formats: zip (13919 bytes)
    Dataset updated
    Dec 14, 2022
    Authors
    The Devastator
    Area covered
    United States
    Description

    Mental Illness Prevalence Across the US

    Substate Level Estimates

    By Substance Abuse and Mental Health Services Administration [source]

    About this dataset

    This dataset contains estimates of serious mental illness in the US by state and substate region from 2012-2014. This data helps to understand better the mental health disparities that exist between states and different regions within states. By looking at this data, researchers can identify the parts of the country with particularly high or low rates of serious mental illness, which can help prioritize resources for affected areas.

    The dataset includes estimates along with 95% confidence intervals. These are based on a survey-weighted hierarchical Bayes estimation approach and are generated by Markov Chain Monte Carlo techniques. The Map Group column can be used to distinguish the substate regions included in the corresponding maps, and the Order column gives a numerical order for sorting the data back to its original order. For definitions in Substate Region, refer to the National Survey on Drug Use and Health's Substate Region Definitions found here: https://www.samhsa.gov/data/sites/default/files/NSDUHsubstateRegionDefs2014/NSDUHsubstateRegionDefs2014.pdf

    This reliable information is provided by SAMHSA, Center for Behavioral Health Statistics and Quality through their National Survey on Drug Use and Health from 2012-2014; helping us gain insights into America’s overall mental health picture – revealing more about where help is needed most urgently so that we can take steps towards a healthier future for all Americans!

    How to use the dataset

    Welcome to this dataset! This dataset contains estimates of Serious Mental Illnesses in the United States by state and substate region from 2012 to 2014. It is designed for researchers, analysts, and data scientists looking for information about the prevalence of Serious Mental Illnesses across the US.

    Research Ideas

    • Performing a trend analysis to identify changes in the estimates of serious mental illnesses over time and across different geographic regions.
    • Exploring disparities in serious mental illnesses among certain minority groups or deprived socio-economic subgroups by comparing estimates at the substate level.
    • Developing targeted public health strategies and interventions for states with higher than average rates of serious mental illness prevalence

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: 2012-2014_Substate_SAE_Table_24.csv

    | Column name     | Description |
    |:----------------|:------------|
    | Order           | A numerical order that can be used to sort the data back to its original order. (Numeric) |
    | State           | The US state associated with the data. (String) |
    | Substate Region | The substate region associated with the data. (String) |
    | 95% CI (Lower)  | The lower bound of the 95 percent confidence interval for the estimated number of people with serious mental illness in the region. (Numeric) |
    | 95% CI (Upper)  | The upper bound of the 95 percent confidence interval for the estimated number of people with serious mental illness in the region. (Numeric) |
    | Map Group       | A numerical value which can distinguish between different substate regions included in the maps. (Numeric) |
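
A minimal sketch of reading a file with the columns listed above, restoring the original order, and computing the width of each confidence interval. The column names come from the column list; the two sample rows are invented for illustration and are not SAMHSA estimates, and the real file may contain additional columns.

```python
import csv
import io

# Invented sample rows; only the header names are taken from the column list.
sample = io.StringIO(
    "Order,State,Substate Region,95% CI (Lower),95% CI (Upper),Map Group\n"
    "2,Example State,Region B,3.1,5.2,1\n"
    "1,Example State,Region A,3.8,6.0,2\n"
)


def ci_width(row):
    """Width of the 95% interval: upper bound minus lower bound."""
    return float(row["95% CI (Upper)"]) - float(row["95% CI (Lower)"])


rows = list(csv.DictReader(sample))
rows.sort(key=lambda r: int(r["Order"]))  # back to the original sort order

for r in rows:
    print(r["Substate Region"], round(ci_width(r), 2))
```

A wider interval indicates a less precise estimate for that region, which is worth checking before ranking regions against each other.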

    ...

  6. Early Indicators of Later Work Levels Disease and Death (EI) - Union Army...

    • rrid.site
    • scicrunch.org
    • +2 more
    Updated Oct 16, 2025
    Cite
    (2025). Early Indicators of Later Work Levels Disease and Death (EI) - Union Army Samples Public Health and Ecological Datasets [Dataset]. http://identifiers.org/RRID:SCR_008921
    Dataset updated
    Oct 16, 2025
    Description

    A dataset to advance the study of life-cycle interactions of biomedical and socioeconomic factors in the aging process. The EI project has assembled a variety of large datasets covering the life histories of approximately 39,616 white male volunteers (drawn from a random sample of 331 companies) who served in the Union Army (UA), and of about 6,000 African-American veterans from 51 randomly selected United States Colored Troops companies (USCT). Their military records were linked to pension and medical records that detailed the soldiers' health status and socioeconomic and family characteristics. Each soldier was searched for in the US decennial census for the years in which they were most likely to be found alive (1850, 1860, 1880, 1900, 1910). In addition, a sample consisting of 70,000 men examined for service in the Union Army between September 1864 and April 1865 has been assembled and linked only to census records. These records will be useful for life-cycle comparisons of those accepted and rejected for service.

    Military Data: The military service and wartime medical histories of the UA and USCT men were collected from the Union Army and United States Colored Troops military service records, carded medical records, and other wartime documents.

    Pension Data: Wherever possible, the UA and USCT samples have been linked to pension records, including surgeon's certificates. About 70% of men in the Union Army sample have a pension. These records provide the bulk of the socioeconomic and demographic information on these men from the late 1800s through the early 1900s, including family structure and employment information. In addition, the surgeon's certificates provide rich medical histories, with an average of 5 examinations per linked recruit for the UA, and about 2.5 exams per USCT recruit.

    Census Data: Both early and late-age familial and socioeconomic information is collected from the manuscript schedules of the federal censuses of 1850, 1860, 1870 (incomplete), 1880, 1900, and 1910.

    Data Availability: All of the datasets (Military Union Army; linked Census; Surgeon's Certificates; Examination Records, and supporting ecological and environmental variables) are publicly available from ICPSR. In addition, copies on CD-ROM may be obtained from the CPE, which also maintains an interactive Internet Data Archive and Documentation Library, which can be accessed on the Project Website.

    • Dates of Study: 1850-1910
    • Study Features: Longitudinal, Minority Oversamples
    • Sample Size:
      • Union Army: 35,747
      • Colored Troops: 6,187
      • Examination Sample: 70,800

    ICPSR Link: http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06836

  7. FiveThirtyEight Police Killings Dataset

    • kaggle.com
    zip
    Updated Apr 26, 2019
    Cite
    FiveThirtyEight (2019). FiveThirtyEight Police Killings Dataset [Dataset]. https://www.kaggle.com/fivethirtyeight/fivethirtyeight-police-killings-dataset
    Available download formats: zip (53916 bytes)
    Dataset updated
    Apr 26, 2019
    Dataset authored and provided by
    FiveThirtyEight (https://abcnews.go.com/538)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    Police Killings

    This directory contains the data behind the story Where Police Have Killed Americans In 2015.

    We linked entries from the Guardian's database on police killings to census data from the American Community Survey. The Guardian data was downloaded on June 2, 2015. More information about its database is available here.

    Census data was calculated at the tract level from the 2015 5-year American Community Survey using the tables S0601 (demographics), S1901 (tract-level income and poverty), S1701 (employment and education) and DP03 (county-level income). Census tracts were determined by geocoding addresses to latitude/longitude using the Bing Maps and Google Maps APIs and then overlaying points onto 2014 census tracts. GEOIDs are census-standard and should be easily joinable to other ACS tables -- let us know if you find anything interesting.
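
To illustrate the join mentioned above: a census tract GEOID is conventionally the 2-digit state FIPS code, the 3-digit county FIPS code, and the 6-digit tract code concatenated with leading zeros. The sketch below builds such keys from separate components; the function names are hypothetical, the example codes are arbitrary, and it is an assumption that this dataset's `geo_id` and `county_id` fields follow the same layout.

```python
def tract_geoid(state_fp, county_fp, tract_ce) -> str:
    """11-digit tract GEOID: state (2) + county (3) + tract (6), zero-padded."""
    return f"{int(state_fp):02d}{int(county_fp):03d}{int(tract_ce):06d}"


def county_geoid(state_fp, county_fp) -> str:
    """5-digit county GEOID: state (2) + county (3), zero-padded."""
    return f"{int(state_fp):02d}{int(county_fp):03d}"


# Arbitrary example codes. Leading zeros are lost when a CSV is read as
# integers, so rebuilding the padded string key is often needed before a join.
print(tract_geoid(1, 51, 30902))
print(county_geoid(1, 51))
```

Joining to another ACS table then reduces to matching on this string key rather than on the numeric components.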

    Field descriptions:

    | Header | Description | Source |
    |:-------|:------------|:-------|
    | name | Name of deceased | Guardian |
    | age | Age of deceased | Guardian |
    | gender | Gender of deceased | Guardian |
    | raceethnicity | Race/ethnicity of deceased | Guardian |
    | month | Month of killing | Guardian |
    | day | Day of incident | Guardian |
    | year | Year of incident | Guardian |
    | streetaddress | Address/intersection where incident occurred | Guardian |
    | city | City where incident occurred | Guardian |
    | state | State where incident occurred | Guardian |
    | latitude | Latitude, geocoded from address | |
    | longitude | Longitude, geocoded from address | |
    | state_fp | State FIPS code | Census |
    | county_fp | County FIPS code | Census |
    | tract_ce | Tract ID code | Census |
    | geo_id | Combined tract ID code | |
    | county_id | Combined county ID code | |
    | namelsad | Tract description | Census |
    | lawenforcementagency | Agency involved in incident | Guardian |
    | cause | Cause of death | Guardian |
    | armed | How/whether deceased was armed | Guardian |
    | pop | Tract population | Census |
    | share_white | Share of pop that is non-Hispanic white | Census |
    | share_bloack | Share of pop that is black (alone, not in combination) | Census |
    | share_hispanic | Share of pop that is Hispanic/Latino (any race) | Census |
    | p_income | Tract-level median personal income | Census |
    | h_income | Tract-level median household income | Census |
    | county_income | County-level median household income | Census |
    | comp_income | h_income / county_income | Calculated from Census |
    | county_bucket | Household income, quintile within county | Calculated from Census |
    | nat_bucket | Household income, quintile nationally | Calculated from Census |
    | pov | Tract-level poverty rate (official) | Census |
    | urate | Tract-level unemployment rate | Calculated from Census |
    | college | Share of 25+ pop with BA or higher | Calculated from Census |

    Note regarding income calculations:

    All income fields are in inflation-adjusted 2013 dollars.

    comp_income is simply tract-level median household income as a share of county-level median household income.

    county_bucket provides where the tract's median household income falls in the distribution (by quintile) of all tracts in the county. (1 indicates a tract falls in the poorest 20% of tracts within the county.) Distribution is not weighted by population.

    nat_bucket is the same but for all U.S. counties.
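
The income notes above can be sketched as follows: `comp_income` is a simple ratio, and a quintile bucket is a rank-based bin from 1 (poorest 20%) to 5, with no population weighting. The bucketing rule here is one common convention chosen for illustration; the exact method FiveThirtyEight used (tie handling, boundaries) is not stated in the source, and the income values are made up.

```python
def comp_income(h_income: float, county_income: float) -> float:
    """Tract median household income as a share of the county median."""
    return h_income / county_income


def quintile_buckets(values):
    """Assign each value a bucket 1..5 by unweighted rank (1 = lowest 20%)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    buckets = [0] * n
    for rank, i in enumerate(order):
        buckets[i] = rank * 5 // n + 1
    return buckets


# Made-up tract incomes within one hypothetical county.
tract_incomes = [52000, 31000, 78000, 45000, 96000, 27000, 61000, 39000, 88000, 70000]
print(quintile_buckets(tract_incomes))
print(round(comp_income(31000, 62000), 2))
```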

    Context

    This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using GitHub's API and Kaggle's API.

    This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.

  8. American Community Survey: 1-Year Estimates: Detailed Tables 1-Year

    • catalog.data.gov
    • datasets.ai
    Updated Jul 19, 2023
    + more versions
    Cite
    U.S. Census Bureau (2023). American Community Survey: 1-Year Estimates: Detailed Tables 1-Year [Dataset]. https://catalog.data.gov/dataset/american-community-survey-1-year-estimates-detailed-tables-1-year-3092c
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Description

    The American Community Survey (ACS) is an ongoing survey that provides data every year -- giving communities the current information they need to plan investments and services. The ACS covers a broad range of topics about social, economic, demographic, and housing characteristics of the U.S. population. Much of the ACS data provided on the Census Bureau's Web site are available separately by age group, race, Hispanic origin, and sex. Summary files, Subject tables, Data profiles, and Comparison profiles are available for the nation, all 50 states, the District of Columbia, Puerto Rico, every congressional district, every metropolitan area, and all counties and places with populations of 65,000 or more. Detailed Tables contain the most detailed cross-tabulations published for areas 65k and more. The data are population counts. There are over 31,000 variables in this dataset.

  9. Dead Lake Township, Minnesota Age Group Population Dataset: A complete...

    • neilsberg.com
    csv, json
    Updated Sep 16, 2023
    + more versions
    Cite
    Neilsberg Research (2023). Dead Lake Township, Minnesota Age Group Population Dataset: A complete breakdown of Dead Lake township age demographics from 0 to 85 years, distributed across 18 age groups [Dataset]. https://www.neilsberg.com/research/datasets/70219958-3d85-11ee-9abe-0aa64bf2eeb2/
    Available download formats: json, csv
    Dataset updated
    Sep 16, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Minnesota, Dead Lake Township
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each of the age groups. For ages between 0 and 85, we divided the data into buckets of roughly 5 years each. For ages over 85, we aggregated the data into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Dead Lake township population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Dead Lake township. The dataset can be used to understand the population distribution of Dead Lake township by age. For example, using this dataset, we can identify the largest age group in Dead Lake township.

    Key observations

    The largest age group in Dead Lake Township, Minnesota was 65-69 years, with a population of 96 (15.02%), according to the 2021 American Community Survey. The smallest age group was 25-29 years, with a population of 7 (1.10%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group under consideration.
    • Population: This column shows the population of the specific age group in Dead Lake township.
    • % of Total Population: This column displays the population of each age group as a percentage of the Dead Lake township total population. Please note that the percentages may not sum to exactly 100% due to rounding of values.
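    As a minimal sketch of how a "% of Total Population" column could be derived from the population counts, the Python below divides each group's count by the total and rounds to two decimals. The function name and the counts are hypothetical and are not taken from the dataset; the second example shows why rounded percentages need not sum to exactly 100.

```python
def percent_of_total(populations):
    """Return each group's share of the total population, in percent (2 decimals)."""
    total = sum(populations.values())
    if total == 0:
        raise ValueError("total population is zero")
    return {group: round(100 * count / total, 2) for group, count in populations.items()}


# Hypothetical counts, for illustration only.
shares = percent_of_total({"Under 5 years": 30, "5 to 9 years": 45, "10 to 14 years": 25})

# Three equal groups: each share rounds to 33.33, so the rounded shares sum to about 99.99.
thirds = percent_of_total({"a": 1, "b": 1, "c": 1})
rounded_sum = sum(thirds.values())
```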

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Dead Lake township Population by Age, which you can refer to for further detail.

  10. COVID-19 Daily Rolling Average Case, Death, and Hospitalization Rates -...

    • data.cityofchicago.org
    • healthdata.gov
    • +1more
    csv, xlsx, xml
    Updated May 22, 2024
    Cite
    City of Chicago (2024). COVID-19 Daily Rolling Average Case, Death, and Hospitalization Rates - Historical [Dataset]. https://data.cityofchicago.org/Health-Human-Services/COVID-19-Daily-Rolling-Average-Case-Death-and-Hosp/e68t-c7fv
    Explore at:
    Available download formats: xlsx, xml, csv
    Dataset updated
    May 22, 2024
    Dataset authored and provided by
    City of Chicago
    Description

    NOTE: This dataset has been retired and marked as historical-only.

    This dataset is a companion to the COVID-19 Daily Cases and Deaths dataset (https://data.cityofchicago.org/d/naz8-j4nc). The major difference is that the case, death, and hospitalization rates per 100,000 population in this dataset are not those for the single date indicated; they are rolling averages for the seven-day period ending on that date. The rolling average is used to account for fluctuations that may occur in the data, such as fewer cases being reported on weekends, and for small numbers. The intent is to give a more representative view of the ongoing COVID-19 experience, less affected by what is essentially noise in the data.

    All rates are per 100,000 population in the indicated group, or Chicago, as a whole, for “Total” columns.
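    The seven-day rolling rate described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the daily counts, and the population figure are hypothetical, and the source does not say whether the average is taken over counts before converting to a rate, which is what this sketch assumes.

```python
def rolling_rate_per_100k(daily_counts, population, window=7):
    """Average the counts over the `window` days ending on each date,
    then express that average per 100,000 population."""
    rates = []
    for i in range(len(daily_counts)):
        if i + 1 < window:
            rates.append(None)  # not enough history for a full window yet
            continue
        window_mean = sum(daily_counts[i + 1 - window : i + 1]) / window
        rates.append(window_mean / population * 100_000)
    return rates


# Hypothetical daily case counts and population, for illustration only.
rates = rolling_rate_per_100k([10, 12, 8, 3, 2, 11, 10, 17], population=1_000_000)
print([None if r is None else round(r, 2) for r in rates])
```

    The first six dates have no value because a full seven-day window is not yet available; how the published dataset treats those dates is not stated in the description.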

    Only Chicago residents are included based on the home address as provided by the medical provider.

    Cases with a positive molecular (PCR) or antigen test are included in this dataset. Cases are counted based on the date the test specimen was collected. Deaths among cases are aggregated by day of death. Hospitalizations are reported by date of first hospital admission. Demographic data are based on what is reported by medical providers or collected by CDPH during follow-up investigation.

    Denominators are from the U.S. Census Bureau American Community Survey 1-year estimate for 2018 and can be seen in the Citywide, 2018 row of the Chicago Population Counts dataset (https://data.cityofchicago.org/d/85cm-7uqa).

    All data are provisional and subject to change. Information is updated as additional details are received and it is, in fact, very common for recent dates to be incomplete and to be updated as time goes on. At any given time, this dataset reflects cases and deaths currently known to CDPH.

    Numbers in this dataset may differ from other public sources due to definitions of COVID-19-related cases and deaths, sources used, how cases and deaths are associated with a specific date, and similar factors.

    Data Source: Illinois National Electronic Disease Surveillance System, Cook County Medical Examiner’s Office, U.S. Census Bureau American Community Survey

  11. Gratis, OH Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Gratis, OH Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b235d8fd-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Gratis
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Gratis by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Gratis across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a slight female majority, with females making up 50.0% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the gender (Male / Female).
    • Population: This column shows the population of each gender in Gratis.
    • % of Total Population: This column displays the population of each gender as a percentage of the Gratis total population. Please note that the percentages may not sum to exactly 100% due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Gratis Population by Race & Ethnicity, which you can refer to for further detail.

  12. Counts of Measles reported in UNITED STATES OF AMERICA: 1888-2002

    • tycho.pitt.edu
    • data.niaid.nih.gov
    • +1more
    Updated Apr 1, 2018
    Cite
    Willem G Van Panhuis; Anne L Cross; Donald S Burke (2018). Counts of Measles reported in UNITED STATES OF AMERICA: 1888-2002 [Dataset]. https://www.tycho.pitt.edu/dataset/US.14189004
    Explore at:
    Dataset updated
    Apr 1, 2018
    Dataset provided by
    Project Tycho, University of Pittsburgh
    Authors
    Willem G Van Panhuis; Anne L Cross; Donald S Burke
    Time period covered
    1888 - 2002
    Area covered
    United States
    Description

    Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

    Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

    Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:

    • Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
    • Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
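    The two processing steps above could be sketched roughly as below. Only the attribute name "PartOfCumulativeCountSeries" comes from the description; the record layout (the "start", "end", and "count" keys), the function names, and the sample values are hypothetical, and the sketch assumes weekly fixed intervals with the attribute already parsed to a boolean or 0/1 integer.

```python
from datetime import date, timedelta


def split_series(records):
    """Separate cumulative from fixed-interval counts using the dataset's flag."""
    cumulative = [r for r in records if r["PartOfCumulativeCountSeries"]]
    fixed = [r for r in records if not r["PartOfCumulativeCountSeries"]]
    return cumulative, fixed


def fill_missing_weeks(fixed):
    """Add placeholders (count=None) for weekly intervals absent from the series.
    Reported zeros are kept as zeros; None marks 'no count reported'."""
    if not fixed:
        return []
    by_start = {r["start"]: r for r in fixed}
    current, last = min(by_start), max(by_start)
    filled = []
    while current <= last:
        placeholder = {"start": current, "end": current + timedelta(days=6),
                       "count": None, "PartOfCumulativeCountSeries": 0}
        filled.append(by_start.get(current, placeholder))
        current += timedelta(days=7)
    return filled


# Hypothetical records: two cumulative intervals and a weekly series with a gap.
records = [
    {"start": date(1950, 1, 1), "end": date(1950, 1, 7), "count": 4, "PartOfCumulativeCountSeries": 1},
    {"start": date(1950, 1, 1), "end": date(1950, 1, 14), "count": 9, "PartOfCumulativeCountSeries": 1},
    {"start": date(1950, 1, 1), "end": date(1950, 1, 7), "count": 0, "PartOfCumulativeCountSeries": 0},
    {"start": date(1950, 1, 15), "end": date(1950, 1, 21), "count": 6, "PartOfCumulativeCountSeries": 0},
]
cumulative, fixed = split_series(records)
filled = fill_missing_weeks(fixed)
```

    Keeping missing intervals as None rather than 0 preserves the distinction the description draws between "no count reported" and "a count of zero was reported".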

  13. ARCHIVED: COVID-19 Cases by Population Characteristics Over Time

    • data.sfgov.org
    • healthdata.gov
    • +1more
    csv, xlsx, xml
    Updated Sep 11, 2023
    + more versions
    Cite
    (2023). ARCHIVED: COVID-19 Cases by Population Characteristics Over Time [Dataset]. https://data.sfgov.org/Health-and-Social-Services/ARCHIVED-COVID-19-Cases-by-Population-Characterist/j7i3-u9ke
    Explore at:
    Available download formats: xlsx, xml, csv
    Dataset updated
    Sep 11, 2023
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    A. SUMMARY This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.

    B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases are from:

    • Case interviews
    • Laboratories
    • Medical providers

    These multiple streams of data are merged, deduplicated, and undergo data verification processes.

    Race/ethnicity

    • We include all race/ethnicity categories that are collected for COVID-19 cases.
    • The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

    Gender

    • The City collects information on gender identity using these guidelines.

    Skilled Nursing Facility (SNF) occupancy

    • A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives.
    • This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’.

    Sexual orientation

    • The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable.
    • The City doesn’t collect or report information about sexual orientation for persons under 12 years of age.
    • Case investigation interviews transitioned to the California Department of Public Health Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old. Learn more about our data collection guidelines pertaining to sexual orientation: https://www.sfdph.org/dph/files/PoliciesProcedures/COM9_SexualOrientationGuidelines.pdf

    Comorbidities

    • Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

    Homelessness

    Persons are identified as homeless based on several data sources:

    • self-reported living situation
    • the location at the time of testing
    • Department of Public Health homelessness and health databases

    Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

    Single Room Occupancy (SRO) tenancy

    • SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces.
    • The details of a person's living arrangements are verified during case interviews.

    Transmission Type

    • Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.

    C. UPDATE PROCESS This dataset has been archived and will no longer update as of 9/11/2023.

    D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

    This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cases on each date.

    New cases are the count of cases within that characteristic group where the positive tests were collected on that specific specimen collection date. Cumulative cases are the running total of all San Francisco cases in that characteristic group up to the specimen collection date listed.
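    The relationship between new and cumulative cases described above is a running total. A minimal sketch, using hypothetical daily counts for one characteristic group ordered by specimen collection date (the function name is also an assumption):

```python
from itertools import accumulate


def cumulative_cases(new_cases):
    """Running total of new cases, in specimen-collection-date order."""
    return list(accumulate(new_cases))


# Hypothetical new-case counts for four consecutive dates.
new_cases = [3, 0, 5, 2]
running_total = cumulative_cases(new_cases)  # [3, 3, 8, 10]
```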

    This data may not be immediately available for recently reported cases. Data updates as more information becomes available.

    To explore data on the total number of cases, use the ARCHIVED: COVID-19 Cases Over Time dataset.

    E. CHANGE LOG

    • 9/11/2023 - data on COVID-19 cases by population characteristics over time are no longer being updated. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.
    • 6/6/2023 - data on cases by transmission type have been removed. See section ARCHIVED DATA for more detail.
    • 5/16/2023 - data on cases by sexual orientation, comorbidities, homelessness, and single room occupancy have been removed. See section ARCHIVED DATA for more detail.
    • 4/6/2023 - the State implemented system updates to improve the integrity of historical data.
    • 2/21/2023 - system updates to improve reliability and accuracy of cases data were implemented.
    • 1/31/2023 - updated “population_estimate” column to reflect the 2020 Census Bureau American Community Survey (ACS) San Francisco Population estimates.
    • 1/5/2023 - data on SNF cases removed. See section ARCHIVED DATA for more detail.
    • 3/23/2022 - ‘Native American’ changed to ‘American Indian or Alaska Native’ to align with the census.
    • 1/22/2022 - system updates to improve timeliness and accuracy of cases and deaths data were implemented.
    • 7/15/2022 - reinfections added to cases dataset. See section SUMMARY for more information on how reinfections are identified.

  14. American Fork, UT Population Dataset: Yearly Figures, Population Change, and...

    • neilsberg.com
    csv, json
    Updated Sep 18, 2023
    + more versions
    Cite
    Neilsberg Research (2023). American Fork, UT Population Dataset: Yearly Figures, Population Change, and Percent Change Analysis [Dataset]. https://www.neilsberg.com/research/datasets/6c39fae9-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Sep 18, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    American Fork, Utah
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2022, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from U.S. Census Bureau Population Estimates Program (PEP) data for 2000 - 2022. To measure the variables, namely (a) population and (b) population change (absolute and as a percentage), we initially analyzed and tabulated the data for each of the years between 2000 and 2022. For further information regarding these estimates, please reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the American Fork population over the last 20-plus years. It lists the population for each year, along with the year-on-year change in population, both in absolute and in percentage terms. The dataset can be utilized to understand the population change of American Fork across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if there is a change, when the population peaked or whether it is still growing and has not reached its peak. We can also compare the trend with the overall trend of the United States population over the same period.

    Key observations

    In 2022, the population of American Fork was 37,268, an 8.25% year-over-year increase from 2021. Previously, in 2021, the American Fork population was 34,427, an increase of 2.63% compared to a population of 33,544 in 2020. Over the last 20-plus years, between 2000 and 2022, the population of American Fork increased by 14,726. In this period, the peak population was 37,268 in the year 2022. The numbers suggest that the population has not reached its peak yet and is showing a trend of further growth. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2022

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2022)
    • Population: This column shows the population of American Fork for the specific year.
    • Year on Year Change: This column displays the change in American Fork population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.
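    The "Year on Year Change" and "Change in Percent" columns can be reproduced from the population column alone. The sketch below uses the 2020 to 2022 figures quoted under Key observations; the function name is hypothetical, and it assumes the years supplied are consecutive.

```python
def year_on_year(population_by_year):
    """Return (year, absolute change, percent change) relative to the prior year."""
    years = sorted(population_by_year)
    rows = []
    for prev, curr in zip(years, years[1:]):
        change = population_by_year[curr] - population_by_year[prev]
        pct = round(100 * change / population_by_year[prev], 2)
        rows.append((curr, change, pct))
    return rows


# Figures quoted in the key observations for American Fork.
figures = {2020: 33_544, 2021: 34_427, 2022: 37_268}
rows = year_on_year(figures)  # [(2021, 883, 2.63), (2022, 2841, 8.25)]
```

    The results match the 2.63% and 8.25% increases stated in the key observations.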

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for American Fork Population by Year, which you can refer to for further detail.

  15. Traverse City, MI Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Traverse City, MI Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e2054913-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Traverse City, Michigan
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Traverse City by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Traverse City. The dataset can be utilized to understand the population distribution of Traverse City by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Traverse City. Additionally, it can be used to see how the gender ratio changes from the youngest to the most senior age group, and to compare the male-to-female ratio across age groups for Traverse City.

    Key observations

    Largest age group (population): Male # 30-34 years (757) | Female # 70-74 years (831). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Traverse City population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population of Traverse City.
    • Population (Female): This column shows the female population of Traverse City.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Traverse City for each age group.
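    The gender ratio defined above (males per 100 females) is a one-line calculation. The sketch below is illustrative only: the function name and the counts are hypothetical and not taken from the dataset, and returning None when there are no females in a group is an assumption about how to handle an undefined ratio.

```python
def gender_ratio(males, females):
    """Number of males per 100 females; None when the ratio is undefined."""
    if females == 0:
        return None
    return round(100 * males / females, 2)


# Hypothetical counts for a single age group.
ratio = gender_ratio(450, 500)  # 90.0
```

    A value below 100 indicates more females than males in that age group, and a value above 100 the reverse.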

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Traverse City Population by Gender, which you can refer to for further detail.

  16. Diabetes Health Indicators

    • kaggle.com
    zip
    Updated Mar 7, 2025
    Cite
    Siamak Tahmasbi (2025). Diabetes Health Indicators [Dataset]. https://www.kaggle.com/datasets/siamaktahmasbi/diabetes-health-indicators
    Explore at:
    Available download formats: zip (4413929 bytes)
    Dataset updated
    Mar 7, 2025
    Authors
    Siamak Tahmasbi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Diabetes is one of the most prevalent chronic diseases in the United States, affecting millions of Americans each year and placing a substantial financial burden on the economy. It is a serious chronic condition in which the body loses the ability to effectively regulate blood glucose levels, leading to a reduced quality of life and decreased life expectancy. During digestion, food is broken down into sugars, which enter the bloodstream. This triggers the pancreas to release insulin, a hormone that helps cells in the body use these sugars for energy. Diabetes is typically characterized by either insufficient insulin production or the body's inability to use insulin effectively.

    Chronic high blood sugar levels in individuals with diabetes can lead to severe complications, including heart disease, vision loss, kidney disease, and lower-limb amputation. Although there is no cure for diabetes, strategies such as maintaining a healthy weight, eating a balanced diet, staying physically active, and receiving medical treatments can help mitigate its effects. Early diagnosis is crucial, as it allows for lifestyle modifications and more effective treatment, making predictive models for assessing diabetes risk valuable tools for public health officials.

    The scale of the diabetes epidemic is significant. According to the Centers for Disease Control and Prevention (CDC), as of 2018, approximately 34.2 million Americans have diabetes, while 88 million have prediabetes. Alarmingly, the CDC estimates that 1 in 5 individuals with diabetes and about 8 in 10 individuals with prediabetes are unaware of their condition. Type II diabetes is the most common form, and its prevalence varies based on factors such as age, education, income, geographic location, race, and other social determinants of health. The burden of diabetes disproportionately affects those with lower socioeconomic status. The economic impact is also substantial, with the cost of diagnosed diabetes reaching approximately $327 billion annually, and total costs, including undiagnosed diabetes and prediabetes, nearing $400 billion each year.

    Content

    The Behavioral Risk Factor Surveillance System (BRFSS) is a health-related telephone survey that is collected annually by the CDC. Each year, the survey collects responses from over 400,000 Americans on health-related risk behaviors, chronic health conditions, and the use of preventative services. It has been conducted every year since 1984. For this project, an XPT file of the dataset available on the CDC website for the year 2023 was used. This original dataset contains responses from 433,323 individuals and has 345 features. These features are either questions directly asked of participants, or calculated variables based on individual participant responses.

    I have selected 20 features from this dataset that are suitable for working on the topic of diabetes, and I have saved them in a CSV file without making any changes to the data. The goal is to make the data easier to work with. For more information or to access updated data, you can refer to the CDC website. I initially examined the original dataset from the CDC and found no duplicate entries. That dataset contains 330 columns (features). The duplicate rows in this dataset are therefore not errors: once only 20 features are kept, different individuals with similar conditions can end up with identical rows. In my opinion, removing these entries would both introduce errors and reduce accuracy.
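
    The point about duplicates can be illustrated with a small sketch. The rows, column meanings, and helper names below are made up for illustration and are not taken from the BRFSS file; the sketch only shows how rows that are distinct in a wide table can become identical once some columns are dropped.

```python
from collections import Counter

# Hypothetical respondents: (age_group, sex, smoker, state). All values are invented.
full_rows = [
    ("45-54", "F", "no", "CA"),
    ("45-54", "F", "no", "TX"),
    ("65+", "M", "yes", "NY"),
]

def project(rows, keep):
    """Keep only the column positions listed in `keep`."""
    return [tuple(row[i] for i in keep) for row in rows]

def count_duplicates(rows):
    """Count rows that exactly repeat an earlier row."""
    counts = Counter(rows)
    return sum(n - 1 for n in counts.values())

# Dropping the state column makes the first two respondents indistinguishable.
subset_rows = project(full_rows, keep=[0, 1, 2])

full_dupes = count_duplicates(full_rows)
subset_dupes = count_duplicates(subset_rows)
print(full_dupes, subset_dupes)
```

    In this toy example the full table has no duplicate rows while the three-column subset has one, even though both describe the same three people.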

    Explore some of the following research questions:

    • Can survey questions from the BRFSS provide accurate predictions of whether an individual has diabetes?
    • What risk factors are most predictive of diabetes risk?
    • Can we use a subset of the risk factors to accurately predict whether an individual has diabetes?
    • Can we create a short form of questions from the BRFSS using feature selection to accurately predict if someone might have diabetes or is at high risk of diabetes?

    Acknowledgements

    It is important to reiterate that I did not create this dataset; it is simply a summarized and reformatted dataset derived from the BRFSS 2023 dataset available on the CDC website. It is also worth noting that none of the data in this dataset discloses individuals' identities.

    Inspiration

    Zidian Xie et al., for Building Risk Prediction Models for Type 2 Diabetes Using Machine Learning Techniques using the 2014 BRFSS, and Alex Teboul, for building the Diabetes Health Indicators dataset based on BRFSS 2015, were the inspiration for creating this dataset and exploring the BRFSS in general.

  17. Event-correlated Outage Dataset in America

    • data.openei.org
    • s.cnmilf.com
    • +1more
    archive +2
    Updated Oct 1, 2024
    + more versions
    Buxin She; Veronica Adetola; Ji Young Yun (2024). Event-correlated Outage Dataset in America [Dataset]. https://data.openei.org/submissions/6458
    Available download formats: archive, text_document, website
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    United States Department of Energy (http://energy.gov/)
    Pacific Northwest National Laboratory
    Open Energy Data Initiative (OEDI)
    Authors
    Buxin She; Veronica Adetola; Ji Young Yun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This dataset includes an aggregated and event-correlated analysis of power outages in the United States, synthesized by integrating three data sources: the Environment for the Analysis of Geo-Located Energy Information (EAGLE-I), the Electric Emergency Incident Disturbance Report (DOE-417), and Annual Estimates of the Resident Population for Counties 2024 (CO-EST2024-POP). The EAGLE-I dataset, spanning from 2014 to 2023, encompasses over 146 million customers and offers county-level outage information at 15-minute intervals. The data has been processed, filtered, and aggregated to deliver an enhanced perspective on power outages, which are then correlated with DOE-417 data based on geographic location as well as the start and end times of events. For each major disturbance documented in DOE-417, essential metrics are defined to quantify the outages associated with the event. This dataset supports researchers in examining outages triggered by major disturbances like extreme weather and physical disruptions, thereby aiding studies on power system resilience.
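
    The correlation step described above (matching outage records to a disturbance by location and by start and end time) can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the FIPS codes, the timestamps, and the three metrics are hypothetical and are not the dataset's actual schema or metric definitions.

```python
from datetime import datetime

INTERVAL_HOURS = 0.25  # assumed 15-minute spacing between outage snapshots

# Hypothetical outage snapshots: (county FIPS, timestamp, customers without power)
outages = [
    ("53033", datetime(2021, 1, 13, 0, 0), 1000),
    ("53033", datetime(2021, 1, 13, 0, 15), 3000),
    ("53033", datetime(2021, 1, 14, 0, 0), 50),    # outside the event window
    ("41051", datetime(2021, 1, 13, 0, 15), 700),  # county not part of the event
]

# Hypothetical disturbance record with affected counties and a time window
event = {
    "counties": {"53033"},
    "start": datetime(2021, 1, 13, 0, 0),
    "end": datetime(2021, 1, 13, 6, 0),
}

def correlate(event, outages):
    """Select snapshots in the event's counties and time window, then summarize them."""
    matched = [
        customers
        for fips, ts, customers in outages
        if fips in event["counties"] and event["start"] <= ts <= event["end"]
    ]
    return {
        "snapshots": len(matched),
        "peak_customers_out": max(matched, default=0),
        "customer_hours": sum(matched) * INTERVAL_HOURS,
    }

metrics = correlate(event, outages)
print(metrics)
```

    Only the two snapshots that fall inside both the county set and the time window contribute to the summary; a real pipeline would likely need to handle time zones, overlapping events, and missing intervals.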

    Links to the raw data for generating the correlated dataset are included below as "DOE-417", "EAGLE-I", and "CO-EST2024-POP" resources.

    Acknowledgement: This work is funded by the Laboratory Directed Research and Development (LDRD) at the Pacific Northwest National Laboratory (PNNL) as part of the Resilience Through Data-Driven, Intelligently Designed Control (RD2C) Initiative.

  18. Dead Lake Township, Minnesota Median Income by Age Groups Dataset: A...

    • neilsberg.com
    csv, json
    Updated Feb 25, 2025
    + more versions
    Neilsberg Research (2025). Dead Lake Township, Minnesota Median Income by Age Groups Dataset: A Comprehensive Breakdown of Dead Lake township Annual Median Income Across 4 Key Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e92d2f31-f353-11ef-8577-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Minnesota, Dead Lake Township
    Variables measured
    Income for householder under 25 years, Income for householder 65 years and over, Income for householder between 25 and 44 years, Income for householder between 45 and 64 years
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across four age groups (Under 25 years, 25 to 44 years, 45 to 64 years, and 65 years and over) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the distribution of median household income among distinct age brackets of householders in Dead Lake township. Based on the latest 2019-2023 5-Year Estimates from the American Community Survey, it displays how income varies among householders of different ages in Dead Lake township. It showcases how household incomes typically rise as the head of the household gets older. The dataset can be utilized to gain insights into age-based household income trends and explore the variations in incomes across households.

    Key observations: Insights from 2023

    In terms of income distribution across age cohorts, in Dead Lake township, the median household income stands at $149,375 for householders within the 25 to 44 years age group, followed by $86,389 for the 65 years and over age group. Notably, householders within the 45 to 64 years age group had the lowest median household income at $72,250.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. All incomes have been adjusted for inflation and are presented in 2023 inflation-adjusted dollars.

    Age groups classifications include:

    • Under 25 years
    • 25 to 44 years
    • 45 to 64 years
    • 65 years and over

    Variables / Data Columns

    • Age Of The Head Of Household: This column presents the age of the head of household
    • Median Household Income: Median household income, in 2023 inflation-adjusted dollars for the specific age group

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Dead Lake township median household income by age. You can refer to the same here

  19. Hollywood Movies Domestic Lifetime Gross

    • kaggle.com
    zip
    Updated Jan 17, 2023
    The Devastator (2023). Hollywood Movies Domestic Lifetime Gross [Dataset]. https://www.kaggle.com/datasets/thedevastator/hollywood-movies-domestic-lifetime-gross-and-ran
    Available download formats: zip (637626 bytes)
    Dataset updated
    Jan 17, 2023
    Authors
    The Devastator
    Area covered
    Hollywood
    Description

    Hollywood Movies Domestic Lifetime Gross and Ranking

    An Opportunity for Investigating Box Office Performance

    By Elias Dabbas [source]

    About this dataset

    This dataset contains details about Hollywood's all-time domestic box office records. It includes data scraped from Box Office Mojo, which breaks down every movie's lifetime gross, ranking and production year. Domestic gross (adjusted for inflation) has been used as the benchmark to determine which movies were the most successful at the box office in America. This dataset allows you to explore an extensive list of Hollywood's all-time biggest hits. Analyze examples of previously unprecedented blockbusters and observe current market trends with this overview of domestic box office history.

    How to use the dataset

    This dataset contains comprehensive information about Hollywood movies and their domestic performance at the box office. It includes data on films' production year, lifetime gross, ranking and the studio that produced them. By using this dataset, you can analyze the financial successes and failures of films produced by different studios to gain insights into the Hollywood movie market over time.

    The 'rank' column shows each film's ranking compared to other Hollywood movies released in its year of release based on its box office revenue from theaters (not including other sources such as DVD sales or streaming services). The higher the number for a film’s rank means it was more successful financially than other films released in its date window when ticket prices were taken into account; lower numbers equate to less success at that time frame's box office.

    The ‘title’ column lists every movie analyzed here, with links to articles giving background information about each project (directorial credentials or management history) as well as full reviews with ratings given by critics while the films were screened theatrically across North America (U.S., Canada).

    The ‘studio’ outlines which media conglomerate is credited with distribution/marketing rights for each featured motion picture during their original domestic theatrical runs; these name-brands represent umbrella-corporations comprising multiple divisions specializing in creative development/financing of cinematic works along with doorways engineered around technical know-how -- ie: visual effects shops used by filmmakers during post-production responsibilities their respective productions entailed) -- maintained throughout various industrial regions across entertainment media outlets extending well beyond motion pictures proper... including music/television sector domains defined under respective company flags like Warner Bros., Disney(ABC), NBCUniversal(Comcast) ++ et al mirroring segmentations off any parent brand cited within this database under said label; pertaining solely toward big screen celluloid matters examined herein because charter established assumptions indicate only valid commercially viable feature length fare delivering both titles & collections contained below adheres relevant criterion set forth specifications that warrant inclusion alongside applicable vertical peers made front % center terms established formulating current entries visible within page iteration whilst conforming platform protocols designed enable public

    Research Ideas

    • Creating a recommendation engine to suggest similar movies based on lifetime gross and year of release.
    • Data analysis and visualization of box office trends over time for major Hollywood studios.
    • Utilizing the data to recommend alternative ways for movie marketers to invest their advertising budgets in order to maximize their return on investment

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit - Provide a link to the license, and indicate if changes were made. - ShareAlike - You must distribute your contributions under the same license as the original. - **Keep i...

  20. Greenville, NC Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Neilsberg Research (2025). Greenville, NC Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b2363f6c-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Greenville, North Carolina
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Greenville by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Greenville across both sexes and to determine which sex constitutes the majority.

    Key observations

    There is a majority of female population, with 54.49% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported from the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: The population of the gender in Greenville is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Greenville's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Greenville Population by Race & Ethnicity. You can refer to the same here

California Department of Public Health (2025). Infant Mortality, Deaths Per 1,000 Live Births (LGHC Indicator) [Dataset]. https://data.ca.gov/dataset/infant-mortality-deaths-per-1000-live-births-lghc-indicator

Infant Mortality, Deaths Per 1,000 Live Births (LGHC Indicator)

7 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip, csv, chart
Dataset updated
Nov 7, 2025
Dataset authored and provided by
California Department of Public Health (https://www.cdph.ca.gov/)
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This is a source dataset for a Let's Get Healthy California indicator at https://letsgethealthy.ca.gov/. Infant Mortality is defined as the number of deaths in infants under one year of age per 1,000 live births. Infant mortality is often used as an indicator to measure the health and well-being of a community, because factors affecting the health of entire populations can also impact the mortality rate of infants. Although California’s infant mortality rate is better than the national average, there are significant disparities, with African American babies dying at more than twice the rate of other groups. Data are from the Birth Cohort Files. The infant mortality indicator computed from the birth cohort file comprises birth certificate information on all births that occur in a calendar year (denominator) plus death certificate information linked to the birth certificate for those infants who were born in that year but subsequently died within 12 months of birth (numerator). Studies of infant mortality that are based on information from death certificates alone have been found to underestimate infant death rates for infants of all race/ethnic groups and especially for certain race/ethnic groups, due to problems such as confusion about event registration requirements, incomplete data, and transfers of newborns from one facility to another for medical care. Note there is a separate data table "Infant Mortality by Race/Ethnicity" which is based on death records only, which is more timely but less accurate than the Birth Cohort File. Single year shown to provide state-level data and county totals for the most recent year. Numerator: Infants deaths (under age 1 year). Denominator: Live births occurring to California state residents. Multiple years aggregated to allow for stratification at the county level. For this indicator, race/ethnicity is based on the birth certificate information, which records the race/ethnicity of the mother. 
The mother can “decline to state”; this is considered to be a valid response. These responses are not displayed on the indicator visualization.
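
The rate defined above (infant deaths per 1,000 live births) is simple arithmetic, sketched here for concreteness. The function name and the counts are hypothetical and are not values from the dataset.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Deaths of infants under one year of age per 1,000 live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000 * infant_deaths / live_births

# Hypothetical cohort: 42 linked infant deaths among 10,000 live births
rate = infant_mortality_rate(42, 10_000)
print(round(rate, 1))
```

With the birth cohort approach described above, the numerator would count only deaths linked to births in the same cohort year, so both arguments refer to the same set of births.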
