100+ datasets found
  1. Z

    Effect of suicide rates on life expectancy dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 16, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4694269
    Explore at:
    Dataset updated
    Apr 16, 2021
    Dataset authored and provided by
    Filip Zoubek
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Effect of suicide rates on life expectancy dataset

    Abstract In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, one of the main causes of death is the aforementioned suicide, therefore, this experiment is dealing with the question of how much suicide rates affects the statistics of average life expectancy. The experiment takes two datasets, one with the number of suicides and life expectancy in the second one and combine data into one dataset. Subsequently, I try to find any patterns and correlations among the variables and perform statistical test using simple regression to confirm my assumptions.

    Data

    The experiment uses two datasets - WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were firstly appropriately preprocessed. The final merged dataset to the experiment has 13 variables, where country and year are used as index: Country, Year, Suicides number, Life expectancy, Adult Mortality, which is probability of dying between 15 and 60 years per 1000 population, Infant deaths, which is number of Infant Deaths per 1000 population, Alcohol, which is alcohol, recorded per capita (15+) consumption, Under-five deaths, which is number of under-five deaths per 1000 population, HIV/AIDS, which is deaths per 1 000 live births HIV/AIDS, GDP, which is Gross Domestic Product per capita, Population, Income composition of resources, which is Human Development Index in terms of income composition of resources, and Schooling, which is number of years of schooling.

    LICENSE

    THE EXPERIMENT USES TWO DATASET - WHO SUICIDE STATISTICS AND WHO LIFE EXPECTANCY, WHICH WERE COLLEECTED FROM WHO AND UNITED NATIONS WEBSITE. THEREFORE, ALL DATASETS ARE UNDER THE LICENSE ATTRIBUTION-NONCOMMERCIAL-SHAREALIKE 3.0 IGO (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

    [1] https://www.kaggle.com/szamil/who-suicide-statistics

    [2] https://www.kaggle.com/kumarajarshi/life-expectancy-who

  2. Statewide Death Profiles

    • data.chhs.ca.gov
    • data.ca.gov
    • +3more
    csv, zip
    Updated Aug 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
    Explore at:
    csv(4689434), csv(16301), csv(5034), csv(463460), csv(2026589), csv(5401561), csv(164006), csv(200270), csv(419332), csv(406971), zipAvailable download formats
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    California Department of Public Healthhttps://www.cdph.ca.gov/
    Description

    This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  3. d

    Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Aug 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Explore at:
    zip, csvAvailable download formats
    Dataset updated
    Aug 31, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Aug 1, 2025
    Area covered
    Description

    THIS DATASET WAS LAST UPDATED AT 8:11 PM EASTERN ON AUG. 30

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings . This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  4. Suicides in England and Wales

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Aug 29, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics (2024). Suicides in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/suicidesintheunitedkingdomreferencetables
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Aug 29, 2024
    Dataset provided by
    Office for National Statisticshttp://www.ons.gov.uk/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Number of suicides and suicide rates, by sex and age, in England and Wales. Information on conclusion type is provided, along with the proportion of suicides by method and the median registration delay.

  5. g

    Coronavirus (Covid-19) Data in the United States

    • github.com
    • openicpsr.org
    • +4more
    csv
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://github.com/nytimes/covid-19-data
    Explore at:
    csvAvailable download formats
    Dataset provided by
    New York Times
    License

    https://github.com/nytimes/covid-19-data/blob/master/LICENSEhttps://github.com/nytimes/covid-19-data/blob/master/LICENSE

    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.

  6. Deaths; suicide (residents), various themes

    • cbs.nl
    • dexes.eu
    • +1more
    xml
    Updated Jan 23, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centraal Bureau voor de Statistiek (2025). Deaths; suicide (residents), various themes [Dataset]. https://www.cbs.nl/en-gb/figures/detail/7022eng
    Explore at:
    xmlAvailable download formats
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    Statistics Netherlands
    Authors
    Centraal Bureau voor de Statistiek
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1950 - 2023
    Area covered
    The Netherlands
    Description

    This table contains the number of victims of suicide arranged by marital status, method, motives, age and sex. They represent the number deaths by suicide in the resident population of the Netherlands.

    The figures in this table are equal to the suicide figures in the causes of death statistics, because they are based on the same files. The causes of death statistics do not contain information on the motive of suicide. For the years 1950-1995, this information is obtained from a historical data file on suicides. For the years 1996-now the motive is taken from the external causes of death (Niet-Natuurlijke dood) file. Before the 9th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD), i.e. for the years 1950-1978, it was not possible to code "jumping in front of train/metro". For these years 1950-1978 "jumping in front of train/metro" has been left empty, and it has been counted in the group "other method".

    Relative figures have been calculated per 100 000 of the corresponding population group. The figures are calculated based on the average population of the corresponding year.

    Data available from: 1950

    Status of the figures: The figures up to and including 2023 are final.

    Changes as of January 23rd 2025: The figures for 2023 are made final.

    When will new figures be published: In the third quarter of 2025 the provisional figures for 2024 will be published.

  7. d

    Johns Hopkins COVID-19 Case Tracker

    • data.world
    csv, zip
    Updated Aug 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    zip, csvAvailable download formats
    Dataset updated
    Aug 31, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      - Johns Hopkins has reconciled Ohio's historical deaths data with the state.

      Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    @(https://datawrapper.dwcdn.net/nRyaf/15/)

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county"
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Center for Disease Control and Preventions's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data - Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count. - Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here

    Attribution

    This data should be credited to Johns Hopkins University COVID-19 tracking project

  8. O

    Medical Examiner-Coroner, Suicide Deaths dataset

    • data.sccgov.org
    • splitgraph.com
    csv, xlsx, xml
    Updated Aug 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office of the Medical Examiner-Coroner's (2025). Medical Examiner-Coroner, Suicide Deaths dataset [Dataset]. https://data.sccgov.org/Health/Medical-Examiner-Coroner-Suicide-Deaths-dataset/aw9t-m5w4
    Explore at:
    csv, xml, xlsxAvailable download formats
    Dataset updated
    Aug 20, 2025
    Dataset authored and provided by
    Office of the Medical Examiner-Coroner's
    Description

    Note: This Dataset is updated nightly and contains all downloadable Medical Examiner-Coroner records, January 1, 2018 to current, related to deaths that occurred in the County of Santa Clara under the Medical Examiner-Coroner’s jurisdiction and those deaths reportable to the Medical Examiner-Coroner (non-jurisdictional cases/NJA) but in which the office did not assume jurisdiction.

    The Santa Clara County Medical Examiner- Coroner’s Office determines cause and manner of death for those deaths that fall under the jurisdiction of the Medical Examiner-Coroner, as defined by California Government code 27491.

    The Medical Examiner-Coroner will not be responsible for data verification, interpretation or misinformation once data has been downloaded and manipulated from the dashboard.

    Refer to the following document to know more of which deaths are reportable: https://medicalexaminer.sccgov.org/sites/g/files/exjcpb986/files/Reportable%20Death%20Chart%202018.pdf.

  9. Number of suicides India 1971-2022

    • statista.com
    Updated May 27, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Number of suicides India 1971-2022 [Dataset]. https://www.statista.com/statistics/665354/number-of-suicides-india/
    Explore at:
    Dataset updated
    May 27, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    India
    Description

    Over *** thousand deaths due to suicides were recorded in India in 2022. Furthermore, majority of suicides were reported in the state of Tamil Nadu, followed by Rajasthan. The number of suicides that year had increased from the previous year. Some of the causes for suicides in the country were due to professional problems, abuse, violence, family problems, financial loss, sense of isolation and mental disorders. Depressive disorders and suicide As of 2015, over ****** million people worldwide suffered from some kind of depressive disorder. Furthermore, over ** percent of the total population in India suffer from different forms of mental disorders as of 2017. There exists a positive correlation between the number of suicide mortality rates and people with select mental disorders as opposed to those without. Risk factors for mental disorders Every ******* person in India suffers from some form of mental disorder. Today, depressive disorders are regarded as the leading contributor not only to disease burden and morbidity worldwide, but even suicide if not addressed. In 2022, the leading cause for suicide deaths in India was due to family problems. The second leading cause was due to illness. Some of the risk factors, relative to developing mental disorders including depressive and anxiety disorders, include bullying victimization, poverty, unemployment, childhood sexual abuse and intimate partner violence.

  10. d

    COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    Updated Aug 12, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.ct.gov (2023). COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-by-race-ethnicity
    Explore at:
    Dataset updated
    Aug 12, 2023
    Dataset provided by
    data.ct.gov
    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve. The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj. The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 . The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 . The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed. COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to the either the Office of the Chief Medical Examiner (OCME) or Department of Public Health (DPH) are included in the COVID-19 update. The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age-adjustment is important in Connecticut as the median age of among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks, and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age which results in higher age-adjusted rates. The population data used to calculate rates is based on the CT DPH population statistics for 2019, which is available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used. Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age adjusted rates calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic. Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records. Cause of death was determined by a death certifier (e.g., physician, APRN, medical

  11. Number and percentage of homicide victims, by type of firearm used to commit...

    • www150.statcan.gc.ca
    • data.urbandatacentre.ca
    • +4more
    Updated Jul 22, 2019
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Government of Canada, Statistics Canada (2019). Number and percentage of homicide victims, by type of firearm used to commit the homicide, inactive [Dataset]. http://doi.org/10.25318/3510007201-eng
    Explore at:
    Dataset updated
    Jul 22, 2019
    Dataset provided by
    Statistics Canadahttps://statcan.gc.ca/en
    Area covered
    Canada
    Description

    Number and percentage of homicide victims, by type of firearm used to commit the homicide (total firearms; handgun; rifle or shotgun; fully automatic firearm; sawed-off rifle or shotgun; firearm-like weapons; other firearms, type unknown), Canada, 1974 to 2018.

  12. w

    Dataset of books called The bottoming book, or, how to get terrible things...

    • workwithdata.com
    Updated Apr 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2025). Dataset of books called The bottoming book, or, how to get terrible things done to you by wonderful people [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+bottoming+book%2C+or%2C+how+to+get+terrible+things+done+to+you+by+wonderful+people
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 1 row and is filtered where the book is The bottoming book, or, how to get terrible things done to you by wonderful people. It features 7 columns including author, publication date, language, and book publisher.

  13. e

    An Account on Janke Waali (NCAC_RDD_TAPE_0370A_SE1) - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Jul 19, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). An Account on Janke Waali (NCAC_RDD_TAPE_0370A_SE1) - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/e6c949bd-d224-5a23-93ac-ada3bef9ad18
    Explore at:
    Dataset updated
    Jul 19, 2025
    Description

    Time references: Janke Wally ascended the throne when Kabu kingdom was 606 years Additional information: Janke Wally and many of his people committed suicide to prevent them being captives forever. b) When Janke Wally realized that he was defeated, he ordered the burning of both gun and gunpowder stores. There was a huge explosion and the entire town was set on fire, then the Fula fighters retreated and fled. After they left, the survivors went in search of others survivors and Nyima Manjang was found alive. Nyima Manjang later got married to Sheriff Mulay Bakari. They had children and among them, were Yahya, Yankuba, and Sheriff Seedy. Timbo defeated Kabu because they possessed more fighters however, despite this, the people of Kabu never fled for their lives. Instead, they stood by their ruler to the very end. It is said that the Koring clan will never fight against the Nyancho clan. Sankulleh is the ancestor of the Koring clan at Kunung Mansa Sansang. The clan members are known to be great fighters. They are related to the Nyancho clan who are descendants of Tiramakang and the Koring clan are Sunjata Keita’s descendants. Tiramakang was one of Sunjata’s war generals. Ngansumana, a member of the Nyancho clan was betrayed and killed at Bambadinka when the town of Kansala was destroyed. His sister expressed that the death of Ngansumana was not the end of the Nyancho clan. When the survivors of Kabu Kansala fled, they resettled at Bere Kolong. While in Bere Kolong they were attacked again by Timbo. Foday Baraka Darame an Islamic scholar (Marabout) in Kabu, converted Mansa Bakari to Islam. Mansa Bakari was commonly referred to as Mansa Kolli. After converting to Islam, he relocated to Sonkunda. While in Sonkunda, he was attacked by assailants and he fled. His wife Jenaba was captured and taken to a ruler named Mansa Edrissa. She gave birth to a child called Kumba Njosanne who later married Alfa Yiramang in Labe. Together they had a child named Alfa Yaya, who later gained prominence and was highly respected.

  14. C

    Allegheny County COVID-19 Tests, Cases and Deaths (Archive)

    • data.wprdc.org
    csv, html
    Updated Jun 13, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Allegheny County (2024). Allegheny County COVID-19 Tests, Cases and Deaths (Archive) [Dataset]. https://data.wprdc.org/dataset/allegheny-county-covid-19-tests-cases-and-deaths
    Explore at:
    csv(16109), csv, html, csv(277234), csv(14904), csv(840), csv(34046863), csv(339166949)Available download formats
    Dataset updated
    Jun 13, 2024
    Dataset authored and provided by
    Allegheny County
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Allegheny County
    Description

    COVID-19 Cases information is reported through the Pennsylvania State Department’s National Electronic Disease Surveillance System (PA-NEDSS). As new cases are passed to the Allegheny County Health Department they are investigated by case investigators. During investigation some cases which are initially determined by the State to be in the Allegheny County jurisdiction may change, which can account for differences between publication of the files on the number of cases, deaths and tests. Additionally, information is not always reported to the State in a timely manner, delays can range from days to weeks, which can also account for discrepancies between previous and current files. Test and Case information will be updated daily. This resource contains individuals who received a COVID-19 test and individuals whom are probable cases. Every day, these records are overwritten with updates. Each row in the data reflects a person that is tested, not tests that are conducted. People that are tested more than once will have their testing and case data updated using the following rules:

    1. Positive tests overwrite negative tests.
    2. Polymerase chain reaction (PCR) tests overwrite antibody or antigen (AG) tests.
    3. The first positive PCR test is never overwritten. Data collected from additional tests do not replace the first positive PCR test.

    Note: On April 4th 2022 the Pennsylvania Department of Health no longer required labs to report negative AG tests. Therefore aggregated counts that included AG tests have been removed from the Municipality/Neighborhood files going forward. Versions of this data up to this cut-off have been retained as archived files.

    Individual Test information is also updated daily. This resource contains the details and results of individual tests along with demographic information of the individual tested. Only PCR and AG tests are included. Every day, these records are overwritten with updates. This resource should be used to determine positivity rates.

    The remaining datasets provide statistics on death demographics. Demographic, municipality and neighborhood information for deaths are reported on a weekly schedule and are not included with individual cases or tests. This has been done to protect the privacy and security of individuals and their families in accordance with the Health Insurance Portability and Accountability Act (HIPAA). Municipality or City of Pittsburgh Neighborhood is based off the geocoded home address of the individual tested.

    Individuals whose home address is incomplete may not be in Allegheny County but whose temporary residency, work or other mitigating circumstance are determined to be in Allegheny County by the Pennsylvania Department of Health are counted as "Undefined".

    Since the start of the pandemic, the ACHD has mapped every day’s COVID tests, cases, and deaths to their Allegheny County municipality and neighborhood. Tests were mapped to patient address, and if this was not available, to the provider location. This has recently resulted in apparent testing rates that exceeded the populations of various municipalities -- mostly those with healthcare providers. As this was brought to our attention, the health department and our data partners began researching and comparing methods to most accurately display the data. This has led us to leave those with missing home addresses off the map. Although these data will still appear in test, case and death counts, there will be over 20,000 fewer tests and almost 1000 fewer cases on the map. In addition to these map changes, we have identified specific health systems and laboratories that had data uploading errors that resulted in missing locations, and are working with them to correct these errors.

    Due to minor discrepancies in the Municipal boundary and the City of Pittsburgh Neighborhood files individuals whose City Neighborhood cannot be identified are be counted as “Undefined (Pittsburgh)”.

    On May 19, 2023, with the rescinding of the COVID-19 public health emergency, changes in data and reporting mechanisms prompted a change to an annual data sharing schedule for tests, cases, hospitalizations, and deaths. Dates for annual release are TBD. The weekly municipal counts and individual data produced before this changed are maintained as archive files.

    Support for Health Equity datasets and tools provided by Amazon Web Services (AWS) through their Health Equity Initiative.

  15. Australian Natural Products dataset

    • researchdata.edu.au
    • data.csiro.au
    datadownload
    Updated Jun 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    David Collins; Don McGilvery; Katherine Locock; Alex Shmaylov; Simon Saubern (2025). Australian Natural Products dataset [Dataset]. http://doi.org/10.25919/DH87-B149
    Explore at:
    datadownloadAvailable download formats
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    CSIROhttp://www.csiro.au/
    Authors
    David Collins; Don McGilvery; Katherine Locock; Alex Shmaylov; Simon Saubern
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Australia
    Description

    A continuation of the "Phytochemistry of Australian Plants" database compiled by David Collins and Don McGilvery. Contains chemical structures, references, species names, with persistent identifiers to the literature and Atlas of Living Australia (ALA) for geographical distributions. The current curation effort here adds DOIs/ISBNs/ISSNs for ~80% of references, persistent IDs for all species or genus to the ALA or other datasets, and validated structures (smiles) for ~70% of structures. No new entries have been added since the last update to the original database in 2022. Change log is in the README file.

    Data provided here was obtained by the listed authors on linked publications, and these authors may have no association with CSIRO. CSIRO acknowledges that the publications linked here may contain Indigenous Cultural and Intellectual Property (ICIP), including traditional knowledge. CSIRO recognizes that First Nations peoples have the right to control, own and maintain their ICIP in accordance with Article 31 of the United Nations Declaration on the Rights of Indigenous Peoples. Users of this dataset may need to obtain permission from First Nations peoples for use of the information in linked publications. Users intending to collect and use biological specimens containing the compounds described in the dataset may also require permission of First Nations peoples, and may require permits and access permission from landholders. Recognizing that any ICIP in the linked publications is already publicly available but that the publications are not readily accessible by First Nations peoples, CSIRO is committed to finding ways to make the ICIP in these publications more findable and accessible to the First Nations communities from which the knowledge was originally obtained. Users should be aware that because of the historical context of some of the linked publications, they may contain words, descriptions, images or terms which may be culturally sensitive and/or offensive and that reflect authors’ views, or those of the period in which the content was created but may not be considered appropriate today. If First Nations people identify content within this dataset that they consider breaches cultural protocols they are encouraged to contact CSIRO on csiroenquiries@csiro.au or +61 3 9545 2176 to request its removal from the dataset. Please note that while CSIRO is able to administer the data housed within this dataset, this control does not extend to the associated publications. Requests to remove publications should be directed to the associated publishing company. Lineage: Original data extracted in 2022 from https://fms05.filemakerstudio.com.au/fmi/webd?homeurl=http://www.monash.edu/#PhytoChem by kind permission of David Collins and Don McGilvery.

  16. Permanent Residents – Monthly IRCC Updates

    • open.canada.ca
    • data.amerigeoss.org
    • +1more
    csv, xlsx
    Updated Aug 22, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Immigration, Refugees and Citizenship Canada (2025). Permanent Residents – Monthly IRCC Updates [Dataset]. https://open.canada.ca/data/en/dataset/f7e5498e-0ad8-4417-85c9-9b8aff9b9eda
    Explore at:
    xlsx, csvAvailable download formats
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    Immigration, Refugees and Citizenship Canadahttp://www.cic.gc.ca/
    License

    Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 2015 - Jun 30, 2025
    Description

    People who have been granted permanent resident status in Canada. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--“ and all other values are rounded to the nearest multiple of 5. This may result to the sum of the figures not equating to the totals indicated.

  17. F

    Portuguese Product Image OCR Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Portuguese Product Image OCR Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/portuguese-product-image-ocr-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the Portuguese Product Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Portuguese language.

    Dataset Contain & Diversity:

    Containing a total of 2000 images, this Portuguese OCR dataset offers diverse distribution across different types of front images of Products. In this dataset, you'll find a variety of text that includes product names, taglines, logos, company names, addresses, product content, etc. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.

    To ensure the diversity of the dataset and to build a robust text recognition model we allow limited (less than five) unique images from a single resource. Stringent measures have been taken to exclude any personally identifiable information (PII) and to ensure that in each image a minimum of 80% of space contains visible Portuguese text.

    Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, to build a balanced OCR dataset. The collection features images in portrait and landscape modes.

    All these images were captured by native Portuguese people to ensure the text quality, avoid toxic content and PII text. We used the latest iOS and Android mobile devices above 5MP cameras to click all these images to maintain the image quality. In this training dataset images are available in both JPEG and HEIC formats.

    Metadata:

    Along with the image data, you will also receive detailed structured metadata in CSV format. For each image, it includes metadata like image orientation, county, language, and device information. Each image is properly renamed corresponding to the metadata.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Portuguese text recognition models.

    Update & Custom Collection:

    We're committed to expanding this dataset by continuously adding more images with the assistance of our native Portuguese crowd community.

    If you require a custom product image OCR dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.

    Furthermore, we can annotate or label the images with bounding box or transcribe the text in the image to align with your specific project requirements using our crowd community.

    License:

    This Image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage the power of this product image OCR dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the Portuguese language. Your journey to enhanced language understanding and processing starts here.

  18. w

    Immigration system statistics data tables

    • gov.uk
    Updated Aug 21, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Home Office (2025). Immigration system statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables
    Explore at:
    Dataset updated
    Aug 21, 2025
    Dataset provided by
    GOV.UK
    Authors
    Home Office
    Description

    List of the data tables as part of the Immigration system statistics Home Office release. Summary and detailed data tables covering the immigration system, including out-of-country and in-country visas, asylum, detention, and returns.

    If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

    Accessible file formats

    The Microsoft Excel .xlsx files may not be suitable for users of assistive technology.
    If you use assistive technology (such as a screen reader) and need a version of these documents in a more accessible format, please email MigrationStatsEnquiries@homeoffice.gov.uk
    Please tell us what format you need. It will help us if you say what assistive technology you use.

    Related content

    Immigration system statistics, year ending June 2025
    Immigration system statistics quarterly release
    Immigration system statistics user guide
    Publishing detailed data tables in migration statistics
    Policy and legislative changes affecting migration to the UK: timeline
    Immigration statistics data archives

    Passenger arrivals

    https://assets.publishing.service.gov.uk/media/689efececc5ef8b4c5fc448c/passenger-arrivals-summary-jun-2025-tables.ods">Passenger arrivals summary tables, year ending June 2025 (ODS, 31.3 KB)

    ‘Passengers refused entry at the border summary tables’ and ‘Passengers refused entry at the border detailed datasets’ have been discontinued. The latest published versions of these tables are from February 2025 and are available in the ‘Passenger refusals – release discontinued’ section. A similar data series, ‘Refused entry at port and subsequently departed’, is available within the Returns detailed and summary tables.

    Electronic travel authorisation

    https://assets.publishing.service.gov.uk/media/689efd8307f2cc15c93572d8/electronic-travel-authorisation-datasets-jun-2025.xlsx">Electronic travel authorisation detailed datasets, year ending June 2025 (MS Excel Spreadsheet, 57.1 KB)
    ETA_D01: Applications for electronic travel authorisations, by nationality ETA_D02: Outcomes of applications for electronic travel authorisations, by nationality

    Entry clearance visas granted outside the UK

    https://assets.publishing.service.gov.uk/media/68b08043b430435c669c17a2/visas-summary-jun-2025-tables.ods">Entry clearance visas summary tables, year ending June 2025 (ODS, 56.1 KB)

    https://assets.publishing.service.gov.uk/media/689efda51fedc616bb133a38/entry-clearance-visa-outcomes-datasets-jun-2025.xlsx">Entry clearance visa applications and outcomes detailed datasets, year ending June 2025 (MS Excel Spreadsheet, 29.6 MB)
    Vis_D01: Entry clearance visa applications, by nationality and visa type
    Vis_D02: Outcomes of entry clearance visa applications, by nationality, visa type, and outcome

    Additional data relating to in country and overseas Visa applications can be fo

  19. F

    Dutch Handwritten Sticky Notes OCR Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Dutch Handwritten Sticky Notes OCR Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/dutch-sticky-notes-ocr-image-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the Dutch Sticky Notes Image Dataset - a diverse and comprehensive collection of handwritten text images carefully curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Dutch language.

    Dataset Contain & Diversity:

    Containing more than 2000 images, this Dutch OCR dataset offers a wide distribution of different types of sticky note images. Within this dataset, you'll discover a variety of handwritten text, including quotes, sentences, and individual words on sticky notes. The images in this dataset showcase distinct handwriting styles, fonts, font sizes, and writing variations.

    To ensure diversity and robustness in training your OCR model, we allow limited (less than three) unique images in a single handwriting. This ensures we have diverse types of handwriting to train your OCR model on. Stringent measures have been taken to exclude any personally identifiable information (PII) and to ensure that in each image a minimum of 80% of space contains visible Dutch text.

    The images have been captured under varying lighting conditions, including day and night, as well as different capture angles and backgrounds. This diversity helps build a balanced OCR dataset, featuring images in both portrait and landscape modes.

    All these sticky notes were written and images were captured by native Dutch people to ensure text quality, prevent toxic content, and exclude PII text. We utilized the latest iOS and Android mobile devices with cameras above 5MP to maintain image quality. Images in this training dataset are available in both JPEG and HEIC formats.

    Metadata:

    In addition to the image data, you will receive structured metadata in CSV format. For each image, this metadata includes information on image orientation, country, language, and device details. Each image is correctly named to correspond with the metadata.

    This metadata serves as a valuable resource for understanding and characterizing the data, aiding informed decision-making in the development of Dutch text recognition models.

    Update & Custom Collection:

    We are committed to continually expanding this dataset by adding more images with the help of our native Dutch crowd community.

    If you require a customized OCR dataset containing sticky note images tailored to your specific guidelines or device distribution, please don't hesitate to contact us. We have the capability to curate specialized data to meet your unique requirements.

    Additionally, we can annotate or label the images with bounding boxes or transcribe the text in the images to align with your project's specific needs using our crowd community.

    License:

    This image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage this sticky notes image OCR dataset to enhance the training and performance of text recognition, text detection, and optical character recognition models for the Dutch language. Your journey to improved language understanding and processing begins here.

  20. F

    Turkish Newspaper, Magazine, and Books OCR Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Turkish Newspaper, Magazine, and Books OCR Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/turkish-newspaper-book-magazine-ocr-image-dataset
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the Turkish Newspaper, Books, and Magazine Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Turkish language.

    Dataset Contain & Diversity:

    Containing a total of 5000 images, this Turkish OCR dataset offers an equal distribution across newspapers, books, and magazines. Within, you'll find a diverse collection of content, including articles, advertisements, cover pages, headlines, call outs, and author sections from a variety of newspapers, books, and magazines. Images in this dataset showcases distinct fonts, writing formats, colors, designs, and layouts.

    To ensure the diversity of the dataset and to build robust text recognition model we allow limited (less than five) unique images from a single resource. Stringent measures have been taken to exclude any personal identifiable information (PII), and in each image a minimum of 80% space is contain visible Turkish text.

    Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, further enhancing dataset diversity. The collection features images in portrait and landscape modes.

    All these images were captured by native Turkish people to ensure the text quality, avoid toxic content and PII text. We used latest iOS and android mobile devices above 5MP camera to click all these images to maintain the image quality. In this training dataset images are available in both JPEG and HEIC formats.

    Metadata:

    Along with the image data you will also receive detailed structured metadata in CSV format. For each image it includes metadata like device information, source type like newspaper, magazine or book image, and image type like portrait or landscape etc. Each image is properly renamed corresponding to the metadata.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Turkish text recognition models.

    Update & Custom Collection:

    We're committed to expanding this dataset by continuously adding more images with the assistance of our native Turkish crowd community.

    If you require a custom dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.

    Furthermore, we can annotate or label the images with bounding box or transcribe the text in the image to align with your specific requirements using our crowd community.

    License:

    This Image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage the power of this image dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the Turkish language. Your journey to enhanced language understanding and processing starts here.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4694269

Effect of suicide rates on life expectancy dataset

Explore at:
Dataset updated
Apr 16, 2021
Dataset authored and provided by
Filip Zoubek
License

Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically

Description

Effect of suicide rates on life expectancy dataset

Abstract In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, one of the main causes of death is the aforementioned suicide, therefore, this experiment is dealing with the question of how much suicide rates affects the statistics of average life expectancy. The experiment takes two datasets, one with the number of suicides and life expectancy in the second one and combine data into one dataset. Subsequently, I try to find any patterns and correlations among the variables and perform statistical test using simple regression to confirm my assumptions.

Data

The experiment uses two datasets - WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were firstly appropriately preprocessed. The final merged dataset to the experiment has 13 variables, where country and year are used as index: Country, Year, Suicides number, Life expectancy, Adult Mortality, which is probability of dying between 15 and 60 years per 1000 population, Infant deaths, which is number of Infant Deaths per 1000 population, Alcohol, which is alcohol, recorded per capita (15+) consumption, Under-five deaths, which is number of under-five deaths per 1000 population, HIV/AIDS, which is deaths per 1 000 live births HIV/AIDS, GDP, which is Gross Domestic Product per capita, Population, Income composition of resources, which is Human Development Index in terms of income composition of resources, and Schooling, which is number of years of schooling.

LICENSE

THE EXPERIMENT USES TWO DATASET - WHO SUICIDE STATISTICS AND WHO LIFE EXPECTANCY, WHICH WERE COLLEECTED FROM WHO AND UNITED NATIONS WEBSITE. THEREFORE, ALL DATASETS ARE UNDER THE LICENSE ATTRIBUTION-NONCOMMERCIAL-SHAREALIKE 3.0 IGO (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

[1] https://www.kaggle.com/szamil/who-suicide-statistics

[2] https://www.kaggle.com/kumarajarshi/life-expectancy-who

Search
Clear search
Close search
Google apps
Main menu