34 datasets found
  1. War and Peace

    • kaggle.com
    zip
    Updated Apr 6, 2024
    Cite
    Mohamadreza Momeni (2024). War and Peace [Dataset]. https://www.kaggle.com/datasets/imtkaggleteam/war-and-peace/code
    Available download formats: zip (86102 bytes)
    Dataset updated
    Apr 6, 2024
    Authors
    Mohamadreza Momeni
    Description

    Data Description: Since 1800, more than 37 million people worldwide have died while actively fighting in wars.

    The number would be much higher still if it also considered the civilians who died due to the fighting, the increased number of deaths from hunger and disease resulting from these conflicts, and the deaths in smaller conflicts that are not considered wars.

    Wars are also terrible in many other ways: they make people’s lives insecure, lower their living standards, destroy the environment, and, if fought between countries armed with nuclear weapons, can be an existential threat to humanity.

    Looking at the news alone, it can be difficult to understand whether more or fewer people are dying as a result of war than in the past. One has to rely on statistics that are carefully collected so that they can be compared over time.

    How many wars are avoided, and whether the trend of fewer deaths in them continues, is up to our own actions. Conflict deaths recently increased in the Middle East, Africa, and Europe, stressing that the future of these trends is uncertain.

    This dataset contains 6 CSV files in one zip archive. The contents should be self-explanatory, but if you have any questions, feel free to ask. Good luck.

    This dataset belongs to Our World in Data. By: Bastian Herre, Lucas Rodés-Guirao, Max Roser, Joe Hasell and Bobbie Macdonald

  2. Mlomp HDSS INDEPTH Core Dataset 1985 - 2014 (Release 2017) - Senegal

    • catalog.ihsn.org
    Updated Sep 19, 2018
    Cite
    Valérie Delaunay (2018). Mlomp HDSS INDEPTH Core Dataset 1985 - 2014 (Release 2017) - Senegal [Dataset]. https://catalog.ihsn.org/catalog/study/SEN_1985-2014_INDEPTH-MHDSS_v01_M
    Dataset updated
    Sep 19, 2018
    Dataset provided by
    Gilles Pison
    Valérie Delaunay
    Cheikh Sokhna
    Laurence Fleury
    El-Hadji Ciré Konko Bâ
    Time period covered
    1985 - 2014
    Area covered
    Senegal
    Description

    Abstract

    In 1985 the population and health observatory was established at Mlomp, in the region of Ziguinchor, in southern Senegal (see map). The objective was to complement the two rural population observatories then existing in the country, Bandafassi, in the south-east, and Niakhar, in the centre-west, with a third observatory in a region - the south-west of the country (Casamance) - whose history, ethnic composition and economic situation were quite different from those of the regions where the first two observatories were located. It was expected that measuring the demographic levels and trends on those three sites would provide better coverage of the demographic and epidemiological diversity of the country.

    Following a population census in 1984-1985, demographic events and causes of death have been monitored yearly. During the initial census, all women were interviewed concerning the birth and survival of their children. Since 1985, yearly censuses, usually conducted in January-February, have been recording demographic data, including all births, deaths, and migrations. The completeness and accuracy of dates of birth and death are cross-checked against those of registers of the local maternity ward (_95% of all births) and dispensary (all deaths are recorded, including those occurring outside the area), respectively. The study area comprises 11 villages with approximately 8000 inhabitants, mostly Diola. Mlomp is located in the Department of Oussouye, Region of Ziguinchor (Casamance), 500 km south of Dakar.

    On 1 January 2000 the Mlomp area included a population of 7,591 residents living in 11 villages. The population density was 108 people per square kilometre. The population belongs to the Diola ethnic group, and the religion is predominantly animist, with a large minority of Christians and a few Muslims. Though low, the educational level - in 2000, 55% of women aged 15-49 had been to school (for at least one year) - is definitely higher than at Bandafassi. The population also benefits from much better health infrastructure and programmes. Since 1961, the area under study has been equipped with a private health centre run by French Catholic nurses and, since 1968, a village maternity centre where most women give birth. The vast majority of the children are totally immunized and involved in a growth-monitoring programme (Pison et al.,1993; Pison et al., 2001).

    Geographic coverage

    The Mlomp DSS site, about 500 km from the capital, Dakar, in Senegal, lies between latitudes 12°36' and 12°32'N and longitudes 16°33' and 16°37'W, at an altitude ranging from 0 to 20 m above sea level. It is in the region of Ziguinchor, Département of Oussouye (Casamance), in southwest Senegal. It is located 50 km west of the city of Ziguinchor and 25 km north of the border with Guinea-Bissau. It covers about half the Arrondissement of Loudia-Ouolof. The Mlomp DSS site is about 11 km × 7 km and has an area of 70 km². Villages are households grouped in a circle with a 3-km diameter and surrounded by lands that are flooded during the rainy season and cultivated for rice. There is still no electricity.

    Analysis unit

    Individual

    Universe

    At the census, a person was considered a member of the compound if the head of the compound declared it to be so. This definition was broad and resulted in a de jure population under study. Thereafter, a criterion was used to decide whether and when a person was to be excluded or included in the population.

    A person was considered to exit from the study population through either death or emigration. Part of the population of Mlomp engages in seasonal migration, with seasonal migrants sometimes remaining 1 or 2 years outside the area before returning. A person who is absent for two successive yearly rounds, without returning in between, is regarded as having emigrated and no longer resident in the study population at the date of the second round. This definition results in the inclusion of some vital events that occur outside the study area. Some births, for example, occur to women classified in the study population but physically absent at the time of delivery, and these births are registered and included in the calculation of rates, although information on them is less accurate. Special exit criteria apply to babies born outside the study area: they are considered emigrants on the same date as their mother.
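
    The exit rule for absent residents can be illustrated with a minimal sketch. The function name and the input layout (one `(round_date, present)` pair per yearly round) are assumptions made for illustration and are not part of the dataset documentation.

```python
def emigration_round(rounds):
    """Return the round at which a person counts as emigrated, or None.

    `rounds` is a chronologically ordered list of (round_date, present) pairs
    (hypothetical layout). A person absent at two successive yearly rounds,
    without returning in between, is treated as emigrated at the second round.
    """
    previous_absent = False
    for round_date, present in rounds:
        if not present and previous_absent:
            return round_date
        previous_absent = not present
    return None
```

    A seasonal migrant who misses one round and is present at the next stays in the study population under this rule.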

    A new person enters the study population either through birth to a woman of the study population or through immigration. Information on immigrants is collected when the list of compounds of a village is checked ("Are there new compounds or new families who settled since the last visit?") or when the list of members of a compound is checked ("Are there new persons in the compound since the last visit?"). Some immigrants are villagers who left the area several years before and were excluded from the study population. Information is collected to determine in which compound they were previously registered, to match the new and old information.

    Information is routinely collected on movements from one compound to another within the study area. Some categories of the population, such as older widows or orphans, frequently move for short periods of time and live in between several compounds, and they may be considered members of these compounds or of none. As a consequence, their movements are not always declared.

    Kind of data

    Event history data

    Frequency of data collection

    One round of data collection took place annually, except in 1987 and 2008.

    Sampling procedure

    No sampling is done

    Sampling deviation

    None

    Mode of data collection

    Proxy Respondent [proxy]

    Research instrument

    List of questionnaires:
    - Household book (used to register information needed to define out-migrations)
    - Delivery questionnaire (used to register information from the Mlomp dispensary)
    - New household questionnaire
    - New member questionnaire
    - Marriage and divorce questionnaire
    - Birth and marital histories questionnaire (for a new member)
    - Death questionnaire (used to register the date of death)

    Cleaning operations

    On data entry, data consistency and plausibility were checked by 455 data validation rules at database level. If a data validation failure was due to a data collection error, the questionnaire was referred back to the field for revisit and correction. If the error was due to data inconsistencies that could not be directly traced to a data collection error, the record was referred to the data quality team under the supervision of the senior database scientist. This team could request further field-level investigation by a team of trackers or could correct the inconsistency directly at database level.

    No imputations were done on the resulting micro data set, except for:

    a. If an out-migration (OMG) event is followed by a homestead entry event (ENT) and the gap between OMG event and ENT event is greater than 180 days, the ENT event was changed to an in-migration event (IMG).
    b. If an out-migration (OMG) event is followed by a homestead entry event (ENT) and the gap between OMG event and ENT event is less than 180 days, the OMG event was changed to a homestead exit event (EXT) and the ENT event date changed to the day following the original OMG event.
    c. If a homestead exit event (EXT) is followed by an in-migration event (IMG) and the gap between the EXT event and the IMG event is greater than 180 days, the EXT event was changed to an out-migration event (OMG).
    d. If a homestead exit event (EXT) is followed by an in-migration event (IMG) and the gap between the EXT event and the IMG event is less than 180 days, the IMG event was changed to a homestead entry event (ENT) with a date equal to the day following the EXT event.
    e. If the last recorded event for an individual is homestead exit (EXT) and this event is more than 180 days prior to the end of the surveillance period, then the EXT event is changed to an out-migration event (OMG).
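
    Rules a to e above can be sketched as a single pass over one individual's event list. This is an illustrative reading, not the data team's actual implementation: the function name and the `(code, date)` tuple layout are assumptions, and a gap of exactly 180 days, which the rules do not cover, is left unchanged.

```python
from datetime import date, timedelta

GAP = timedelta(days=180)
ONE_DAY = timedelta(days=1)

def recode_events(events, end_of_surveillance):
    """Apply rules a-e to a chronologically sorted list of (code, date) tuples."""
    out = [list(event) for event in events]
    for i in range(len(out) - 1):
        (code, day), (next_code, next_day) = out[i], out[i + 1]
        gap = next_day - day
        if code == "OMG" and next_code == "ENT":
            if gap > GAP:
                out[i + 1][0] = "IMG"                    # rule a
            elif gap < GAP:
                out[i][0] = "EXT"                        # rule b
                out[i + 1][1] = day + ONE_DAY
        elif code == "EXT" and next_code == "IMG":
            if gap > GAP:
                out[i][0] = "OMG"                        # rule c
            elif gap < GAP:
                out[i + 1] = ["ENT", day + ONE_DAY]      # rule d
    # rule e: a trailing EXT long before the end of surveillance becomes OMG
    if out and out[-1][0] == "EXT" and end_of_surveillance - out[-1][1] > GAP:
        out[-1][0] = "OMG"
    return [tuple(event) for event in out]
```

    For example, an OMG on 2001-01-01 followed by an ENT on 2002-01-01 (a gap of 365 days) leaves the OMG in place and turns the ENT into an IMG.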

    In the case of the village that was added (enumerated) in 2006, some individuals may have out-migrated from the original surveillance area and settled in the new village prior to the first enumeration. Where the records of such individuals have been linked, an individual can legitimately have an out-migration event (OMG) followed by an enumeration event (ENU). In a few of these cases a homestead exit event (EXT) was followed by an enumeration event. In these instances the EXT events were changed to out-migration events (OMG).

    Response rate

    On average, the response rate is about 99% for each round over the years.

    Sampling error estimates

    Not applicable

    Data appraisal

    CenterId  MetricTable       QMetric      Illegal  Legal   Total   Metric  RunDate
    SN012     MicroDataCleaned  Starts                        18756           2017-05-19 00:00
    SN012     MicroDataCleaned  Transitions  0        45136   45136   0       2017-05-19 00:00
    SN012     MicroDataCleaned  Ends                          18756           2017-05-19 00:00
    SN012     MicroDataCleaned  SexValues    38       45098   45136   0       2017-05-19 00:00
    SN012     MicroDataCleaned  DoBValues    204      44932   45136   0       2017-05-19 00:00

  3. Provisional COVID-19 Deaths: Focus on Ages 0-18 Years

    • data.cdc.gov
    • data.virginia.gov
    • +5 more
    csv, xlsx, xml
    Updated Jun 28, 2023
    Cite
    NCHS/DVS (2023). Provisional COVID-19 Deaths: Focus on Ages 0-18 Years [Dataset]. https://data.cdc.gov/widgets/nr4s-juj3?mobile_redirect=true
    Available download formats: csv, xml, xlsx
    Dataset updated
    Jun 28, 2023
    Dataset authored and provided by
    NCHS/DVS
    License

    https://www.usa.gov/government-works

    Description

    Effective June 28, 2023, this dataset will no longer be updated. Similar data are accessible from CDC WONDER (https://wonder.cdc.gov/mcd-icd10-provisional.html).

    Deaths involving coronavirus disease 2019 (COVID-19) with a focus on ages 0-18 years in the United States.

  4. War and Peace

    • kaggle.com
    Updated Apr 10, 2024
    Cite
    willian oliveira gibin (2024). War and Peace [Dataset]. http://doi.org/10.34740/kaggle/dsv/8085402
    Dataset updated
    Apr 10, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    willian oliveira gibin
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    These graphs were created with Our World in Data and R:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2Fa1f999a8ca1714c6df81f8b70373ae43%2Fgraph1.png?generation=1712777155869876&alt=media

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F3ce71761131475698fec2e478beaa257%2Fgraph4.png?generation=1712777161405486&alt=media

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F0fd6a9ed4c6fb0bbbecc6d135f164395%2Fgraph2.png?generation=1712777167968903&alt=media

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2Ff407ebcc2edc8fcb51d3e1e3a63e323f%2Fgraph3.png?generation=1712777174298361&alt=media

    Since 1800, the specter of war has claimed the lives of over 37 million people worldwide. This staggering statistic represents only those who actively fought in wars. The true cost of war is far greater when we consider the countless civilian lives lost, the increased mortality from hunger and disease in war-torn regions, and the untold suffering of those displaced from their homes.

    Beyond the loss of life, war leaves a trail of destruction in its wake:

    - Insecurity and instability: Wars create an environment of fear and uncertainty, making it difficult for people to live their lives in peace.
    - Reduced living standards: War-torn countries often suffer from economic collapse, leading to widespread poverty and hunger.
    - Environmental damage: Wars can cause extensive damage to the natural environment, with long-term consequences for both people and wildlife.
    - The threat of nuclear annihilation: In the age of nuclear weapons, the potential for war to escalate into a global catastrophe is a constant threat to humanity.

    While each war is a tragedy in its own right, data suggests that fewer people have died in conflicts in recent decades than in most of the 20th century. Additionally, there has been a trend towards more peaceful relations between and within countries.

    However, the future of these trends is uncertain. Recent conflicts in the Middle East, Africa, and Europe have seen a resurgence in war-related deaths, underscoring the need for continued vigilance and effort to prevent future conflicts.

    On this page, you will find data, visualizations, and analysis on the prevalence of war and peace between and within countries, and how this has changed over time. This information is essential for understanding the true cost of war and the imperative for building a more peaceful world.

    Join us in the fight for peace. By raising awareness of the human cost of war and supporting initiatives that promote peace and cooperation, we can help to create a future where war is no longer a reality.

    Key Points:

    - Over 37 million people have died in wars since 1800.
    - The true cost of war includes civilian deaths, increased mortality from hunger and disease, and displacement.
    - War has devastating consequences for human society and the environment.
    - There has been a trend towards fewer war-related deaths in recent decades, but the future of this trend is uncertain.
    - We must continue to work towards a more peaceful world.

  5. Vadu HDSS INDEPTH Core Dataset 2009 - 2015 (Release 2017) - India

    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Dr. Siddhivinayak Hirve (Founding Investigator: from 2002-2009) (2019). Vadu HDSS INDEPTH Core Dataset 2009 - 2015 (Release 2017) - India [Dataset]. https://catalog.ihsn.org/catalog/study/IND_2009-2015_INDEPTH-VHDSS_v01_M
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    Dr. Siddhivinayak Hirve (Founding Investigator: from 2002-2009)
    Dr. Sanjay Juvekar (Founding Co-Investigator and presently Investigator: 2002 to date)
    Time period covered
    2009 - 2015
    Area covered
    India
    Description

    Abstract

    Vadu Rural Health Program, KEM Hospital Research Centre Pune, has a rich tradition in health care and development, having been at the forefront of needs-based, issue-driven research for almost 35 years. During the 1980s and 1990s, research at Vadu focused on mother and child, with epidemiological and social science research exploring low birth weight, child survival, maternal mortality, safe abortion and domestic violence. The research portfolio has since expanded to include adult health and aging, non-communicable and communicable diseases and, in recent years, clinical trials. This began with the establishment of the Health and Demographic Surveillance System at Vadu (HDSS Vadu) in August 2002, which seeks to establish a quasi-experimental design setting to allow evaluation of the impact of health interventions as well as to monitor secular trends in diseases, risk factors and health behavior.

    The term "demographic surveillance" means to keep close track of the population dynamics. Vadu HDSS deals with keeping track of health issues and demographic changes in Vadu rural health program (VRHP) area. It is one of the most promising projects of national relevance that aims at establishing a quasi-experimental intervention research setting with the following objectives:
    1) To create a longitudinal data base for efficient service delivery, future research, and linking all past micro-studies in Vadu area
    2) Monitoring trends in public health problems
    3) Keeping track of population dynamics
    4) Evaluating intervention services

    This dataset contains the events of all individuals ever resident during the study period (1 Jan. 2009 to 31 Dec. 2015).

    Geographic coverage

    Vadu HDSS falls in two administrative blocks: (1) Shirur and (2) Haweli of Pune district in Maharashtra in western India. It covers an area of approximately 232 square kilometers.

    Analysis unit

    Individual

    Universe

    Vadu HDSS covers as many as 50,000 households with a population of 140,000 spread across 22 villages.

    Kind of data

    Event history data

    Frequency of data collection

    Two rounds per year

    Sampling procedure

    The study area is the Vadu area, comprising 22 villages in two administrative blocks. This area was selected because it is the primary coverage area of the Vadu Rural Health Program, which has been operating for more than four decades. Every individual household is included in the HDSS. No sampling strategy is employed, as 100% population coverage in the area is expected.

    Mode of data collection

    Proxy Respondent [proxy]

    Research instrument

    The language of communication is Marathi or Hindi. The form labels are multilingual (English and Marathi), but the data entered through the forms are in English only.

    The following forms were used:
    - Field Worker Checklist Form: The checklist provides a guideline to ensure that all the households are covered during the round and the events occurred in each household are captured.
    - Enumeration Form: To capture the population details at the start of the HDSS or any addition of villages afterwards.
    - Pregnancy Form: To capture pregnancy details of women in the age group 15 to 49.
    - Birth Form: To capture the details of the birth events.
    - Inmigration Form: To capture inward population movement from outside the HDSS area and also for movement within the HDSS area.
    - Outmigration Form: To capture outward population movement from inside the HDSS area and also for movement within the HDSS area.
    - Death Form: To capture death events.

    Cleaning operations

    Entered data undergo a data cleaning process. During the cleaning process, all erroneous data are either corrected in consultation with the data QC team or the respective forms are sent back to the field for re-collection of correct data. Data editors have access to the raw dataset to make the necessary edits after corrected data are brought from the field.

    All individuals whose enumeration (ENU), in-migration (IMG) or birth (BTH) occurred before the left censoring date (2009-01-01) and who had not out-migrated (OMG) or died (DTH) before that date are included in the dataset as Enumeration (ENU) events with EventDate set to the left censoring date (2009-01-01). The actual date of observation of the event (ENU, BTH, IMG) is retained in the dataset as the observation date for these left-censored ENU events. An individual is dropped from the dataset if their end event (OMG or DTH) is prior to the left censoring date (2009-01-01).
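
    The left-censoring step can be sketched as follows. This is a simplified illustration that assumes a single residency episode per individual; the function name, the tuple input layout and the `ObservationDate` key are hypothetical, while `EventDate` and the event codes come from the description above.

```python
from datetime import date

LEFT_CENSOR = date(2009, 1, 1)
START_CODES = {"ENU", "IMG", "BTH"}
END_CODES = {"OMG", "DTH"}

def left_censor(events):
    """Left-censor one individual's sorted (code, event_date) tuples.

    Returns a list of row dicts, or an empty list if the individual is dropped.
    """
    before = [e for e in events if e[1] < LEFT_CENSOR]
    after = [e for e in events if e[1] >= LEFT_CENSOR]
    # An end event before the censoring date drops the individual.
    if any(code in END_CODES for code, _ in before):
        return []
    rows = []
    starts = [e for e in before if e[0] in START_CODES]
    if starts:
        # The start is rewritten as ENU on the censoring date;
        # the original date is kept as the observation date.
        rows.append({"EventCode": "ENU", "EventDate": LEFT_CENSOR,
                     "ObservationDate": starts[0][1]})
    rows.extend({"EventCode": code, "EventDate": day} for code, day in after)
    return rows
```

    Individuals with several residency episodes spanning the censoring date would need episode-level handling that this sketch does not attempt.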

    Response rate

    On average, the response rate is 99.99% in all rounds over the years.

    Sampling error estimates

    Not Applicable

    Data appraisal

    Data are cleaned to an acceptable level against the standard data rules using the Pentaho Data Integration Community Edition (PDI CE) tool. After the cleaning process, quality metrics were as follows:

    CentreId  MetricTable       QMetric      Illegal  Legal   Total   Metric  RunDate
    IN021     MicroDataCleaned  Starts       1        301112  301113  0.      2017-05-31 20:06
    IN021     MicroDataCleaned  Transitions  0        667010  667010  0.      2017-05-31 20:07
    IN021     MicroDataCleaned  Ends                          301113          2017-05-31 20:07
    IN021     MicroDataCleaned  SexValues    29       666981  667010  0.      2017-05-31 20:07
    IN021     MicroDataCleaned  DoBValues    575      666435  667010  0.      2017-05-31 20:07

    Note: Except for lower under-five mortality in 2012 and lower adult mortality among females in 2013, all other estimates are within the expected range. Data underwent additional review in terms of electronic data capture, data cleaning and management to look for reasons for lower under-five mortality rates in 2013 and lower female adult mortality in 2013. The additional review returned marginally higher rates, which supports the validity of the collected data. Further field-related review of the 2012 and 2013 data is underway, and any revisions to published data or figures will be shared at a later stage.

  6. 🌎 War and Peace

    • kaggle.com
    zip
    Updated Jul 18, 2024
    Cite
    mexwell (2024). 🌎 War and Peace [Dataset]. https://www.kaggle.com/datasets/mexwell/war-and-peace/code
    Available download formats: zip (45806 bytes)
    Dataset updated
    Jul 18, 2024
    Authors
    mexwell
    Description

    Since 1800, more than 37 million people worldwide have died while actively fighting in wars.

    The number would be much higher still if it also considered the civilians who died due to the fighting, the increased number of deaths from hunger and disease resulting from these conflicts, and the deaths in smaller conflicts that are not considered wars.

    Wars are also terrible in many other ways: they make people’s lives insecure, lower their living standards, destroy the environment, and, if fought between countries armed with nuclear weapons, can be an existential threat to humanity.

    Looking at the news alone, it can be difficult to understand whether more or fewer people are dying as a result of war than in the past. One has to rely on statistics that are carefully collected so that they can be compared over time.

    While every war is a tragedy, the data suggests that fewer people died in conflicts in recent decades than in most of the 20th century. Countries have also built more peaceful relations between and within them.

    How many wars are avoided, and whether the trend of fewer deaths in them continues, is up to our own actions. Conflict deaths recently increased in the Middle East, Africa, and Europe, stressing that the future of these trends is uncertain.

    Original Data

    Acknowledgement

    Photo by UX Gun on Unsplash

  7. Data from: Estimated Deaths, Intensive Care Admissions and Hospitalizations...

    • figshare.com
    xlsx
    Updated Feb 28, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    David Fisman (2023). Estimated Deaths, Intensive Care Admissions and Hospitalizations Averted in Canada during the COVID-19 Pandemic [Dataset]. http://doi.org/10.6084/m9.figshare.14036549.v3
    Available download formats: xlsx
    Dataset updated
    Feb 28, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    David Fisman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Canada
    Description

    These datasets explore disparities in COVID-19 mortality observed in the US and Canada between January 2020 and early March 2021. Table 1 provides counts of deaths, hospitalizations, ICU admissions, and cases, by age, for Ontario, Canada (Canada's most populous province).

    Table 2 estimates deaths averted by Canada's response to the COVID-19 pandemic, relative to that in the United States, by "Canada-standardizing" the US epidemic (i.e., by applying US age-specific mortality to Canadian populations, in order to estimate the deaths that would have occurred in a Canadian pandemic with the same rates of death as have been observed in the US). Observed Canadian deaths are compared to "expected" deaths with a US-like response in order to estimate both deaths averted and SMR (Table 2).
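
    The "Canada-standardizing" step can be read as an indirect standardization, sketched below. The function names are hypothetical, the rates are taken per 100,000 population as an assumed unit, and the numbers in the usage example are invented for illustration, not taken from Table 2. SMR is computed as observed deaths divided by expected deaths.

```python
def expected_deaths(reference_rates_per_100k, population):
    """Deaths expected if the reference age-specific rates applied to this population."""
    return sum(
        reference_rates_per_100k[age] * population[age] / 100_000
        for age in population
    )

def averted_and_smr(observed_deaths, reference_rates_per_100k, population):
    """Return (deaths averted, standardized mortality ratio)."""
    expected = expected_deaths(reference_rates_per_100k, population)
    return expected - observed_deaths, observed_deaths / expected

# Invented illustrative inputs (not real data).
example_rates = {"0-17": 1, "18-64": 50, "65+": 1000}
example_population = {"0-17": 1_000_000, "18-64": 4_000_000, "65+": 1_000_000}
averted, smr = averted_and_smr(6005, example_rates, example_population)
```

    With these invented inputs the expected count is 12,010 deaths, so 6,005 observed deaths correspond to 6,005 deaths averted and an SMR of 0.5.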

    As Canadian age groups for purposes of death reporting are slightly different from those used in the US (e.g., 0-17 in the US vs. 0-19 in Canada), we reallocate Canadian deaths based on proportions of deaths occurring in 2-year age categories in Ontario (Table 1).

    Ontario age-specific case-fatality is used to inflate the deaths averted, in order to estimate cases averted. Ontario age-specific hospitalization and ICU risk (again derived from Table 1) are used to estimate hospitalizations and ICU admissions averted (Table 2).

    As of August 9, 2022, a new dataset has been added which applies the methodology described above to compare deaths in Canada to those in the United Kingdom, France, and Australia. Estimates of QALY loss, and healthcare costs averted, have also been added. Uncertainty bounds are estimated either as parametric confidence intervals, or as upper and lower bound 95% credible intervals through simulation (implemented using the random draw function in Microsoft Excel).

    Errors in confidence intervals for QALY losses in France and Australia corrected February 28, 2023.

  8. Asthma Deaths by County

    • data.chhs.ca.gov
    • data.ca.gov
    • +6 more
    csv, zip
    Updated Nov 6, 2025
    Cite
    California Department of Public Health (2025). Asthma Deaths by County [Dataset]. https://data.chhs.ca.gov/dataset/asthma-deaths-by-county
    Available download formats: csv (43300), zip
    Dataset updated
    Nov 6, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    This dataset contains counts and rates (per 1,000,000 residents) of asthma deaths among Californians statewide and by county. The data are stratified by age group (all ages, 0-17, 18+) and reported for 3-year periods. The data are derived from the California Death Statistical Master Files, which contain information collected from death certificates. All deaths with asthma coded as the underlying cause of death (ICD-10 CM J45 or J46) are included.
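
    As a rough illustration of the two definitions above, the sketch below selects deaths by underlying-cause code and converts a count into a rate per 1,000,000 residents. The function names are hypothetical, and how the published rates handle the 3-year pooling of deaths and population is not stated here, so the denominator is left to the caller.

```python
def is_asthma_death(underlying_cause_icd10):
    """True if the underlying cause of death is coded J45 or J46."""
    return underlying_cause_icd10.upper().startswith(("J45", "J46"))

def rate_per_million(deaths, residents):
    """Deaths per 1,000,000 residents for a matching count and population."""
    return deaths * 1_000_000 / residents
```

    For instance, 30 deaths in a population of 3,000,000 residents gives a rate of 10 per 1,000,000.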

  9. Social networks predict the life and death of honey bees - Data

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 15, 2021
    Cite
    Wild, Benjamin; Dormagen, David; Landgraf, Tim (2021). Social networks predict the life and death of honey bees - Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4438012
    Dataset updated
    Jan 15, 2021
    Dataset provided by
    Freie Universität Berlin
    Authors
    Wild, Benjamin; Dormagen, David; Landgraf, Tim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Interaction matrices and metadata used in "Social networks predict the life and death of honey bees"

    Preprint: Social networks predict the life and death of honey bees

    See the README file in bb_network_decomposition for example code.

    The following files are included:

    interaction_networks_20160729to20160827.h5

    The social interaction networks as a dense tensor and metadata.

    Keys:

    interactions: Tensor of shape (29, 2010, 2010, 9) (days x individuals x individuals x interaction_types). I_{d,i,j,t} = log(1 + x), where x is the number of interactions of type t between individuals i and j on recording day d. See the methods section of the paper for the interaction types.

    labels: Names of the 9 interaction types in the order they are stored in the interactions tensor.

    bee_ids: List of length 2010, mapping from sequential index used in the interaction tensor to the original BeesBook tag ID of the individual
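
    Because the stored values are log(1 + x), raw interaction counts can be recovered with the inverse transform. The sketch below uses plain nested lists as a stand-in for the tensor and hypothetical function names; it does not read the .h5 file itself.

```python
import math

def to_stored_value(count):
    """Forward transform used for the tensor entries: log(1 + x)."""
    return math.log1p(count)

def to_count(stored_value):
    """Inverse transform: x = exp(I) - 1."""
    return math.expm1(stored_value)

def interaction_count(tensor, day, i, j, interaction_type):
    """Recover the count for one (day, individual i, individual j, type) entry."""
    return to_count(tensor[day][i][j][interaction_type])
```

    The index order follows the documented shape (days x individuals x individuals x interaction_types); positions along the last axis correspond to the entries of `labels`, and positions along the individual axes map to tag IDs through `bee_ids`.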

    alive_bees_bayesian.csv

    This file contains the results of the Bayesian lifetime model with one row for each bee.

    Columns:

    bee_id: Numerical unique identifier for each individual.

    days_alive: Number of days the bee was determined to be alive. If the individual was still alive at the end of the recording, the number of days from the day she hatched until the end of the recording.

    death_observed: Boolean indicator whether the death occurred during the recording period.

    annotated_tagged_date: Hatch date of the individual, i.e. the date she was tagged.

    inferred_death_date: The death date as determined by the model.

    bee_daily_data.csv

    This file contains one row per bee per day that she was alive for the focal period.

    Columns:

    bee_id: Numerical unique identifier for each individual.

    date: Date in year-month-day format.

    age: Age in days. Can be NaN if the bee has no associated death_date.

    network_age, network_age_1, network_age_2: The first three dimensions of network age.

    dance_floor, honey_storage, near_exit, brood_area_total: Normalized (sum to 1). Can be NaN if a bee had no high confidence detections (>0.9) for a given day. Can be 0 if a bee was only seen outside of the annotated areas.

    location_descriptor_count: The number of minutes the bee was seen in one of the labeled locations during that day. For example, dance_floor * location_descriptor_count gives the number of minutes the bee was seen on the dance floor on the given day.

    death_date: Date the bee was last seen in the colony in year-month-day format. Can be NaN for individuals that did not die until the end of the recording period.

    circadian_rhythm: R² value of a sine with a period of one day fitted to the velocity data of the individual over three days. Can be NaN if the fit did not converge due to a lack of data points.

    velocity_peak_time: Phase of the circadian sine fit in hours as an offset to 12:00 UTC. Can be NaN if circadian_rhythm is NaN.

    velocity_day, velocity_night: Mean velocity of the individual between 09:00-18:00 UTC and 21:00-06:00 UTC, respectively. Can be NaN if no velocity data was available for that interval.

    days_left: Difference in days between date and death_date. Can be NaN if death_date is NaN.
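    As a rough illustration of how the columns above relate, the following sketch converts the normalized location fractions into minutes and derives days_left from date and death_date. The example row is invented, None stands in for NaN, and the sign convention for days_left (death_date minus date) is an assumption.

```python
from datetime import date
from typing import Optional

LOCATION_COLUMNS = ("dance_floor", "honey_storage", "near_exit", "brood_area_total")


def minutes_per_location(row: dict) -> dict:
    """Convert normalized location fractions into minutes seen per location."""
    count = row["location_descriptor_count"]
    return {name: row[name] * count for name in LOCATION_COLUMNS}


def days_left(current: date, death: Optional[date]) -> Optional[int]:
    """Days between date and death_date; None stands in for NaN.

    The sign convention (death_date minus date) is an assumption.
    """
    if death is None:
        return None
    return (death - current).days


# Hypothetical row; the values are invented for illustration only.
example_row = {
    "dance_floor": 0.25,
    "honey_storage": 0.5,
    "near_exit": 0.0,
    "brood_area_total": 0.25,
    "location_descriptor_count": 120,
}
minutes = minutes_per_location(example_row)
```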

    location_data.csv

    This file contains subsampled position information for all bees during the focal period. The data contains one row for every individual for every minute of the recording if that individual was seen at least once during that minute with a tag confidence of at least 0.9. The first matching detection for each individual is used.

    Columns:

    In addition to the bee_id and date columns as in the bee_daily_data.csv, the file contains these additional columns:

    cam_id, cams: The cam_id is a numerical identifier from {0, 1, 2, 3}. Each side of the hive is filmed by two cameras, where {0, 1} and {2, 3} each record one side. The cams column contains either “(0, 1)” or “(2, 3)” and indicates which side of the hive the detection belongs to.

    x_pos_hive, y_pos_hive: The spatial positions in millimeters on the hive. The two cameras from one side share a common coordinate system.

    location: The label that was assigned to the comb at (x_pos_hive, y_pos_hive) on the given date. The label “other” indicates detections that were outside of any annotated region. The label “not_comb” indicates the wooden frame or empty space around the comb.

    timestamp, date: The timestamp indicates the beginning of each one-minute sampling interval and is given in UTC, as indicated (example: “2016-08-13 00:00:00+00:00”). The date part of the timestamp is repeated in the “date” column. Both are given in year-month-day format.
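    The timestamp and cams columns are stored as text, so they usually need parsing before use. A minimal sketch using only the Python standard library, assuming the cams values are written with plain parentheses exactly as in the example strings above; the helper names are illustrative.

```python
import ast
from datetime import datetime


def parse_timestamp(text: str) -> datetime:
    """Parse a timestamp such as '2016-08-13 00:00:00+00:00' (timezone-aware)."""
    return datetime.fromisoformat(text)


def parse_cams(text: str) -> tuple:
    """Parse the cams column, e.g. '(0, 1)', into a tuple of camera ids."""
    return ast.literal_eval(text)


def side_for_cam(cam_id: int) -> tuple:
    """Return the camera pair that films the same hive side as cam_id."""
    if cam_id in (0, 1):
        return (0, 1)
    if cam_id in (2, 3):
        return (2, 3)
    raise ValueError(f"unknown cam_id: {cam_id}")


ts = parse_timestamp("2016-08-13 00:00:00+00:00")
```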

    Software used to acquire and analyze the data:

    bb_network_decomposition: Network age calculation and regression analyses

    bb_pipeline: Tag localization and decoding pipeline

    bb_pipeline_models: Pretrained localizer and decoder models for bb_pipeline

    bb_binary: Raw detection data storage format

    bb_irflash: IR flash system schematics and arduino code

    bb_imgacquisition: Recording and network storage

    bb_behavior: Database interaction and data (pre)processing, velocity calculation

    bb_circadian: Circadian rhythm calculations

    bb_tracking: Tracking of bee detections over time

    bb_wdd: Automatic detection and decoding of honey bee waggle dances

    bb_interval_determination: Homography calculation

    bb_stitcher: Image stitching

  10. Mortality rates, by age group

    • www150.statcan.gc.ca
    • open.canada.ca
    Updated Dec 4, 2024
    Cite
    Government of Canada, Statistics Canada (2024). Mortality rates, by age group [Dataset]. http://doi.org/10.25318/1310071001-eng
    Explore at:
    Dataset updated
    Dec 4, 2024
    Dataset provided by
    Government of Canada (http://www.gg.ca/)
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number of deaths and mortality rates, by age group, sex, and place of residence, 1991 to most recent year.

  11. Data for: World's human migration patterns in 2000-2019 unveiled by...

    • data.niaid.nih.gov
    Updated Jul 11, 2024
    Cite
    Niva, Venla; Horton, Alexander; Virkki, Vili; Heino, Matias; Kallio, Marko; Kinnunen, Pekka; Abel, Guy J; Muttarak, Raya; Taka, Maija; Varis, Olli; Kummu, Matti (2024). Data for: World's human migration patterns in 2000-2019 unveiled by high-resolution data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7997133
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Wittgenstein Centre for Demography and Global Human Capital (http://www.oeaw.ac.at/wic/)
    Aalto University
    Authors
    Niva, Venla; Horton, Alexander; Virkki, Vili; Heino, Matias; Kallio, Marko; Kinnunen, Pekka; Abel, Guy J; Muttarak, Raya; Taka, Maija; Varis, Olli; Kummu, Matti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description

    This dataset provides a detailed global gridded (5 arc-min resolution) annual net-migration dataset for 2000-2019. We also provide the global annual birth and death rate datasets that were used to estimate the net-migration for the same years. The dataset is presented in detail, with some further analyses, in the following publication. Please cite this paper when using the data.

    Niva et al. 2023. World's human migration patterns in 2000-2019 unveiled by high-resolution data. Nature Human Behaviour 7: 2023–2037. Doi: https://doi.org/10.1038/s41562-023-01689-4

    You can explore the data in our online net-migration explorer: https://wdrg.aalto.fi/global-net-migration-explorer/

    Short introduction to the data

    For the dataset, we collected, gap-filled, and harmonised:

    comprehensive national-level birth and death rate datasets for altogether 216 countries or sovereign states; and

    sub-national data for births (data covering 163 countries, divided altogether into 2555 admin units) and deaths (123 countries, 2067 admin units).

    These birth and death rates were downscaled with selected socio-economic indicators to a 5 arc-min grid for each year 2000-2019. These allowed us to calculate the 'natural' population change, and by comparing this with the reported changes in population, we were able to estimate the annual net-migration. See Niva et al. (2023) for more about the methods and calculations.

    Because the gridded annual data contain some noise, we recommend using the data either over multiple years (we provide 3, 5 and 20 year net-migration sums at gridded level) or aggregated over a larger area (we provide adm0, adm1 and adm2 level geospatial polygon files).

    Due to copyright issues we are not able to release all the original data collected, but those can be requested from the authors.
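    The residual logic described above can be illustrated for a single grid cell and year. This is a simplified sketch, not the procedure of Niva et al. (2023): it assumes rates in births/deaths per 1000 people per year applied to the start-of-year population, and all names and numbers are invented for illustration.

```python
def natural_change(population: float, birth_rate: float, death_rate: float) -> float:
    """Natural population change in persons for one year.

    Rates are births/deaths per 1000 people per year.
    """
    return population * (birth_rate - death_rate) / 1000.0


def net_migration(pop_start: float, pop_end: float,
                  birth_rate: float, death_rate: float) -> float:
    """Estimate net migration as reported change minus natural change."""
    reported_change = pop_end - pop_start
    return reported_change - natural_change(pop_start, birth_rate, death_rate)


# Invented example: 10,000 people grow to 10,150 with 20 births and
# 10 deaths per 1000 people, so natural change explains 100 of the 150.
estimate = net_migration(10_000, 10_150, birth_rate=20.0, death_rate=10.0)
```

    A positive value indicates net in-migration and a negative value net out-migration under this convention.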

    List of datasets

    Birth and death rates:

    raster_birth_rate_2000_2019.tif: Gridded birth rate for 2000-2019 (5 arc-min; multiband tif)

    raster_death_rate_2000_2019.tif: Gridded death rate for 2000-2019 (5 arc-min; multiband tif)

    tabulated_adm1adm0_birth_rate.csv: Tabulated sub-national birth rate for 2000-2019 at the division to which data was collected (subnational data when available, otherwise national)

    tabulated_adm1adm0_death_rate.csv: Tabulated sub-national death rate for 2000-2019 at the division to which data was collected (subnational data when available, otherwise national)

    Net-migration:

    raster_netMgr_2000_2019_annual.tif: Gridded annual net-migration 2000-2019 (5 arc-min; multiband tif)

    raster_netMgr_2000_2019_3yrSum.tif: Gridded 3-yr sum net-migration 2000-2019 (5 arc-min; multiband tif)

    raster_netMgr_2000_2019_5yrSum.tif: Gridded 5-yr sum net-migration 2000-2019 (5 arc-min; multiband tif)

    raster_netMgr_2000_2019_20yrSum.tif: Gridded 20-yr sum net-migration 2000-2019 (5 arc-min)

    polyg_adm0_dataNetMgr.gpkg: National (adm 0 level) net-migration geospatial file (gpkg)

    polyg_adm1_dataNetMgr.gpkg: Provincial (adm 1 level) net-migration geospatial file (gpkg) (if not adm 1 level division, adm 0 used)

    polyg_adm2_dataNetMgr.gpkg: Communal (adm 2 level) net-migration geospatial file (gpkg) (if not adm 2 level division, adm 1 used; and if not adm 1 level division either, adm 0 used)

    Files to run online net migration explorer

    masterData.rds and admGeoms.rds are related to our online ‘Net-migration explorer’ tool (https://wdrg.aalto.fi/global-net-migration-explorer/). The source code of this application is available at https://github.com/vvirkki/net-migration-explorer. Running the application locally requires these two .rds files from this repository.

    Metadata

    Grids:

    Resolution: 5 arc-min (0.083333333 degrees)

    Spatial extent: Lon: -180, 180; Lat: -90, 90 (xmin, xmax, ymin, ymax)

    Coordinate ref system: EPSG:4326 - WGS 84

    Format: Multiband geotiff; each band for each year over 2000-2019

    Units:

    Birth and death rates: births/deaths per 1000 people per year

    Net-migration: persons per 1000 people per time period (year, 3yr, 5yr, 20yr, depending on the dataset)

    Geospatial polygon (gpkg) files:

    Spatial extent: -180, 180; -90, 83.67 (xmin, xmax, ymin, ymax)

    Temporal extent: annual over 2000-2019

    Coordinate ref system: EPSG:4326 - WGS 84

    Format: gpkg

    Units:

    Net-migration: persons per 1000 people per year
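    The grid metadata above implies the raster dimensions and a simple mapping from year to band. The sketch below works these out in plain Python; it assumes the bands are stored in chronological order and that row 0 lies at the northern edge, neither of which is stated in the description. The function names are illustrative.

```python
RESOLUTION_ARC_MIN = 5
CELLS_PER_DEGREE = 60 // RESOLUTION_ARC_MIN  # 12 cells per degree
FIRST_YEAR, LAST_YEAR = 2000, 2019


def grid_shape() -> tuple:
    """Rows and columns of a global 5 arc-min grid (lat -90..90, lon -180..180)."""
    return 180 * CELLS_PER_DEGREE, 360 * CELLS_PER_DEGREE


def band_for_year(year: int) -> int:
    """1-based band index for a year, assuming chronological band order."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    return year - FIRST_YEAR + 1


def cell_index(lon: float, lat: float) -> tuple:
    """Row and column of the cell containing (lon, lat).

    Assumes row 0 is at the northern edge; points on the far edges are
    clamped into the last row or column.
    """
    rows, cols = grid_shape()
    row = min(int((90.0 - lat) * CELLS_PER_DEGREE), rows - 1)
    col = min(int((lon + 180.0) * CELLS_PER_DEGREE), cols - 1)
    return row, col
```

    With these assumptions the global grid has 2160 rows and 4320 columns, and the year 2019 maps to band 20.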

  12. Data from: Habitat and density effects on the demography of an expanding...

    • dataone.org
    • nde-dev.biothings.io
    • +2more
    Updated Aug 5, 2025
    Cite
    Aimara Planillo; Ilka Reinhardt; Gesa Kluth; Sebastian Collet; Gregor Rolshausen; Carsten Nowak; Katharina Steyer; Götz Ellwanger; Stephanie Kramer-Schadt (2025). Habitat and density effects on the demography of an expanding wolf population in Central Europe [Dataset]. http://doi.org/10.5061/dryad.dncjsxm5m
    Explore at:
    Dataset updated
    Aug 5, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Aimara Planillo; Ilka Reinhardt; Gesa Kluth; Sebastian Collet; Gregor Rolshausen; Carsten Nowak; Katharina Steyer; Götz Ellwanger; Stephanie Kramer-Schadt
    Time period covered
    Jan 1, 2023
    Area covered
    Europe, Central Europe
    Description

    Demographic parameters are key to understanding population dynamics. Here, we analyse the survival and reproduction of the German wolf population in the 20 years following recolonization. Specifically, we analysed the effects of environmental, ecological, and individual characteristics on i) the survival probability of the population; ii) annual survival rates of age classes; iii) reproduction probability; and iv) reproductive output, measured as the number of detected pups/juveniles. Using the Cox proportional hazards model, we estimated a median survival time of circa 3 years for wolves. Annual survival probabilities were found to be 0.75 for juveniles, 0.75 for subadults, and 0.88 for adults. Survival was lower for juveniles in winter and for subadult males in summer, probably associated with dispersal events. Low habitat suitability was clearly associated with lower survival in juveniles and subadults, but not in adults. Local territory density was related to increased survival. Rep..., Wolf individual and territory data for survival and reproduction analyses were provided by the Federal Documentation and Consultation Centre on Wolves (DBBW, www.dbb-wolf.de) and by the Senckenberg Centre for Wildlife Genetics. Information about individuals and territories was grouped into monitoring years (from the 1st of May to the 30th of April next year), starting in 2000 until 2020 (April 2021). Individuals were identified genetically and for the survival analysis, the original dataset was filtered to retain only reliable information on the lifespan of the animals. Thus, individuals with NA ("not available") in the variables 'sex' or 'date of birth' as well as individuals born or died outside the German border were removed, as the environmental data included in the demographic analyses were only available for Germany. Consequently, the status of the individuals (dead, alive) was assessed until April 2021. 
The age classes were defined as juveniles including pups (0-12 months), suba...

    Habitat and density effects on the demography of an expanding wolf population in Central Europe

    https://doi.org/10.5061/dryad.dncjsxm5m

    Tables containing the data used in the demographic analyses of the German wolf population.

    Description of the data and file structure

    Two tables are provided:

    • data_wolf_survival_table.csv. This table contains individual information used in the survival analyses: information on the number of weeks an individual was alive (weeks_date) until confirmed death (status = 1) or went missing (censored, status = 0), sex, the month of the year that the individual was last detected (death_month), the season of the last detection, habitat suitability of the natal (hs_8km_natal) and final territories (hs_8km_final), the information to calculate the territory density in the 50 km buffer in the natal (_first) and final (_last) territories (nterr_buffer50 = the number of territories in buffer...
  13. An Extensive Dataset for the Heart Disease Classification System

    • data.mendeley.com
    Updated Feb 15, 2022
    + more versions
    Cite
    Sozan S. Maghdid (2022). An Extensive Dataset for the Heart Disease Classification System [Dataset]. http://doi.org/10.17632/65gxgy2nmg.1
    Explore at:
    Dataset updated
    Feb 15, 2022
    Authors
    Sozan S. Maghdid
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Finding a good data source is the first step toward creating a database. Cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs include coronary heart disease, cerebrovascular disease, rheumatic heart disease, and other heart and blood vessel problems. According to the World Health Organization, 17.9 million people die from CVDs each year. Heart attacks and strokes account for more than four out of every five CVD deaths, with one-third of these deaths occurring before the age of 70.

    A comprehensive database of factors that contribute to a heart attack has been constructed. The main purpose is to collect characteristics of heart attacks or factors that contribute to them. A form was created in Microsoft Excel for this purpose. Figure 1 depicts the form, which has nine fields: eight input fields and one output field. Age, gender, heart rate, systolic BP, diastolic BP, blood sugar, CK-MB, and Test-Troponin are the input fields, while the output field records the presence of a heart attack in two categories (negative and positive): negative refers to the absence of a heart attack, and positive to its presence. Table 1 shows detailed information, including the maximum and minimum attribute values, for the 1319 cases in the whole database.

    To confirm the validity of the data, we compared the patient files in the hospital archive with the data stored in the laboratory system. We also interviewed the patients and specialist doctors. Table 2 is a sample for 1320 cases, which shows 44 cases and the factors that lead to a heart attack in the whole database. After collecting the data, we checked it for null (invalid) values and for errors made during data collection. A value is null if it is unknown, and null values require special treatment. A null indicates that the target is not a valid data element. When trying to retrieve data that is not present, you may come across the keyword null during processing. Arithmetic operations on a numeric column with one or more null values yield null. An example of null value processing is shown in Figure 2.

    The data used in this investigation were scaled between 0 and 1 so that all inputs and outputs received equal attention and to eliminate their dimensionality. Normalizing data before applying AI models has two major advantages. The first is to prevent attributes in larger numeric ranges from overshadowing attributes in smaller numeric ranges. The second is to avoid numerical problems during processing. After normalization, we split the data set into training and test sets, using 1060 cases for training and 259 for testing. Modeling was implemented using the input and output variables.
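    Scaling to the range 0 to 1 and the 1060/259 split can be sketched as follows. This assumes plain min-max scaling and an ordered split, since the description states neither the exact scaling formula nor how cases were assigned to the two sets; the sample values are invented, and null values are assumed to have been handled beforehand.

```python
def min_max_scale(values: list) -> list:
    """Scale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_train_test(rows: list, n_train: int) -> tuple:
    """Split rows into the first n_train rows and the remainder."""
    return rows[:n_train], rows[n_train:]


# Invented heart-rate values, for illustration only.
heart_rates = [60, 75, 90, 120]
scaled = min_max_scale(heart_rates)

# Row indices standing in for the 1319 cases.
train, test = split_train_test(list(range(1319)), 1060)
```

    In practice the rows would normally be shuffled before splitting, and the scaling limits would be taken from the training set only to avoid leaking information from the test set.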

  14. Rufiji HDSS INDEPTH Core Dataset 1999 - 2014 (Release 2017) - Tanzania

    • catalog.ihsn.org
    Updated Sep 19, 2018
    Cite
    Mr. Sigilbert Mrema (2018). Rufiji HDSS INDEPTH Core Dataset 1999 - 2014 (Release 2017) - Tanzania [Dataset]. https://catalog.ihsn.org/catalog/study/TZA_1999-2014_INDEPTH-RHDSS_v01_M
    Explore at:
    Dataset updated
    Sep 19, 2018
    Dataset provided by
    Dr. Eveline Gaubbel
    Mr. Sigilbert Mrema
    Time period covered
    1999 - 2014
    Area covered
    Tanzania
    Description

    Abstract

    The Rufiji Health and Demographic Surveillance System (HDSS) was established in October 1998 to evaluate the impact on burden of disease of health system reforms based on locally generated data, prioritization, resource allocation and planning for essential health interventions. The Rufiji HDSS collects detailed information on health and survival and provides a framework for population based health research of relevance to local and national health priorities. In December 2011 the population under surveillance was about 97,000 people, residing in 19,000 households. Monitoring of households and members within households is undertaken in regular four months cycles known as “rounds”. Self-reported information is collected on demographic, household, socio-economic and geographic characteristics. Verbal Autopsy is conducted using standardized questionnaires, to determine probable causes of death.

    Basic Rufiji HDSS data and data requisition forms are available at www.data.ihi.or.tz. Access to online data requires permission from the Data Centralization Team (dc@ihi.or.tz). Requests for data are subject to ethics committee approval and will only be accepted from bona fide researchers with specified research objectives or collaborations.

    Geographic coverage

    The Demographic Surveillance Area (DSA) is located in Rufiji District, Coastal Region, Tanzania, about 178 km south of Dar es Salaam city, and extends between latitudes 7.47° and 8.03°S and longitudes 38.62° and 39.17°E. The Rufiji HDSS covers 1,813 km² comprising 38 villages of Rufiji district. The district is largely rural, though the population is clustered around Utete (outside the DSA, District headquarters), Ikwiriri, Kibiti and Bungu townships (Figure 2). The population density of Rufiji HDSS is about 53 people per square kilometer and the average population per village is about 2,552 inhabitants. As a district observatory, information collected by Rufiji HDSS can be used by nearby districts with similar ecological characteristics for planning purposes.

    Analysis unit

    Individual

    Universe

    All residents in HDSS area

    Kind of data

    Event history data

    Frequency of data collection

    Three times per year from 1998 to 2012. Two times per year from 2013 onwards.

    Sampling procedure

    No sampling.

    Sampling deviation

    Not applicable

    Mode of data collection

    Proxy Respondent [proxy]

    Research instrument

    The questionnaires are designed to capture the core HDSS information which includes the baseline, birth, inmigration, outmigration and death along with the other questionnaires.

    Cleaning operations

    Data processing: Data from the field are collected for entry at the data center. Data is entered into the server through a network of workstations. Until 2014, the HDSS used custom designed software called the Household Registration System 2 (HRS 2) developed in Visual FoxPro 6.

    The following processing checks are done during the ETL process.

    1. Whether the first event is legal: the first event must be enumeration, birth or inmigration.
    2. Whether the last event is legal: the last event must be end of observation, death or outmigration.
    3. Whether the transition events are legal. The list of legal transitions:

      Birth followed by death; Birth followed by exit; Birth followed by end of observation; Birth followed by outmigration

      Death followed by none

      Entry followed by death; Entry followed by exit; Entry followed by end of observation; Entry followed by outmigration; Enumeration followed by death; Enumeration followed by exit; Enumeration followed by outmigration

      Exit followed by entry

      Inmigration followed by death; Inmigration followed by exit; Inmigration followed by end of observation; Inmigration followed by outmigration

      End of observation followed by none

      Outmigration followed by none; Outmigration followed by enumeration; Outmigration followed by inmigration

      The list of illegal transitions:

      Birth followed by none; Birth followed by birth; Birth followed by entry; Birth followed by enumeration; Birth followed by inmigration

      Death followed by birth; Death followed by death; Death followed by entry; Death followed by enumeration; Death followed by exit; Death followed by inmigration; Death followed by outmigration; Death followed by end of observation

      Entry followed by none; Entry followed by birth; Entry followed by entry; Entry followed by enumeration; Entry followed by inmigration

      Enumeration followed by none; Enumeration followed by birth; Enumeration followed by entry; Enumeration followed by enumeration; Enumeration followed by inmigration

      Exit followed by birth; Exit followed by death; Exit followed by exit; Exit followed by end of observation; Exit followed by outmigration

      Inmigration followed by none; Inmigration followed by birth; Inmigration followed by entry; Inmigration followed by enumeration; Inmigration followed by inmigration

      End of observation followed by birth; End of observation followed by death; End of observation followed by entry; End of observation followed by enumeration; End of observation followed by exit; End of observation followed by inmigration; End of observation followed by end of observation; End of observation followed by outmigration

      Outmigration followed by birth; Outmigration followed by death; Outmigration followed by exit; Outmigration followed by end of observation; Outmigration followed by outmigration

      List of edited events:

      Exit followed by none; Exit followed by enumeration; Exit followed by inmigration

      Outmigration followed by entry
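    The three checks above can be expressed as a small lookup-based validator. The following is a sketch, not the HDSS ETL code: the tables are transcribed from the legal and edited lists as given in this description, and any pair not found in either table is reported as "not legal". The function names are illustrative.

```python
# Legal transitions, transcribed from the list in the description.
LEGAL_TRANSITIONS = {
    "birth": {"death", "exit", "end of observation", "outmigration"},
    "death": {"none"},
    "entry": {"death", "exit", "end of observation", "outmigration"},
    "enumeration": {"death", "exit", "outmigration"},
    "exit": {"entry"},
    "inmigration": {"death", "exit", "end of observation", "outmigration"},
    "end of observation": {"none"},
    "outmigration": {"none", "enumeration", "inmigration"},
}

# Transitions listed under "edited events" in the description.
EDITED_TRANSITIONS = {
    "exit": {"none", "enumeration", "inmigration"},
    "outmigration": {"entry"},
}

LEGAL_FIRST_EVENTS = {"enumeration", "birth", "inmigration"}
LEGAL_LAST_EVENTS = {"end of observation", "death", "outmigration"}


def classify_transition(event: str, next_event: str = "none") -> str:
    """Return 'legal', 'edited', or 'not legal' for one pair of events."""
    if next_event in LEGAL_TRANSITIONS.get(event, set()):
        return "legal"
    if next_event in EDITED_TRANSITIONS.get(event, set()):
        return "edited"
    return "not legal"


def check_history(events: list) -> list:
    """Run the three checks on one individual's event history.

    Returns a list of problem descriptions; an empty list means all passed.
    """
    if not events:
        return ["empty history"]
    problems = []
    if events[0] not in LEGAL_FIRST_EVENTS:
        problems.append(f"illegal first event: {events[0]}")
    if events[-1] not in LEGAL_LAST_EVENTS:
        problems.append(f"illegal last event: {events[-1]}")
    # Pair each event with its successor; the last event is followed by "none".
    for current, following in zip(events, events[1:] + ["none"]):
        status = classify_transition(current, following)
        if status != "legal":
            problems.append(f"{status} transition: {current} followed by {following}")
    return problems
```

    For example, check_history(["birth", "outmigration", "inmigration", "death"]) returns an empty list, while a history that starts with "death" is flagged for its first event.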

    Response rate

    Response rate: In Rufiji HDSS, information on refusals is not collected. However, active community engagement programmes are in place. These include Key Informant (KI) days, where the HDSS team convenes meetings with KIs to present recent findings as feedback to the community, and the distribution of newsletters to households. Community sensitization events are held when new studies are introduced. These initiatives have cemented a good relationship with the community and maintained high participation.

    Sampling error estimates

    Not applicable

    Data appraisal

    INDEPTH data quality metrics: To ensure data quality, Rufiji HDSS divides this process into two sections. In the field section, the quality of collected information is monitored through a validation process in which 3-5% of households are sampled at random for re-interview by a field supervisor, who validates the previously collected data. In the data management section, the data management software is able to identify some inconsistencies. Some of them are edited online and others are printed and reported back to the field for clarification and correction. Finally, the clean database is archived for report generation.

    CentreId MetricTable QMetric Illegal Legal Total Metric RunDate
    TZ012 MicroDataCleaned Starts 205383 2017-05-19 10:31
    TZ012 MicroDataCleaned Transitions 0 514426 514426 0 2017-05-19 10:31
    TZ012 MicroDataCleaned Ends 205383 2017-05-19 10:32
    TZ012 MicroDataCleaned SexValues 4 514422 514426 0 2017-05-19 10:32
    TZ012 MicroDataCleaned DoBValues 2 514424 514426 0 2017-05-19 10:32

  15. US Mass Shootings

    • kaggle.com
    zip
    Updated Mar 15, 2023
    + more versions
    Cite
    Rana Sagheer Khan (2023). US Mass Shootings [Dataset]. https://www.kaggle.com/datasets/ranasagheerkhan/us-mass-shootings/discussion
    Explore at:
    zip(317763 bytes)Available download formats
    Dataset updated
    Mar 15, 2023
    Authors
    Rana Sagheer Khan
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    United States
    Description

    Context

    Mass Shootings in the United States of America (1966-2017)

    The US has witnessed 398 mass shootings in the last 50 years, resulting in 1,996 deaths and 2,488 people injured. The latest and worst mass shooting, on October 2, 2017, has killed 58 and injured 515 so far. The number of people injured in this attack is greater than the number injured in all mass shootings of 2015 and 2016 combined.

    Over the last 50 years, the average is 7 mass shootings per year, claiming 39 lives and injuring 48 people per year.

    Content

    Geography: United States of America

    Time period: 1966-2017

    Unit of analysis: Mass Shooting Attack

    Dataset: The dataset contains detailed information of 398 mass shootings in the United States of America that killed 1996 and injured 2488 people.

    Variables: The dataset contains Serial No, Title, Location, Date, Summary, Fatalities, Injured, Total Victims, Mental Health Issue, Race, Gender, and Lat-Long information.

    Acknowledgements

    I’ve consulted several public datasets and web pages to compile this data.

    Some of the major data sources include Wikipedia, Mother Jones, Stanford, USA Today and other web sources.

    Inspiration

    With a broken heart, I would like to call on my fellow Kagglers to use Machine Learning and Data Science to help me explore these ideas:

    • How many people got killed and injured per year?

    • Visualize mass shootings on the U.S map

    • Is there any correlation between shooter and his/her race, gender

    • Any correlation with calendar dates? Do we have more deadly days, weeks or months on average

    • What cities and states are more prone to such attacks

    • Can you find and combine any other external datasets to enrich the analysis, for example, gun ownership by state

    • Any other pattern you see that can help in prediction, crowd safety or in-depth analysis of the event

    • How many shooters have some kind of mental health problem? Can we compare that shooter with general population with same condition

    Mass Shootings Dataset Ver 3

    This is the new Version of Mass Shootings Dataset. I've added eight new variables:

    Incident Area (where the incident took place); Open/Close Location (inside a building or open space); Target (possible target audience or company); Cause (Terrorism, Hate Crime, Fun (for no obvious reason), etc.); Policeman Killed (how many on-duty officers got killed); Age (age of the shooter); Employed (Y/N); Employed at (Employer Name). Age, Employed and Employed at (3 variables) contain shooter details.

    Mass Shootings Dataset Ver 4

    Quite a few missing values have been added

    Mass Shootings Dataset Ver 5

    Three more recent mass shootings have been added including the Texas Church shooting of November 5, 2017

    I hope it will help create more visualization and extract patterns.

    Keep Coding!

  16. Dabat Health and Demographic Surveillance System Core Dataset 2008-2011 -...

    • catalog.ihsn.org
    Updated Mar 29, 2019
    + more versions
    Cite
    Mr. Temesgen Azimeraw (2019). Dabat Health and Demographic Surveillance System Core Dataset 2008-2011 - Ethiopia [Dataset]. https://catalog.ihsn.org/index.php/catalog/5332
    Explore at:
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    Prof. Afework Kassu
    Dr. Shitaye Alemu
    Prof. Yigzaw Kebede
    Prof. Mengesha Admassu
    Mr. Tesfahun Melese
    Dr. Gashaw Andargie
    Dr. Sisay Yifru
    Mr. Tadesse Awoke
    Mr. Temesgen Azimeraw
    Time period covered
    2008 - 2012
    Area covered
    Ethiopia
    Description

    Abstract

    Introduction: Dabat Health and Demographic Surveillance System (HDSS), also called the Dabat Research Center (DRC), was established in Dabat District in 1996 after an initial census. A re-census was conducted in 2008. The surveillance is run by the College of Medicine and Health Sciences, one of the colleges/faculties of the University of Gondar. Dabat district is one of the 21 districts in the North Gondar Administrative Zone of Amhara Region in Ethiopia. According to the report published by the Central Statistical Agency in 2007, the district has an estimated total population of 145,458 living in 27 rural and 3 urban kebeles (sub-districts). The altitude of the district ranges from about 1000 meters to over 2500 meters above sea level. The district population largely depends on subsistence agriculture. There are two health centers, three health stations, and twenty-nine health posts providing health services for the community. An all-weather road runs from Gondar town through Dabat to some towns of Tigray. Dabat town, the capital of Dabat District, is located approximately 821 km northwest of Addis Ababa and 75 km north of Gondar town. The surveillance is funded by the Centers for Disease Control and Prevention (CDC) through the Ethiopian Public Health Association.

    Objectives: Dabat HDSS/Dabat Research Centre was established to generate longitudinal data on health and population at district level and to provide a study base and sampling frame for community-based research.

    Methods: Dabat district was initially selected purposively as a surveillance site for its three distinct climatic conditions, namely Dega (highland and cold), Woina dega (midland and temperate) and Kolla (lowland and hot). The choice was made on the assumption that morbidity and mortality would differ between the climatic areas. Accordingly, seven kebeles from Dega, one kebele from Woina dega, and two kebeles from Kolla were selected randomly after stratification of the kebeles by climatic zone.

    After the re-census, updates have been carried out every 6 months. During each round, data are collected using a semi-structured questionnaire covering births and other pregnancy outcomes, deaths, migration, and marital status changes. Interviews are administered to the head of the household; in the absence of the head, the next eldest family member is interviewed, but only after repeated attempts to reach the head. While the regular update round is every six months, deaths that occur in the surveillance site are reported immediately to the data collectors by the local guides. After the mourning period, usually 45 days, the trained data collectors administer a Verbal Autopsy (VA) questionnaire to a close relative of the deceased to get information on the possible cause(s) of death. Three VA questionnaires are prepared for the age groups 0-28 days, 29 days to 15 years, and greater than 15 years. To assign cause(s) of death, the VA data are given to physicians trained in VA. These physicians independently assign causes of death using the standard International Classification of Diseases (ICD-10).
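    Choosing among the three VA questionnaires depends only on age at death. The sketch below is illustrative rather than the DRC software's logic: the handling of the 15-year boundary (fewer than 15 completed years selects the middle questionnaire) is an assumption, and the function name and returned labels are made up for this example.

```python
from datetime import date


def va_questionnaire(birth: date, death: date) -> str:
    """Pick the VA questionnaire age group from dates of birth and death.

    Boundary handling at 15 years is an assumption: fewer than 15
    completed years selects the '29 days to 15 years' questionnaire.
    """
    if death < birth:
        raise ValueError("death date precedes birth date")
    age_days = (death - birth).days
    if age_days <= 28:
        return "0-28 days"
    # Completed years of age at death.
    before_birthday = (death.month, death.day) < (birth.month, birth.day)
    completed_years = death.year - birth.year - int(before_birthday)
    if completed_years < 15:
        return "29 days to 15 years"
    return "greater than 15 years"
```

    For instance, a death 10 days after birth selects the "0-28 days" questionnaire, and a death at age 40 selects "greater than 15 years".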

    Geographic coverage

    Dabat Health and Demographic Surveillance System (HDSS) includes seven rural kebeles (sub-districts) and three urban kebeles in Dabat district, which is located 75 km north of Gondar town in Ethiopia. The HDSS site contains highland, midland, and a few lowland households.

    Analysis unit

    Individual

    Universe

    All individuals residing in Dabat HDSS site.

    Kind of data

    Event history data

    Frequency of data collection

    Two rounds per year

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    All questionnaires are prepared in Amharic. The surveillance questionnaires cover birth and other pregnancy outcomes, death, and migration.

    Cleaning operations

    Completed questionnaires are checked by field supervisors, the document clerk, and data entry clerks for missing values and other violations. In addition, DRC Software, developed with Microsoft Access and Visual Basic, checks entries against a set of data quality rules during data entry.

    Response rate

    100% response rate

    Sampling error estimates

    Not applicable

    Data appraisal

    CentreId  MetricTable       QMetric      Illegal  Legal   Total   Metric  RunDate
    ET051     MicroDataCleaned  Starts       0        59082   0       0.0     2014-06-27 19:33
    ET051     MicroDataCleaned  Transitions  0        129938  129938  0.0     2014-06-27 19:33
    ET051     MicroDataCleaned  Ends         0        59082   0       0.0     2014-06-27 19:33
  17. Demographic and clinical characteristics of individuals who died...

    • plos.figshare.com
    xls
    Updated Sep 18, 2025
    Cite
    Marc Reiterman; Andrew I. Chin; Heejung Bang (2025). Demographic and clinical characteristics of individuals who died in-hospital/outside of hospital 180 days post-hospital discharge (N = 11,406). [Dataset]. http://doi.org/10.1371/journal.pone.0332203.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 18, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Marc Reiterman; Andrew I. Chin; Heejung Bang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Demographic and clinical characteristics of individuals who died in-hospital/outside of hospital 180 days post-hospital discharge (N = 11,406).

  18. U.S. Mortality and Health Indicators

    • kaggle.com
    zip
    Updated Jan 28, 2023
    Cite
    The Devastator (2023). U.S. Mortality and Health Indicators [Dataset]. https://www.kaggle.com/datasets/thedevastator/u-s-mortality-and-health-indicators/discussion
    Explore at:
    Available download formats: zip (1726637 bytes)
    Dataset updated
    Jan 28, 2023
    Authors
    The Devastator
    Description

    U.S. Mortality and Health Indicators

    Impact of Risk Factors on Population Health Outcomes

    By Data Society [source]

    About this dataset

    This dataset provides county-level mortality and health indicators that are useful for measuring the impact of health policies in the United States. It includes data elements and values from over a dozen categories, including Demographics, Leading Causes of Death, Summary Measures of Health, Measures of Birth and Death, Relative Health Importance, Vulnerable Populations and Environmental Health, Preventive Services Use, Risk Factors, and Access to Care. The dataset also offers Healthy People 2010 Targets and US Percentages or Rates for easy comparison across states. Because it covers every county in each indicator domain, it can give insight into American population health at the local level, including localized trends in disease outbreaks or immunizations.

    How to use the dataset

    This dataset contains various data elements related to the mortality and health of the US population at various levels such as county, state, etc. This dataset is an ideal source of information for researchers and policy makers who are interested in exploring patterns in the mortality and health of US citizens.

    In order to use this dataset effectively, it is important to understand the different indicators included as well as how to interpret these indicators. In this guide we will look at each indicator domain separately so that users can easily identify which relevant data elements they need for their analysis.

    Demographics: The Demographics indicator domain includes data elements related to demographic characteristics such as age composition, gender composition etc. These indicators can be used to explore trends across different parts of the country or identify disparities among populations.

    Leading Causes of Death: The Leading Causes of Death indicator domain contains information on fatalities by cause over a set period of time -- either two years or five years depending on availability -- so that researchers can identify causes that pose major threats to public health overall or in more specific regions such as certain counties. It is important to note that these largely report figures based on death certificates which may not always tell an exact story due to reporting inaccuracies caused by both individual factors and registration biases across counties/states over time.

    Summary Measures of Health: The Summary Measures of Health indicator domain includes measures commonly used for gauging overall population health, such as birth rates and death rates, as well as key quality-of-life considerations such as prevalence rates and physical activity rates. These can be combined with other data sources (such as income information) to analyze population health outcomes from a broader perspective than individual diseases or conditions would allow.

    Measures of Birth and Death: This category adds detail to the summary-level figures mentioned above by providing observations on frequency, timing, type, and so on where available. It also offers insight into related trends, among them out-migration and in-migration, changes in mortality ratios, births outside hospitals, marriage age, and labor force participation, all of which matter when working on complex problems such as improving life expectancy.

    Relative Health Importance & Vulnerable Populations and Environment Capacity: This section covers two closely intertwined fields, socioeconomic status disparities and environmental quality, and shows how they interact across boundaries and neighborhoods to influence risk factors that are not only medical, such as disabilities, insurance coverage, alcohol use and smoking habits, road fatalities, veh
    

    Research Ideas

    • Using the Health Status Indicators as input features, machine learning models can be built to predict county-level mortality rate, which can then be used as an important indicator for health and medical resource allocation.
    • The data can also be used to analyze the social determinants of health in different counties by combining with socioeconomic indicators such as poverty, population density and educational attainment levels.
    • Additionally, the dataset could help assess th...
  19. Cuba Life Expectancy

    • kaggle.com
    zip
    Updated Feb 18, 2021
    Cite
    Asad Zaman (2021). Cuba Life Expectancy [Dataset]. https://www.kaggle.com/asaduzaman/cuba-life-expectancy
    Explore at:
    Available download formats: zip (13911 bytes)
    Dataset updated
    Feb 18, 2021
    Authors
    Asad Zaman
    Area covered
    Cuba
    Description

    Context

    Data set taken from WHO: see Life Tables by Country (CUBA) and Life Expectancy at Birth (CUBA). It gives detailed year-wise information on deaths by age group and on the population left alive by age group, which permits calculation of life expectancies for Cuba. The data were prepared for a lecture on the computation of life expectancies, part of a course on Real Statistics: An Islamic Approach. The linked lecture, Computing Life Expectancies from Mortality Tables, provides further details on how to compute life expectancies from this data.

    Content

    • Rows 3 to 21 provide age-specific death rates for 5-year groups 0-5, 5-10, and so on up to 80-85, and 85+
    • Rows 22 to 40 provide the probability of dying in each of these same age categories
    • Rows 41 to 59 provide the number of people left alive in each of these 5-year age groups
    • Rows 60 to 78 provide the number of people who die in each of these age categories
    • Rows 79 to 97 provide the number of person-years lived by each of these 5-year age cohorts
    • Rows 98 to 116 provide the number of person-years lived ABOVE the given age group
    • Rows 117 to 135 provide life expectancy within each age category
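
    The last three row blocks are related by the standard life-table identities: person-years lived above an age (T_x) is the sum of the person-years lived (L_x) in that age group and all older groups, and life expectancy at that age is T_x divided by the number left alive (l_x). The sketch below illustrates this with made-up numbers for three age groups. It assumes the dataset follows these standard definitions, and the function and variable names are illustrative rather than taken from the dataset.

```python
def life_expectancy(l_x, L_x):
    """Return (T_x, e_x) for a life table.

    l_x: number of people left alive at the start of each age group
    L_x: person-years lived within each age group (last group is open-ended)
    T_x: person-years lived in this and all older age groups
    e_x: remaining life expectancy at the start of each age group
    """
    if len(l_x) != len(L_x):
        raise ValueError("l_x and L_x must have the same length")

    # Accumulate person-years from the oldest age group down to the youngest.
    T_x = []
    running_total = 0.0
    for person_years in reversed(L_x):
        running_total += person_years
        T_x.append(running_total)
    T_x.reverse()

    e_x = [t / alive for t, alive in zip(T_x, l_x)]
    return T_x, e_x


# Toy values for three age groups; these are NOT taken from the Cuba data.
toy_l_x = [100000, 90000, 50000]
toy_L_x = [475000, 350000, 200000]

T_x, e_x = life_expectancy(toy_l_x, toy_L_x)
for alive, t, e in zip(toy_l_x, T_x, e_x):
    print(f"l_x={alive:>7}  T_x={t:>10.0f}  e_x={e:.2f}")
```

    With the real file, the two inputs would be the values in rows 41 to 59 and rows 79 to 97 for a given year.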


  20. Data from: Life Expectancy prediction Dataset

    • kaggle.com
    zip
    Updated Dec 6, 2023
    Cite
    Sujay Kapadnis (2023). Life Expectancy prediction Dataset [Dataset]. https://www.kaggle.com/datasets/sujaykapadnis/life-expectancy-prediction-dataset
    Explore at:
    Available download formats: zip (765628 bytes)
    Dataset updated
    Dec 6, 2023
    Authors
    Sujay Kapadnis
    Description

    Across the world, people are living longer. In 1900, the average life expectancy of a newborn was 32 years. By 2021 this had more than doubled to 71 years. But where, when, how, and why has this dramatic change occurred? To understand it, we can look at data on life expectancy worldwide. The large reduction in child mortality has played an important role in increasing life expectancy. But life expectancy has increased at all ages. Infants, children, adults, and the elderly are all less likely to die than in the past, and death is being delayed. This remarkable shift results from advances in medicine, public health, and living standards. Along with it, many predictions of the ‘limit’ of life expectancy have been broken.

    Data Dictionary

    life_expectancy.csv

    variable        class      description
    Entity          character  Country or region entity
    Code            character  Entity code
    Year            double     Year
    LifeExpectancy  double     Period life expectancy at birth - Sex: all - Age: 0

    life_expectancy_different_ages.csv

    variable          class      description
    Entity            character  Country or region entity
    Code              character  Entity code
    Year              double     Year
    LifeExpectancy0   double     Period life expectancy at birth - Sex: all - Age: 0
    LifeExpectancy10  double     Period life expectancy - Sex: all - Age: 10
    LifeExpectancy25  double     Period life expectancy - Sex: all - Age: 25
    LifeExpectancy45  double     Period life expectancy - Sex: all - Age: 45
    LifeExpectancy65  double     Period life expectancy - Sex: all - Age: 65
    LifeExpectancy80  double     Period life expectancy - Sex: all - Age: 80

    life_expectancy_female_male.csv

    variable              class      description
    Entity                character  Country or region entity
    Code                  character  Entity code
    Year                  double     Year
    LifeExpectancyDiffFM  double     Life expectancy difference (f-m) - Type: period - Sex: both - Age: 0
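
    As a rough illustration of how the data dictionary maps onto code, the sketch below reads rows shaped like life_expectancy_female_male.csv using only the Python standard library. It assumes the file is comma-separated with a header row whose names match the dictionary. The sample rows are invented for the example and are not values from the dataset.

```python
import csv
import io

# Invented sample standing in for life_expectancy_female_male.csv.
# Column names follow the data dictionary; the values are placeholders.
SAMPLE_CSV = """Entity,Code,Year,LifeExpectancyDiffFM
Exampleland,EXL,2000,4.5
Exampleland,EXL,2001,4.7
"""


def read_female_male_gap(handle):
    """Parse rows into dicts, converting the numeric columns."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append(
            {
                "Entity": record["Entity"],
                "Code": record["Code"],
                # Year is documented as a double, so go through float first.
                "Year": int(float(record["Year"])),
                "LifeExpectancyDiffFM": float(record["LifeExpectancyDiffFM"]),
            }
        )
    return rows


rows = read_female_male_gap(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["Entity"], row["Year"], row["LifeExpectancyDiffFM"])
```

    For the real file, an open file handle would replace the in-memory sample, for example `open("life_expectancy_female_male.csv", newline="")`.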

    citation(tidytuesday)
