100+ datasets found
  1. Effect of suicide rates on life expectancy dataset

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Apr 16, 2021
    Cite
    Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. http://doi.org/10.5281/zenodo.4694270
    Explore at:
    Available download formats: csv
    Dataset updated
    Apr 16, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Filip Zoubek
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Effect of suicide rates on life expectancy dataset

    Abstract
    In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, suicide is one of the main causes of death; this experiment therefore asks how much suicide rates affect the statistics of average life expectancy.
    The experiment takes two datasets, one with the number of suicides and one with life expectancy, and combines them into a single dataset. Subsequently, I try to find patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.

    Data

    The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed. The final merged dataset for the experiment has 13 variables, with country and year used as the index:

    • Country
    • Year
    • Suicides number
    • Life expectancy
    • Adult Mortality: probability of dying between 15 and 60 years per 1000 population
    • Infant deaths: number of infant deaths per 1000 population
    • Alcohol: recorded per capita (15+) alcohol consumption
    • Under-five deaths: number of under-five deaths per 1000 population
    • HIV/AIDS: HIV/AIDS deaths per 1 000 live births
    • GDP: Gross Domestic Product per capita
    • Population
    • Income composition of resources: Human Development Index in terms of income composition of resources
    • Schooling: number of years of schooling
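
    The merge-then-regress workflow described above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the toy rows, the helper names `merge_on_index` and `simple_regression`, and the choice of an inner join are all assumptions.

```python
from statistics import mean

# Hypothetical toy rows standing in for the two preprocessed source tables;
# the values are invented for illustration and do not come from the dataset.
suicides = {
    ("A", 2010): 120, ("A", 2011): 150,
    ("B", 2010): 300, ("B", 2011): 280,
}
life_expectancy = {
    ("A", 2010): 79.0, ("A", 2011): 78.5,
    ("B", 2010): 74.0, ("B", 2011): 74.5,
    ("C", 2010): 81.0,  # no suicide record, so the inner join drops it
}

def merge_on_index(left, right):
    """Inner join of two dicts that share a (country, year) key."""
    return {key: (left[key], right[key]) for key in left.keys() & right.keys()}

def simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

merged = merge_on_index(suicides, life_expectancy)
keys = sorted(merged)
xs = [merged[k][0] for k in keys]  # suicides number
ys = [merged[k][1] for k in keys]  # life expectancy
slope, intercept = simple_regression(xs, ys)
print(f"rows={len(merged)} slope={slope:.4f} intercept={intercept:.2f}")
```

    An inner join keeps only country-year pairs present in both sources, which is one plausible reading of "merged"; an outer join with missing values would be an equally valid design.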

    LICENSE

    The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the license Attribution-NonCommercial-ShareAlike 3.0 IGO (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

    [1] https://www.kaggle.com/szamil/who-suicide-statistics

    [2] https://www.kaggle.com/kumarajarshi/life-expectancy-who

  2. My Brother's Keeper Key Statistical Indicators on Boys and Men of Color

    • cloud.csiss.gmu.edu
    • datasets.ai
    • +4more
    csv, microsoft excel
    Updated Mar 7, 2021
    Cite
    United States (2021). My Brother's Keeper Key Statistical Indicators on Boys and Men of Color [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/my-brothers-keeper-key-statistical-indicators-on-boys-and-men-of-color
    Explore at:
    Available download formats: microsoft excel, csv
    Dataset updated
    Mar 7, 2021
    Dataset provided by
    United States
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    My Brother's Keeper (MBK) initiative is an interagency effort to improve measurably the expected educational and life outcomes for and address the persistent opportunity gaps faced by boys and young men of color (including African Americans, Hispanic Americans, and Native Americans). The MBK Task Force coordinates a Federal effort to improve significantly the expected life outcomes for boys and young men of color and their contributions to U.S. prosperity. The MBK Task Force collaborated with the Interagency Forum on Child and Family Statistics and federal statistical agencies to pull together new statistics for key indicators - derived from existing, publicly available datasets - cross tabulated for race and gender for the first time. These statistics are highlighted in the MBK Task Force May 2014 report and are posted on MBK.ed.gov.

  3. Suicides in England and Wales

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Aug 29, 2024
    + more versions
    Cite
    Office for National Statistics (2024). Suicides in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/suicidesintheunitedkingdomreferencetables
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 29, 2024
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Number of suicides and suicide rates, by sex and age, in England and Wales. Information on conclusion type is provided, along with the proportion of suicides by method and the median registration delay.

  4. Table_1_Identifying features of risk periods for suicide attempts using...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated Dec 11, 2023
    Cite
    Rina Dutta; George Gkotsis; Sumithra U. Velupillai; Johnny Downs; Angus Roberts; Robert Stewart; Matthew Hotopf (2023). Table_1_Identifying features of risk periods for suicide attempts using document frequency and language use in electronic health records.DOCX [Dataset]. http://doi.org/10.3389/fpsyt.2023.1217649.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Frontiers
    Authors
    Rina Dutta; George Gkotsis; Sumithra U. Velupillai; Johnny Downs; Angus Roberts; Robert Stewart; Matthew Hotopf
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background
    Individualising mental healthcare at times when a patient is most at risk of suicide involves shifting research emphasis from static risk factors to those that may be modifiable with interventions. Currently, risk assessment is based on a range of extensively reported stable risk factors, but critical to dynamic suicide risk assessment is an understanding of each individual patient’s health trajectory over time. The use of electronic health records (EHRs) and analysis using machine learning has the potential to accelerate progress in developing early warning indicators.

    Setting
    EHR data from the South London and Maudsley NHS Foundation Trust (SLaM) which provides secondary mental healthcare for 1.8 million people living in four South London boroughs.

    Objectives
    To determine whether the time window proximal to a hospitalised suicide attempt can be discriminated from a distal period of lower risk by analysing the documentation and mental health clinical free text data from EHRs and (i) investigate whether the rate at which EHR documents are recorded per patient is associated with a suicide attempt; (ii) compare document-level word usage between documents proximal and distal to a suicide attempt; and (iii) compare n-gram frequency related to third-person pronoun use proximal and distal to a suicide attempt using machine learning.

    Methods
    The Clinical Record Interactive Search (CRIS) system allowed access to de-identified information from the EHRs. CRIS has been linked with Hospital Episode Statistics (HES) data for Admitted Patient Care. We analysed document and event data for patients who had at some point between 1 April 2006 and 31 March 2013 been hospitalised with a HES ICD-10 code related to attempted suicide (X60–X84; Y10–Y34; Y87.0/Y87.2).

    Findings
    n = 8,247 patients were identified to have made a hospitalised suicide attempt. Of these, n = 3,167 (39.8%) of patients had at least one document available in their EHR prior to their first suicide attempt. N = 1,424 (45.0%) of these patients had been “monitored” by mental healthcare services in the past 30 days. From 60 days prior to a first suicide attempt, there was a rapid increase in the monitoring level (document recording of the past 30 days) increasing from 35.1 to 45.0%. Documents containing words related to prescribed medications/drugs/overdose/poisoning/addiction had the highest odds of being a risk indicator used proximal to a suicide attempt (OR 1.88; precision 0.91 and recall 0.93), and documents with words citing a care plan were associated with the lowest risk for a suicide attempt (OR 0.22; precision 1.00 and recall 1.00). Function words, word sequence, and pronouns were most common in all three representations (uni-, bi-, and tri-gram).

    Conclusion
    EHR documentation frequency and language use can be used to distinguish periods distal from and proximal to a suicide attempt. However, in our study 55.0% of patients with documentation, prior to their first suicide attempt, did not have a record in the preceding 30 days, meaning that there are a high number who are not seen by services at their most vulnerable point.
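
    For readers unfamiliar with the precision and recall figures quoted above, the sketch below shows how the two metrics are computed from classification counts. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name is an assumption.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision and recall for a binary classifier, from raw counts.

    precision: share of documents flagged as proximal that really are proximal
    recall:    share of truly proximal documents that were flagged
    """
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# Hypothetical counts (not from the study): 90 proximal documents flagged
# correctly, 10 distal documents flagged by mistake, 30 proximal ones missed.
precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
print(f"precision={precision:.2f} recall={recall:.2f}")
```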

  5. COVID-19 Contact Tracing: Attempted and Successful Interviews by Week -...

    • data.ct.gov
    • catalog.data.gov
    application/rdfxml +5
    Updated Mar 17, 2022
    + more versions
    Cite
    Department of Public Health (2022). COVID-19 Contact Tracing: Attempted and Successful Interviews by Week - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/COVID-19-Contact-Tracing-Attempted-and-Successful-/2yp7-5vqj
    Explore at:
    Available download formats: csv, tsv, application/rdfxml, xml, json, application/rssxml
    Dataset updated
    Mar 17, 2022
    Dataset authored and provided by
    Department of Public Health
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Note: This dataset has been archived and is no longer being updated.

    Contact tracing is the process of contacting all people who have tested positive for COVID-19 or have had contact with someone who tested positive. The software for contact tracing in Connecticut is called ContaCT. ContaCT is used for monitoring the health and wellbeing of people affected by COVID-19 and assists in facilitating timely and accurate contact tracing.

    This dataset includes the number of attempted and successful contact tracing interviews in the ContaCT system by week. This includes interviews for cases (those who have tested positive for COVID-19) and contacts (those who have been exposed to someone with COVID-19).

    Data presented are based on a weekly reporting period (Sunday - Saturday). All data are preliminary and are subject to change.

    Additional information on COVID-19 Contact Tracing can be found here: https://portal.ct.gov/Coronavirus/ContaCT

  6. Study variables.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Nov 24, 2023
    Cite
    Shrestha, Roman; Paudel, Kiran; Poudel, Krishna C.; Bhandari, Prashamsa; Sharma, Sanjay; Gautam, Kamal; Dhakal, Manisha; Wickersham, Jeffrey A.; Ha, Toan (2023). Study variables. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001063631
    Explore at:
    Dataset updated
    Nov 24, 2023
    Authors
    Shrestha, Roman; Paudel, Kiran; Poudel, Krishna C.; Bhandari, Prashamsa; Sharma, Sanjay; Gautam, Kamal; Dhakal, Manisha; Wickersham, Jeffrey A.; Ha, Toan
    Description

    Men who have sex with men (MSM) are at increased risk for suicide, with a much higher prevalence of suicidality than the general population. While there is a growing interest in the identification of risk factors for suicidal behaviors globally, the understanding of the prevalence and risk factors for suicidal behaviors among MSM in the context of low- and middle-income countries is almost non-existent. Therefore, this study aimed to investigate suicidal ideation, plan, and attempts, and related factors among MSM in Nepal. A cross-sectional respondent driven survey was conducted on 250 MSM between October and December 2022. Bivariate and multivariable logistic regression was used to evaluate independent correlates of suicidal behaviors of MSM. Overall, the lifetime prevalence of suicidal ideation, plans, and attempts among MSM in this study were 42.4%, 31.2%, and 21.6%, respectively. MSM with depressive symptoms (aOR = 5.7, 95% CI = 2.4–14.1), advanced education (higher secondary and above; aOR = 2.9, 95% CI = 1.4–6.1), and smoking habit (aOR = 2.5, 95% CI = 1.2–5.3) were at increased risk for suicidal ideation. Similarly, those with depressive symptoms (aOR = 2.2, 95% CI = 1.1–4.8) and advanced education (aOR = 2.7, 95% CI = 1.2–5.7) were more likely to plan suicide, whereas young MSM were significantly more prone to attempting suicide (aOR = 2.7, 95% CI = 1.3–5.8). Interestingly, MSM with moderate to severe food insecurity were 2–3 times more likely to think about, plan, or attempt suicide (ideation: aOR = 3.5, 95% CI = 1.6–7.7; plan: aOR = 3.7, 95% CI = 1.6–8.3; attempt: aOR = 2.2, 95% CI = 1.1–4.6). The results suggest the importance of early assessment of suicidal behaviors among MSM and the need for tailored interventions to simultaneously address mental health problems and food insecurity to reduce suicide-related problems among Nepalese MSM.
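
    The intervals reported above are adjusted odds ratios from a multivariable logistic regression. As a simpler illustration of how an odds ratio and its 95% confidence interval relate, here is a sketch of the unadjusted case using the standard log-scale (Wald) interval. The 2x2 counts and the function name are hypothetical, not taken from the study.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 30 of 50 exposed and 40 of 120 unexposed report the outcome.
odds_ratio, lower, upper = odds_ratio_with_ci(a=30, b=20, c=40, d=80)
print(f"OR={odds_ratio:.2f}, 95% CI={lower:.2f} to {upper:.2f}")
```

    An interval that excludes 1, as in the results quoted above, is what marks an association as statistically significant at the 5% level.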

  7. From boys to men: Precluding the Proclivity to Perpetrate - Dataset - B2FIND...

    • b2find.eudat.eu
    Updated Oct 19, 2023
    Cite
    (2023). From boys to men: Precluding the Proclivity to Perpetrate - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/de910b9c-1686-507c-b50a-181a803f05e7
    Explore at:
    Dataset updated
    Oct 19, 2023
    Description

    Why do some boys grow up to be domestic abuse perpetrators when others do not? How can we change the attitudes and feelings that give rise to abusive tendencies? The research will answer these questions through a mixed methods study that examines:

    • 1200 secondary school children's attitudes to domestic violence. Using attitudinal and self-report measures we will also examine how these attitudes relate to children's direct experiences of violence as victims, witnesses to, and perpetrators of domestic abuse and dating violence.
    • The contingencies through which domestic and dating violence are legitimised. Focus groups will be used to explore in what circumstances teenagers who might otherwise condemn violence are prepared to condone it.
    • Young men's biographical accounts of experiencing, witnessing and/or perpetrating acts of dating or domestic violence.

    Through a detailed analysis of each of these 3 datasets and their interrelations, the research will attempt to explain why some young men come to adopt pro-violence attitudes and others do not; the roles attitudes, emotional well-being, and experiences of both parent-child and intimate partner relationships play in this; and how to intervene more effectively in the lives of young people already manifesting violence proclivities.

    Phase 1 data were collected in the form of a questionnaire survey administered to 1203 young people aged 13-14, before and after a school-based educational intervention was delivered to half of the sample. Those who participated in the intervention also completed the survey at three month follow-up.

    Phase 2 data were collected as a series of 13 focus group discussions, conducted with 69 young people aged 13-19; the groups were selected to reflect potentially distinctive relationships to violence and/or intimacy. Discussion explored young men’s attitudes to domestic abuse by inviting responses to a government anti-violence publicity campaign and a series of hypothetical vignettes.

    Phase 3 data were collected as a series of in-depth life history interviews with 30 young men aged 16-21 who had experienced domestic violence as victims, perpetrators or witnesses.

  8. Financial Statement Data Sets

    • kaggle.com
    Updated Jul 4, 2025
    Cite
    Vadim Vanak (2025). Financial Statement Data Sets [Dataset]. https://www.kaggle.com/datasets/vadimvanak/company-facts-2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vadim Vanak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset offers a detailed collection of US-GAAP financial data extracted from the financial statements of exchange-listed U.S. companies, as submitted to the U.S. Securities and Exchange Commission (SEC) via the EDGAR database. Covering filings from January 2009 onwards, this dataset provides key financial figures reported by companies in accordance with U.S. Generally Accepted Accounting Principles (GAAP).

    Dataset Features:

    • Data Scope: The dataset is restricted to figures reported under US-GAAP standards, with the exception of EntityCommonStockSharesOutstanding and EntityPublicFloat.
    • Currency and Units: The dataset exclusively includes figures reported in USD or shares, ensuring uniformity and comparability. It excludes ratios and non-financial metrics to maintain focus on financial data.
    • Company Selection: The dataset is limited to companies with U.S. exchange tickers, providing a concentrated analysis of publicly traded firms within the United States.
    • Submission Types: The dataset only incorporates data from 10-Q, 10-K, 10-Q/A, and 10-K/A filings, ensuring consistency in the type of financial reports analyzed.

    Data Sources and Extraction:

    This dataset primarily relies on the SEC's Financial Statement Data Sets and EDGAR APIs:

    • SEC Financial Statement Data Sets
    • EDGAR Application Programming Interfaces

    In instances where specific figures were missing from these sources, data was directly extracted from the companies' financial statements to ensure completeness.

    Please note that the dataset presents financial figures exactly as reported by the companies, which may occasionally include errors. A common issue involves incorrect reporting of scaling factors in the XBRL format. XBRL supports two tag attributes related to scaling: 'decimals' and 'scale.' The 'decimals' attribute indicates the number of significant decimal places but does not affect the actual value of the figure, while the 'scale' attribute adjusts the value by a specific factor.

    However, there are several instances, numbering in the thousands, where companies have incorrectly used the 'decimals' attribute (e.g., 'decimals="-6"') under the mistaken assumption that it controls scaling. This is not correct, and as a result, some figures may be inaccurately scaled. This dataset does not attempt to detect or correct such errors; it aims to reflect the data precisely as reported by the companies. A future version of the dataset may be introduced to address and correct these issues.
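
    The difference between the two attributes can be sketched as follows. This is an illustrative model only, assuming 'scale' is a power-of-ten exponent applied to the displayed number and 'decimals' is a precision hint; the function names and the `looks_misscaled` heuristic are hypothetical and are not part of the dataset, which applies no such correction.

```python
from decimal import Decimal

def reported_value(displayed: str, scale: int = 0) -> Decimal:
    """Value implied by the displayed text and the 'scale' power of ten."""
    return Decimal(displayed.replace(",", "")) * (Decimal(10) ** scale)

def precision_unit(decimals: int) -> Decimal:
    """Smallest unit the figure claims accuracy to; it never rescales the value."""
    return Decimal(10) ** -decimals

def looks_misscaled(value: Decimal, decimals: int) -> bool:
    """Hypothetical heuristic: a value claiming accuracy to the nearest 10^n
    (negative 'decimals') should be a whole multiple of that unit."""
    return decimals < 0 and value % precision_unit(decimals) != 0

# Intended tagging: "1,250" displayed in millions, with scale=6 and decimals=-6.
correct = reported_value("1,250", scale=6)
# Mistaken tagging: decimals=-6 but no scale, so the value stays at 1250.
mistaken = reported_value("1,250", scale=0)

print(correct, looks_misscaled(correct, -6))
print(mistaken, looks_misscaled(mistaken, -6))
```

    A check of this kind can only flag candidates for review; it cannot tell which scale the filer actually intended.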

    The source code for data extraction is available here

  9. Food_new Dataset

    • universe.roboflow.com
    zip
    Updated Jul 16, 2024
    Cite
    Allergen30 (2024). Food_new Dataset [Dataset]. https://universe.roboflow.com/allergen30/food_new-uuulf
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 16, 2024
    Dataset authored and provided by
    Allergen30
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Food Bounding Boxes
    Description

    Allergen30

    About Allergen30

    Allergen30 is created by Mayank Mishra, Nikunj Bansal, Tanmay Sarkar and Tanupriya Choudhury with a goal of building a robust detection model that can assist people in avoiding possible allergic reactions.

    It contains more than 6,000 images of 30 commonly used food items which can cause an adverse reaction within a human body. This dataset is one of the first research attempts in training a deep learning based computer vision model to detect the presence of such food items from images. It also serves as a benchmark for evaluating the efficacy of object detection methods in learning the otherwise difficult visual cues related to food items.

    Description of class labels

    There are multiple food items pertaining to specific food intolerances which can trigger an allergic reaction. Such food intolerances primarily include Lactose, Histamine, Gluten, Salicylate, Caffeine and Ovomucoid intolerance.

    [Figure: Food intolerance]

    The following table contains the description relating to the 30 class labels in our dataset.

    S. No. | Allergen | Food label | Description
    1 | Ovomucoid | egg | Images of egg with yolk (e.g. sunny side up eggs)
    2 | Ovomucoid | whole_egg_boiled | Images of soft and hard boiled eggs
    3 | Lactose/Histamine | milk | Images of milk in a glass
    4 | Lactose | icecream | Images of icecream scoops
    5 | Lactose | cheese | Images of swiss cheese
    6 | Lactose/Caffeine | milk_based_beverage | Images of tea/coffee with milk in a cup/glass
    7 | Lactose/Caffeine | chocolate | Images of chocolate bars
    8 | Caffeine | non_milk_based_beverage | Images of soft drinks and tea/coffee without milk in a cup/glass
    9 | Histamine | cooked_meat | Images of cooked meat
    10 | Histamine | raw_meat | Images of raw meat
    11 | Histamine | alcohol | Images of alcohol bottles
    12 | Histamine | alcohol_glass | Images of wine glasses with alcohol
    13 | Histamine | spinach | Images of spinach bundle
    14 | Histamine | avocado | Images of avocado sliced in half
    15 | Histamine | eggplant | Images of eggplant
    16 | Salicylate | blueberry | Images of blueberry
    17 | Salicylate | blackberry | Images of blackberry
    18 | Salicylate | strawberry | Images of strawberry
    19 | Salicylate | pineapple | Images of pineapple
    20 | Salicylate | capsicum | Images of bell pepper
    21 | Salicylate | mushroom | Images of mushrooms
    22 | Salicylate | dates | Images of dates
    23 | Salicylate | almonds | Images of almonds
    24 | Salicylate | pistachios | Images of pistachios
    25 | Salicylate | tomato | Images of tomato and tomato slices
    26 | Gluten | roti | Images of roti
    27 | Gluten | pasta | Images of one serving of penne pasta
    28 | Gluten | bread | Images of bread slices
    29 | Gluten | bread_loaf | Images of bread loaf
    30 | Gluten | pizza | Images of pizza and pizza slices

    Data collection

    We used search engines (Google and Bing) to crawl and look for suitable images using JavaScript queries for each food item from the list created. The images with incomplete RGB channels were removed, and the images collected from different search engines were compiled. When downloading images from search engines, many images were irrelevant to the purpose, especially the ones with a lot of text in them. We deployed the EAST text detector to segregate such images. Finally, a comprehensive manual inspection was conducted to ensure the relevancy of images in the dataset.

    Fair use

    This dataset contains some copyrighted material whose use has not been specifically authorized by the copyright owners. In an effort to advance scientific research, we make this material available for academic research. If you wish to use copyrighted material in our dataset for purposes of your own that go beyond non-commercial research and academic purposes, you must obtain permission directly from the copyright owner. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit to those who have expressed a prior interest in receiving the included information for non-commercial research and educational purposes.(adapted from Christopher Thomas).

    **Citatio

  10. Empathy dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 18, 2024
    Cite
    Mathematical Research Data Initiative (2024). Empathy dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7683906
    Explore at:
    Dataset updated
    Dec 18, 2024
    Dataset authored and provided by
    Mathematical Research Data Initiative
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The database for this study (Briganti et al. 2018; the same for the Braun study analysis) was composed of 1973 French-speaking students in several universities or schools for higher education in the following fields: engineering (31%), medicine (18%), nursing school (16%), economic sciences (15%), physiotherapy (4%), psychology (11%), law school (4%) and dietetics (1%). The subjects were 17 to 25 years old (M = 19.6 years, SD = 1.6 years); 57% were females and 43% were males. Even though the full dataset was composed of 1973 participants, only 1270 answered the full questionnaire: missing data are handled using pairwise complete observations in estimating a Gaussian Graphical Model, meaning that all available information from every subject is used.

    The feature set is composed of 28 items meant to assess the four following components: fantasy, perspective taking, empathic concern and personal distress. In the questionnaire, the items are mixed; reversed items (items 3, 4, 7, 12, 13, 14, 15, 18, 19) are present. Items are scored from 0 to 4, where “0” means “Doesn’t describe me very well” and “4” means “Describes me very well”; reverse-scoring is calculated afterwards. The questionnaires were anonymized. The reanalysis of the database in this retrospective study was approved by the ethical committee of the Erasmus Hospital.
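
    The reverse-scoring step mentioned above can be sketched as follows: on a 0 to 4 scale, a reversed item with raw score r becomes 4 - r. The item numbers come from the description; the function name and the use of None for unanswered items are assumptions made for illustration.

```python
# Reversed items as listed in the description above.
REVERSED_ITEMS = {3, 4, 7, 12, 13, 14, 15, 18, 19}
MAX_SCORE = 4  # items are scored from 0 to 4

def reverse_score(responses):
    """Apply reverse-scoring to one participant's answers.

    responses maps item number (1..28) to a raw score 0..4,
    or to None when the item was left unanswered (an assumption).
    """
    scored = {}
    for item, raw in responses.items():
        if raw is None:
            scored[item] = None             # keep missing answers missing
        elif item in REVERSED_ITEMS:
            scored[item] = MAX_SCORE - raw  # 0 <-> 4, 1 <-> 3, 2 stays 2
        else:
            scored[item] = raw
    return scored

example = reverse_score({1: 4, 3: 4, 7: 0, 12: None, 13: 2})
print(example)
```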

    Size: A dataset of size 1973*28

    Number of features: 28

    Ground truth: No

    Type of Graph: Mixed graph

    The following gives the description of the variables:

    Feature FeatureLabel Domain Item meaning from Davis 1980

    001 1FS Green I daydream and fantasize, with some regularity, about things that might happen to me.

    002 2EC Purple I often have tender, concerned feelings for people less fortunate than me.

    003 3PT_R Yellow I sometimes find it difficult to see things from the “other guy’s” point of view. (Reversed)

    004 4EC_R Purple Sometimes I don’t feel very sorry for other people when they are having problems. (Reversed)

    005 5FS Green I really get involved with the feelings of the characters in a novel.

    006 6PD Red In emergency situations, I feel apprehensive and ill-at-ease.

    007 7FS_R Green I am usually objective when I watch a movie or play, and I don’t often get completely caught up in it. (Reversed)

    008 8PT Yellow I try to look at everybody’s side of a disagreement before I make a decision.

    009 9EC Purple When I see someone being taken advantage of, I feel kind of protective towards them.

    010 10PD Red I sometimes feel helpless when I am in the middle of a very emotional situation.

    011 11PT Yellow I sometimes try to understand my friends better by imagining how things look from their perspective.

    012 12FS_R Green Becoming extremely involved in a good book or movie is somewhat rare for me. (Reversed)

    013 13PD_R Red When I see someone get hurt, I tend to remain calm. (Reversed)

    014 14EC_R Purple Other people’s misfortunes do not usually disturb me a great deal. (Reversed)

    015 15PT_R Yellow If I’m sure I’m right about something, I don’t waste much time listening to other people’s arguments. (Reversed)

    016 16FS Green After seeing a play or movie, I have felt as though I were one of the characters.

    017 17PD Red Being in a tense emotional situation scares me.

    018 18EC_R Purple When I see someone being treated unfairly, I sometimes don’t feel very much pity for them. (Reversed)

    019 19PD_R Red I am usually pretty effective in dealing with emergencies. (Reversed)

    020 20FS Green I am often quite touched by things that I see happen.

    021 21PT Yellow I believe that there are two sides to every question and try to look at them both.

    022 22EC Purple I would describe myself as a pretty soft-hearted person.

    023 23FS Green When I watch a good movie, I can very easily put myself in the place of a leading character.

    024 24PD Red I tend to lose control during emergencies.

    025 25PT Yellow When I’m upset at someone, I usually try to “put myself in his shoes” for a while.

    026 26FS Green When I am reading an interesting story or novel, I imagine how I would feel if the events in the story were happening to me.

    027 27PD Red When I see someone who badly needs help in an emergency, I go to pieces.

    028 28PT Yellow Before criticizing somebody, I try to imagine how I would feel if I were in their place

    More information about the dataset is contained in empathy_description.html file.

  11. Synthetic Financial Datasets For Fraud Detection

    • kaggle.com
    zip
    Updated Apr 3, 2017
    Cite
    Edgar Lopez-Rojas (2017). Synthetic Financial Datasets For Fraud Detection [Dataset]. https://www.kaggle.com/ealaxi/paysim1
    Explore at:
    Available download formats: zip (186385561 bytes)
    Dataset updated
    Apr 3, 2017
    Authors
    Edgar Lopez-Rojas
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Context

    There is a lack of publicly available datasets on financial services, especially in the emerging mobile money transactions domain. Financial datasets are important to many researchers, and in particular to us, as we perform research in the domain of fraud detection. Part of the problem is the intrinsically private nature of financial transactions, which leads to no publicly available datasets.

    We present a synthetic dataset generated using the simulator called PaySim as an approach to such a problem. PaySim uses aggregated data from the private dataset to generate a synthetic dataset that resembles the normal operation of transactions and injects malicious behaviour to later evaluate the performance of fraud detection methods.

    Content

    PaySim simulates mobile money transactions based on a sample of real transactions extracted from one month of financial logs from a mobile money service implemented in an African country. The original logs were provided by a multinational company that provides the mobile financial service, which is currently running in more than 14 countries around the world.

    This synthetic dataset is scaled down to 1/4 of the original dataset and was created just for Kaggle.

    Headers

    This is a sample row, followed by an explanation of each header:

    1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0

    step - maps to a unit of time in the real world. In this case, 1 step is 1 hour. Total steps: 744 (30 days simulation).

    type - CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.

    amount - amount of the transaction in local currency.

    nameOrig - customer who started the transaction

    oldbalanceOrg - initial balance before the transaction

    newbalanceOrig - new balance after the transaction

    nameDest - customer who is the recipient of the transaction

    oldbalanceDest - initial balance of the recipient before the transaction. Note that there is no information for customers whose names start with M (Merchants).

    newbalanceDest - new balance of the recipient after the transaction. Note that there is no information for customers whose names start with M (Merchants).

    isFraud - These are the transactions made by the fraudulent agents inside the simulation. In this specific dataset, the fraudulent agents aim to profit by taking control of customers' accounts and trying to empty the funds by transferring them to another account and then cashing out of the system.

    isFlaggedFraud - The business model aims to control massive transfers from one account to another and flags illegal attempts. An illegal attempt in this dataset is an attempt to transfer more than 200.000 in a single transaction.
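    The header descriptions above can be sketched in code. The following is a minimal Python illustration, not part of the dataset's own tooling: it parses the sample row into named fields and applies two rules stated in the text (names starting with M are merchants; transfers above the threshold count as illegal attempts). The helper names are hypothetical, and reading "200.000" as two hundred thousand units of local currency is an assumption.

```python
import csv
import io

# Column order taken from the header descriptions in the dataset page.
COLUMNS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud", "isFlaggedFraud",
]

# The sample row quoted in the description.
SAMPLE_ROW = "1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0"

# Assumption: "200.000" in the description means 200,000 units of local currency.
FLAG_THRESHOLD = 200_000.0

FLOAT_FIELDS = ("amount", "oldbalanceOrg", "newbalanceOrig",
                "oldbalanceDest", "newbalanceDest")
INT_FIELDS = ("step", "isFraud", "isFlaggedFraud")


def parse_row(line):
    """Turn one CSV line into a dict with typed values (hypothetical helper)."""
    values = next(csv.reader(io.StringIO(line)))
    row = dict(zip(COLUMNS, values))
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


def is_merchant(account_name):
    """Accounts whose names start with 'M' are merchants, per the description."""
    return account_name.startswith("M")


def exceeds_flag_threshold(row):
    """True for a single TRANSFER above the threshold named in the description."""
    return row["type"] == "TRANSFER" and row["amount"] > FLAG_THRESHOLD


row = parse_row(SAMPLE_ROW)
print(row["type"], row["amount"], is_merchant(row["nameDest"]),
      exceeds_flag_threshold(row))
```

    In the sample row the recipient is a merchant, which is consistent with its recipient balances both being 0.0, and the originator's balance drops by exactly the transaction amount.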

    Past Research

    There are 5 similar files that contain the runs of 5 different scenarios. These files are explained in more detail in chapter 7 of my PhD thesis (available here: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-12932).

    We ran PaySim several times using random seeds for 744 steps, representing each hour of one month of real time, which matches the original logs. Each run took around 45 minutes on an Intel i7 processor with 16GB of RAM. The final result of a run contains approximately 24 million financial records divided into the 5 transaction types: CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.

    Acknowledgements

    This work is part of the research project "Scalable resource-efficient systems for big data analytics" funded by the Knowledge Foundation (grant: 20140032) in Sweden.

    Please refer to this dataset using the following citations:

    First paper on the PaySim simulator:

    E. A. Lopez-Rojas, A. Elmir, and S. Axelsson. "PaySim: A financial mobile money simulator for fraud detection". In: The 28th European Modeling and Simulation Symposium-EMSS, Larnaca, Cyprus. 2016.

  12. Suicide death rate by age group

    • data.europa.eu
    • db.nomics.world
    csv, html, tsv, xml
    Updated Nov 28, 2017
    Eurostat (2017). Suicide death rate by age group [Dataset]. https://data.europa.eu/data/datasets/cajrcg2qbzdghfsuwhfw?locale=en
    Explore at:
    Available download formats: html, xml, tsv (3539), csv, xml (9875)
    Dataset updated
    Nov 28, 2017
    Dataset authored and provided by
    Eurostat https://ec.europa.eu/eurostat
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Crude death rate from suicide and intentional self-harm per 100 000 people, by age group. Suicide registration methods vary between countries and over time. Figures do not include deaths from events of undetermined intent (part of which should be considered as suicides) and attempted suicides which did not result in death.

  13. Schweigen Impossible - eine Begegnung von Übersetzern, Dolmetschern und...

    • b2find.eudat.eu
    Updated Dec 15, 2023
    (2023). Schweigen Impossible - eine Begegnung von Übersetzern, Dolmetschern und Besserwissern - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/67abc99c-a418-52a0-a240-2d3c4fc06ace
    Explore at:
    Dataset updated
    Dec 15, 2023
    Description

    Abstract: The play delves into the challenges of communication between individuals with and without disabilities, blurring the lines of who is considered disabled as the narrative unfolds. The irony of incompetent translators and impostors combines humour as well as philosophizing about human interaction and communication.

    Details: The play commences with André, a wheelchair user, making a dramatic entrance onto the stage, uttering incomprehensible words. Five men attempt to engage with him, but their impatience, selective hearing, or assumption that André speaks a different language hinder communication. Eventually, the men and André take their seats at the stage’s edge, bathed in dim light, with a black canvas serving as both curtain and backdrop. Despite the visual separation, their feet (or André’s wheels) remain visible. During the song “Rose Garden,” an audio cassette recorder traverses the stage, while another actor appears to translate the lyrics into sign language. This scene repeats later, with the recorder passing by thrice, even when no music plays, emphasizing the characters' ability to sense the song through movement alone. Likewise, other recorded scenes recur, such as a heated argument between two men, accompanied by yelling and indecipherable sounds. Despite a sign language translator, the dialogue escalates into verbal violence, rendering the interpreter unable to keep up. The exchange ends with one man shouting “ass” before exiting, providing the only understandable word in this scene. Another recurring scene involves a blond man enthusiastically shouting “Interview!” In these dialogues, he poses questions, only to receive answers that never align with his inquiries. For instance, when inquiring about the pyramids, he is met with an explanation in broken English regarding the perpetually unfinished Berlin Airport, emphasizing the idea of “always waiting.” The play interconnects these scenes with brief dance or song displays, spanning from traditional children’s music to expressive dance. It culminates with a scene featuring three men attempting to translate an academic article on communication into sign language. The sign language interpreter, fluent in English but not German, struggles to convey the complex German-to-English translation of the intricate sentences. Amid this, the men realize that the academic text remains inaccessible to individuals with disabilities, adding a poignant layer to the narrative.

  14. Anti Spoofing Selfie Live Dataset - 5,000+ files

    • kaggle.com
    Updated May 30, 2024
    Axon Labs (2024). Anti Spoofing Selfie Live Dataset - 5,000+ files [Dataset]. https://www.kaggle.com/datasets/axondata/anti-spoofing-live-dataset
    Explore at:
    Dataset updated
    May 30, 2024
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Axon Labs
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Anti Spoofing Selfie Live dataset - Selfie collection

    What is inside this dataset?

    The Biometric Attack dataset consists of more than 5,000 selfie images of people from more than 50 countries. Each participant provided one real-life selfie image. Live selfies help facial recognition models identify real faces and detect spoofing attempts, decreasing false negative results in liveness detection tests.

    Dataset parameters:

    • Key nationalities are covered (Caucasian, Black, Asian, Hispanic, etc.)
    • Variety of lighting conditions and capturing devices
    • Different demographic parameters (broad range of Age, balanced gender and race distribution)

    The full version of the dataset is available for commercial use; leave a request on our website, Axonlabs, to purchase it.

    How does the Live selfie dataset help liveness models?

    Selfies provide a diverse range of facial features, lighting conditions, and capturing devices, which are essential for training robust facial recognition models that can accurately distinguish between real and spoofed faces.

    Potential Use Cases:

    Liveness detection: This dataset is suited to training and evaluating liveness detection models that distinguish between real and spoofed data.

    Keywords: Real life data, Live data, Selfie data, Antispoofing for AI, Liveness Detection dataset for AI, Spoof Detection dataset, Facial Recognition dataset, Biometric Authentication dataset, AI Dataset, Anti-Spoofing Technology, Facial Biometrics, Machine Learning Dataset, Deep Learning

  15. Violence Reduction - Victims of Homicides and Non-Fatal Shootings

    • data.cityofchicago.org
    • catalog.data.gov
    Updated Aug 21, 2025
    City of Chicago (2025). Violence Reduction - Victims of Homicides and Non-Fatal Shootings [Dataset]. https://data.cityofchicago.org/Public-Safety/Violence-Reduction-Victims-of-Homicides-and-Non-Fa/gumc-mgzr
    Explore at:
    Available download formats: xml, application/geo+json, kml, csv, kmz, xlsx
    Dataset updated
    Aug 21, 2025
    Dataset authored and provided by
    City of Chicago
    Description

    This dataset contains individual-level homicide and non-fatal shooting victimizations, including homicide data from 1991 to the present, and non-fatal shooting data from 2010 to the present (2010 is the earliest available year for shooting data). This dataset includes a "GUNSHOT_INJURY_I" column to indicate whether the victimization involved a shooting, showing either Yes ("Y"), No ("N"), or Unknown ("UNKNOWN"). For homicides, injury descriptions are available dating back to 1991, so the "shooting" column will read either "Y" or "N" to indicate whether the homicide was a fatal shooting or not. For non-fatal shootings, data is only available as of 2010. As a result, for any non-fatal shootings that occurred from 2010 to the present, the shooting column will read as "Y". Non-fatal shooting victims will not be included in this dataset prior to 2010; they will be included in the authorized-access dataset, but with "UNKNOWN" in the shooting column.

    Each row represents a single victimization, i.e., a unique event when an individual became the victim of a homicide or non-fatal shooting. Each row does not represent a unique victim—if someone is victimized multiple times there will be multiple rows for each of those distinct events.

    The dataset is refreshed daily, but excludes the most recent complete day to allow the Chicago Police Department (CPD) time to gather the best available information. Each time the dataset is refreshed, records can change as CPD learns more about each victimization, especially those victimizations that are most recent. The data on the Mayor's Office Violence Reduction Dashboard is updated daily with an approximately 48-hour lag. As cases are passed from the initial reporting officer to the investigating detectives, some recorded data about incidents and victimizations may change once additional information arises. Regularly updated datasets on the City's public portal may change to reflect new or corrected information.

    A version of this dataset with additional crime types is available by request. To make a request, please email dataportal@cityofchicago.org with the subject line: Violence Reduction Victims Access Request. Access will require an account on this site, which you may create at https://data.cityofchicago.org/signup.

    How does this dataset classify victims?

    The methodology by which this dataset classifies victims of violent crime differs by victimization type:

    Homicide and non-fatal shooting victims: A victimization is considered a homicide victimization or non-fatal shooting victimization depending on its presence in CPD's homicide victims data table or its shooting victims data table. A victimization is considered a homicide only if it is present in CPD's homicide data table, while a victimization is considered a non-fatal shooting only if it is present in CPD's shooting data tables and absent from CPD's homicide data table.

    To determine the IUCR code of homicide and non-fatal shooting victimizations, we defer to the incident IUCR code available in CPD's Crimes, 2001-present dataset (available on the City's open data portal). If the IUCR code in CPD's Crimes dataset is inconsistent with the homicide/non-fatal shooting categorization, we defer to CPD's Victims dataset. For a criminal homicide, the only sensible IUCR codes are 0110 (first-degree murder) or 0130 (second-degree murder). For a non-fatal shooting, a sensible IUCR code must signify a criminal sexual assault, a robbery, or, most commonly, an aggravated battery. In rare instances, the IUCR codes in CPD's Crimes and Victims datasets do not align with the homicide/non-fatal shooting categorization:

    1. In instances where a homicide victimization does not correspond to an IUCR code 0110 or 0130, we set the IUCR code to "01XX" to indicate that the victimization was a homicide but we do not know whether it was a first-degree murder (IUCR code = 0110) or a second-degree murder (IUCR code = 0130).
    2. When a non-fatal shooting victimization does not correspond to an IUCR code that signifies a criminal sexual assault, robbery, or aggravated battery, we enter “UNK” in the IUCR column, “YES” in the GUNSHOT_I column, and “NON-FATAL” in the PRIMARY column to indicate that the victim was non-fatally shot, but the precise IUCR code is unknown.
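    The classification and fallback rules described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the City's actual pipeline: the function names are hypothetical, and the set of IUCR codes that count as sensible for a non-fatal shooting is passed in as a parameter because the description does not enumerate it.

```python
# IUCR codes the description names as sensible for a criminal homicide.
HOMICIDE_CODES = {"0110", "0130"}


def classify(in_homicide_table, in_shooting_table):
    """Victimization type from table membership (hypothetical helper).

    Homicide: present in the homicide table.
    Non-fatal shooting: present in the shooting table, absent from the homicide table.
    """
    if in_homicide_table:
        return "HOMICIDE"
    if in_shooting_table:
        return "NON-FATAL"
    return None


def resolve_iucr(kind, crimes_code, victims_code, nonfatal_codes):
    """Prefer the Crimes code, then the Victims code, then a placeholder."""
    sensible = HOMICIDE_CODES if kind == "HOMICIDE" else nonfatal_codes
    for code in (crimes_code, victims_code):
        if code in sensible:
            return code
    # Neither source is consistent with the categorization.
    return "01XX" if kind == "HOMICIDE" else "UNK"


# Placeholder set for illustration only; the real list is not given in the description.
EXAMPLE_NONFATAL_CODES = {"041A"}

print(resolve_iucr("HOMICIDE", "9999", "0130", EXAMPLE_NONFATAL_CODES))
print(resolve_iucr("HOMICIDE", "9999", None, EXAMPLE_NONFATAL_CODES))
print(resolve_iucr("NON-FATAL", "9999", "9999", EXAMPLE_NONFATAL_CODES))
```

    The three calls cover the fallback to the Victims code and the two placeholder cases ("01XX" and "UNK").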

    Other violent crime victims: For other violent crime types, we refer to the IUCR classification that exists in CPD's victim table, with only one exception:

    1. When there is an incident that is associated with no victim with a matching IUCR code, we assume that this is an error. Every crime should have at least 1 victim with a matching IUCR code. In these cases, we change the IUCR code to reflect the incident IUCR code because CPD's incident table is considered to be more reliable than the victim table.

    Note: The definition of “homicide” (shooting or otherwise) does not include justifiable homicide or involuntary manslaughter. This dataset also excludes any cases that CPD considers to be “unfounded” or “noncriminal.” Officer-involved shootings are not included.

    Note: The initial reporting officer usually asks victims to report demographic data. If victims are unable to recall, the reporting officer will use their best judgment. “Unknown” can be reported if it is truly unknown.

    Note: In some instances, CPD's raw incident-level data and victim-level data that were inputs into this dataset do not align on the type of crime that occurred. In those instances, this dataset attempts to correct mismatches between incident and victim specific crime types. When it is not possible to determine which victims are associated with the most reliable crime determination, the dataset will show empty cells in the respective demographic fields (age, sex, race, etc.).

    Note: Homicide victims' names are delayed by two weeks to allow time for the victim’s family to be notified of their passing.

    Note: This dataset includes variables referencing administrative or political boundaries that are subject to change. These include Street Outreach Organization boundary, Ward, Chicago Police Department District, Chicago Police Department Area, Chicago Police Department Beat, Illinois State Senate District, and Illinois State House of Representatives District. These variables reflect current geographic boundaries as of November 1st, 2021. In some instances, current boundaries may conflict with those that were in place at the time that a given incident occurred in prior years. For example, the Chicago Police Department districts 021 and 013 no longer exist. Any historical violent crime victimizations that occurred in those districts when they were in existence are marked in this dataset as having occurred in the current districts that expanded to replace 013 and 021.

  16. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • data.europa.eu
    zip
    Updated Oct 20, 2022
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit 

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema 

    For the survey data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys 

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: 
  17. Risk prediction model of death at first suicide attempt using multivariable...

    • plos.figshare.com
    xls
    Updated Apr 10, 2024
    Suwanna Arunpongpaisal; Sawitri Assanangkornchai; Virasakdi Chongsuvivatwong (2024). Risk prediction model of death at first suicide attempt using multivariable logistic regression (Model development dataset, N = 1,824). [Dataset]. http://doi.org/10.1371/journal.pone.0297904.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Apr 10, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Suwanna Arunpongpaisal; Sawitri Assanangkornchai; Virasakdi Chongsuvivatwong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Risk prediction model of death at first suicide attempt using multivariable logistic regression (Model development dataset, N = 1,824).

  18. Data from: Introducing the COVID-19 YouTube (COVYT) speech dataset featuring...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 8, 2022
    Andreas Triantafyllopoulos; Anastasia Semertzidou; Meishu Song; Florian B. Pokorny; Björn W. Schuller (2022). Introducing the COVID-19 YouTube (COVYT) speech dataset featuring the same speakers with and without infection [Dataset]. http://doi.org/10.5281/zenodo.6962930
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 8, 2022
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Andreas Triantafyllopoulos; Anastasia Semertzidou; Meishu Song; Florian B. Pokorny; Björn W. Schuller
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COVYT dataset contains speech samples from individuals who self-reported their COVID-19 infection on public social media platforms (YouTube, Xiaohongshu). These videos, as well as accompanying videos of the same people prior to infection, were mined in an attempt to gather publicly-available data for COVID-19 research. This release includes the links to the original videos along with the accompanying manual segmentation and diarisation that identifies the utterances of the target individuals. We are additionally releasing features derived from the segmented utterances. Finally, the dataset includes partitioning information according to 4 different cross-validation schemes. See the arxiv pre-print for more details: https://arxiv.org/abs/2206.11045

  19. Data from: Sex-specific patterns of reproductive senescence in a long-lived...

    • datadryad.org
    • datasetcatalog.nlm.nih.gov
    • +2 more
    zip
    Updated Jun 12, 2019
    Megan Murgatroyd; Staffan Roos; Richard Evans; Alex Sansom; D. Philip Whitfield; David Sexton; Robin Reid; Justin Grant; Arjun Amar (2019). Sex-specific patterns of reproductive senescence in a long-lived reintroduced raptor [Dataset]. http://doi.org/10.5061/dryad.b5408s1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 12, 2019
    Dataset provided by
    Dryad
    Authors
    Megan Murgatroyd; Staffan Roos; Richard Evans; Alex Sansom; D. Philip Whitfield; David Sexton; Robin Reid; Justin Grant; Arjun Amar
    Time period covered
    Jun 11, 2018
    Area covered
    Scotland
    Description

    1) For many species there is evidence that breeding performance changes as an individual ages. In iteroparous species, breeding performance often increases through early-life and is expected to level out or even decline (senesce) later in life. Furthermore, an individual’s sex and conditions experienced in early-life can affect breeding performance and how this changes with age. 2) Long-term monitoring of individuals from reintroduced populations can provide unique opportunities to explore age-related trends in breeding performance that might otherwise be logistically challenging. 3) We used a unique dataset from a reintroduced population of white-tailed eagles Haliaeetus albicilla in Scotland, which has been intensively monitored since their initial reintroduction in 1975, to study age- and sex-specific trends in two measures of breeding performance. This monitoring provided data on breeding performance of known individuals ranging in age from 3 to 26 years old. We also explored change...

  20. Taiwan Number Dataset

    • listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    List to Data (2025). Taiwan Number Dataset [Dataset]. https://listtodata.com/taiwan-dataset
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Dataset authored and provided by
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Taiwan
    Variables measured
    phone numbers, email address, full name, address, city, state, gender, age, income, IP address
    Description

    The Taiwan number dataset is intended to help you generate sales leads. Using the numbers in this dataset, you can send product information and descriptions to potential buyers by text message, or call them directly as part of a telemarketing campaign. The dataset can help you tell your audience about the features and uses of your product, extend your marketing reach, and build trust with your clients through this mobile phone number list.

    Taiwan phone data has the potential to bring in valuable customers, so a business can earn more without spending heavily on ads. An SMS marketing plan makes it possible to run promotions cheaply, and the contact number directory is offered at an affordable cost. Taiwan phone data supports your telemarketing with useful details. When you need to reach someone quickly, a phone number is the best choice, and you can send messages directly to recipients' inboxes through these datasets. You can use List To Data for product publicity in order to find interested buyers.

    The Taiwan phone number list is described as a top-notch mobile database. The List To Data website is committed to giving clients the best service for their money and runs a 24/7 support group. You can ask them anything about this package, or request samples of the leads, which are stated to be 95% real. The list is intended to enhance both your branding and your sales. Further, the Taiwan phone number list lets you promote products across the country. The user count of these platforms is large, which gives you a large customer base and raises the possibility of finding interested customers.

Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. http://doi.org/10.5281/zenodo.4694270

Effect of suicide rates on life expectancy dataset

Explore at:
Available download formats: csv
Dataset updated
Apr 16, 2021
Dataset provided by
Zenodo http://zenodo.org/
Authors
Filip Zoubek
License

Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically

Description

Effect of suicide rates on life expectancy dataset

Abstract
In 2015, approximately 55 million people died worldwide, of whom 8 million died by suicide. In the USA, suicide is one of the main causes of death; this experiment therefore asks how much suicide rates affect average life expectancy statistics.
The experiment takes two datasets, one with the number of suicides and one with life expectancy, and combines them into one dataset. Subsequently, I try to find patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.

Data

The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed appropriately. The final merged dataset for the experiment has 13 variables, where country and year are used as the index: Country; Year; Suicides number; Life expectancy; Adult Mortality, which is the probability of dying between 15 and 60 years per 1000 population; Infant deaths, which is the number of infant deaths per 1000 population; Alcohol, which is recorded per capita (15+) alcohol consumption; Under-five deaths, which is the number of under-five deaths per 1000 population; HIV/AIDS, which is deaths from HIV/AIDS per 1000 live births; GDP, which is Gross Domestic Product per capita; Population; Income composition of resources, which is the Human Development Index in terms of income composition of resources; and Schooling, which is the number of years of schooling.
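The merge-then-regress workflow described above can be sketched in a few lines. This is a minimal Python illustration, not the author's actual analysis: the helper names are hypothetical, the rows are made-up placeholder values rather than WHO data, and the join keeps only (Country, Year) pairs present in both tables, which is an assumption about how the merge was done.

```python
# Placeholder rows for illustration only; these numbers are NOT WHO data.
suicides = [
    {"Country": "A", "Year": 2013, "Suicides number": 100},
    {"Country": "A", "Year": 2014, "Suicides number": 200},
    {"Country": "A", "Year": 2015, "Suicides number": 300},
    {"Country": "B", "Year": 2015, "Suicides number": 50},  # no match below
]
life_expectancy = [
    {"Country": "A", "Year": 2013, "Life expectancy": 80.0},
    {"Country": "A", "Year": 2014, "Life expectancy": 78.0},
    {"Country": "A", "Year": 2015, "Life expectancy": 76.0},
]


def merge_on_country_year(left, right):
    """Join two row lists on the (Country, Year) index, keeping matching rows only."""
    index = {(r["Country"], r["Year"]): r for r in right}
    merged = []
    for row in left:
        key = (row["Country"], row["Year"])
        if key in index:
            merged.append({**row, **index[key]})
    return merged


def simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


merged = merge_on_country_year(suicides, life_expectancy)
slope, intercept = simple_regression(
    [r["Suicides number"] for r in merged],
    [r["Life expectancy"] for r in merged],
)
print(len(merged), slope, intercept)
```

With real data, the same two steps would be applied to the preprocessed tables, and the sign and size of the slope would then need a proper significance test before drawing conclusions.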

License

The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the license Attribution-NonCommercial-ShareAlike 3.0 IGO (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

[1] https://www.kaggle.com/szamil/who-suicide-statistics

[2] https://www.kaggle.com/kumarajarshi/life-expectancy-who
