100+ datasets found
  1. Mental Health Conversational Data

    • kaggle.com
    Updated Oct 31, 2022
    Cite
    elvis (2022). Mental Health Conversational Data [Dataset]. https://www.kaggle.com/datasets/elvis23/mental-health-conversational-data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 31, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    elvis
    Description

    A dataset containing basic conversations, mental health FAQ, classical therapy conversations, and general advice provided to people suffering from anxiety and depression.

    This dataset can be used to train a model for a chatbot that can behave like a therapist in order to provide emotional support to people with anxiety & depression.

    The dataset contains intents. An “intent” is the intention behind a user's message. For instance, if I were to say “I am sad” to the chatbot, the intent would be “sad”. Each intent has a set of Patterns and Responses. Patterns are example user messages that align with the intent, while Responses are the replies the chatbot gives for that intent. Various intents are defined, and their patterns and responses are used as the model’s training data to identify a particular intent.
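
    As a rough illustration of the intent structure described here, the sketch below flattens intents into (pattern, tag) training pairs and picks a reply for a predicted tag. The field names `tag`, `patterns`, and `responses`, the helper names, and the sample entries are assumptions made for illustration; they are not taken from the dataset file.

```python
import random

# Hypothetical sample in the shape described above; field names are assumed.
SAMPLE_INTENTS = [
    {
        "tag": "sad",
        "patterns": ["I am sad", "I feel down"],
        "responses": ["I'm sorry to hear that. Do you want to talk about it?"],
    },
    {
        "tag": "greeting",
        "patterns": ["Hi", "Hello there"],
        "responses": ["Hello! How are you feeling today?"],
    },
]


def to_training_pairs(intents):
    """Flatten intents into (pattern, tag) pairs for an intent classifier."""
    return [(p, intent["tag"]) for intent in intents for p in intent["patterns"]]


def pick_response(intents, tag, rng=random):
    """Return one of the replies defined for the predicted intent tag."""
    for intent in intents:
        if intent["tag"] == tag:
            return rng.choice(intent["responses"])
    raise KeyError(f"unknown intent: {tag}")


pairs = to_training_pairs(SAMPLE_INTENTS)
```

    The pairs would feed whatever classifier is trained on the patterns; the classifier itself is out of scope for this sketch.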

  2. Kaggle-Mental-Health-Survey-Data

    • huggingface.co
    Updated Aug 9, 2024
    + more versions
    Cite
    shanti flagg (2024). Kaggle-Mental-Health-Survey-Data [Dataset]. https://huggingface.co/datasets/sflagg/Kaggle-Mental-Health-Survey-Data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 9, 2024
    Authors
    shanti flagg
    Description

    sflagg/Kaggle-Mental-Health-Survey-Data dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Mental Health in Tech Survey

    • kaggle.com
    Updated Jan 20, 2023
    Cite
    The Devastator (2023). Mental Health in Tech Survey [Dataset]. https://www.kaggle.com/datasets/thedevastator/mental-health-in-tech-survey
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 20, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    Mental Health in Tech Survey

    Understanding Employee Mental Health Needs in the Tech Industry

    By Stephen Myers [source]

    About this dataset

    This dataset contains survey responses from individuals in the tech industry about their mental health, including questions about treatment, workplace resources, and attitudes towards discussing mental health in the workplace. Mental health is an issue that affects people of all ages, genders and walks of life, and the tech industry, which places heavy demands on those who work in it, is no exception. By analyzing this dataset, we can better understand how prevalent mental health issues are among those who work in the tech sector, and what kinds of resources they rely upon to find help, so that more can be done to create a healthier working environment for all.

    This dataset tracks key measures such as age, gender and country to determine overall prevalence, along with responses on employee access to care options; whether employers take mental health as seriously as physical illness; whether anonymity is protected when seeking help; and how coworkers may perceive those struggling with mental health conditions such as depression or anxiety. With technology advancing faster than ever, these statistics are important to analyze if we hope to remain true promoters of a healthy world inside and outside our office walls.

    How to use the dataset

    In this dataset you will find data on the age, gender, country, and state of survey respondents, in addition to numerous questions that assess an individual's mental state, including: self-employment status; family history of mental illness; treatment status and access, or lack thereof; how their mental health condition affects their work; number of employees at the company they work for; remote work status; tech company status; benefit information from employers, such as mental health benefits and wellness program availability; anonymity protection when seeking treatment resources for substance abuse or mental health issues; ease (or difficulty) of taking medical leave for a mental health condition; and whether discussing physical or medical matters with employers has negative consequences. You will also find comments from survey participants.

    To use this dataset effectively:

    • Clean the data by removing invalid responses, duplicates, and missing values. Basic Pandas commands such as .dropna(), .drop_duplicates(), and .replace() cover this.
    • Use descriptive statistics such as the mean and median to draw general conclusions about response patterns, for example with Pandas tools such as .groupby() and .describe().
    • Run analyses such as mean comparisons across variables (for example, age by gender) and correlations between features using appropriate statistical methods: statsmodels' OLS models (the formula API is conventionally imported as smf), z-scores, hypothesis tests, and so on, depending on the analysis needed. Make sure you are aware of any underlying assumptions your analysis requires beforehand.
    • Visualize your results with plotting libraries like Matplotlib or Seaborn to make the findings easier to interpret. Use boxplots, histograms, or heatmaps where appropriate for your question.
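
    The cleaning and descriptive-statistics steps listed above can be sketched roughly as follows. The column names (`Age`, `Gender`, `treatment`), the sample rows, and the 18 to 100 age window are assumptions chosen for illustration; the real file may differ, and the statistical-modelling and plotting steps are left out.

```python
import pandas as pd

# Hypothetical rows; the real file's column names and values may differ.
raw = pd.DataFrame(
    {
        "Age": [29, 29, 41, -1, 35, None],
        "Gender": ["M", "M", "female", "Male", "F", "F"],
        "treatment": ["Yes", "Yes", "No", "No", "Yes", "No"],
    }
)

clean = (
    raw.drop_duplicates()                      # remove exact duplicate rows
    .dropna(subset=["Age"])                    # remove rows with a missing age
    .replace({"Gender": {"M": "Male", "F": "Female", "female": "Female"}})
)
clean = clean[clean["Age"].between(18, 100)]   # drop implausible ages (assumed window)

# Descriptive statistics per group.
age_by_gender = clean.groupby("Gender")["Age"].describe()
treated_share = (clean["treatment"] == "Yes").groupby(clean["Gender"]).mean()
```

    Normalising free-text categories before grouping matters here: without the `.replace()` step, "M" and "Male" would be counted as separate groups.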

    Research Ideas

    • Using the results of this survey, you could develop targeted outreach campaigns directed at underrepresented groups that answer “No” to questions about their employers providing resources for mental health or discussing it as part of wellness programs.
    • Analyzing the employee characteristics (e.g., age and gender) of those who reported negative consequences from discussing their mental health in the workplace could inform employer policies to support individuals with mental health conditions and reduce stigma and discrimination in the workplace.
    • Correlating responses to questions about remote work, leave policies, and anonymity with whether or not individuals have sought treatment for a mental health condition may provide insight into which types of workplace resources are most beneficial for supporting employees dealing with these issues.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors - You are free to: - Share - copy and redi...

  4. Mental Health Insights: Vulnerable Cancer Patients

    • kaggle.com
    Updated Dec 26, 2023
    + more versions
    Cite
    Irin Hoque (2023). Mental Health Insights: Vulnerable Cancer Patients [Dataset]. http://doi.org/10.34740/kaggle/dsv/7284561
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 26, 2023
    Dataset provided by
    Kaggle
    Authors
    Irin Hoque
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    We collected over 10,087 posts from cancer patients and their caregivers on platforms like Reddit, Daily Strength, and the Health Board. The posts were related to five types of cancer: brain, colon, liver, leukemia, and lung cancer. Two team members scored each post based on the emotions expressed, using a scale from -2 to 1. Negative scores (-1 or -2) were given for posts showing grief or suffering, a positive score (1) for happy emotions like relief or accomplishment, and a score of 0 for posts with no emotion, which were considered neutral. This analysis aims to understand the emotional aspects of cancer patients' posts for a mental health study.
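
    The scoring scale described above can be read as a simple mapping from score to sentiment class. The sketch below is one such reading; the function name and label strings are hypothetical, and how the two annotators' scores were reconciled is not stated in the description, so it is not modelled here.

```python
def sentiment_label(score: int) -> str:
    """Map a post score on the -2..1 scale to a coarse sentiment label."""
    if score not in (-2, -1, 0, 1):
        raise ValueError(f"score outside the -2..1 scale: {score}")
    if score < 0:
        return "negative"   # grief or suffering
    if score == 0:
        return "neutral"    # no emotion expressed
    return "positive"       # e.g. relief or accomplishment


labels = [sentiment_label(s) for s in (-2, -1, 0, 1)]
```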

  5. Mental Health Client-Level Dataset

    • kaggle.com
    zip
    Updated Mar 20, 2024
    Cite
    Kole Nguyen (2024). Mental Health Client-Level Dataset [Dataset]. https://www.kaggle.com/datasets/kolenguyen/mental-health-client-level-dataset
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Mar 20, 2024
    Authors
    Kole Nguyen
    Description

    Data Source: Substance Abuse and Mental Health Services Administration (SAMHSA) 2020, U.S. Department of Health and Human Services (HHS).

    This is the dataset used for my first project for mental health analysis with Ann Bertram and Tiffany McBride at Purdue Fort Wayne. It has been cleaned and divided into datasets based on the states. Each dataset will include demographic information such as age, education level, ethnicity, race, genders, mental illness flags, etc. For more information, please refer to the codebook.

  6. Human and LLM Mental Health Conversations

    • kaggle.com
    Updated Feb 5, 2024
    Cite
    Jordan J. Bird (2024). Human and LLM Mental Health Conversations [Dataset]. https://www.kaggle.com/datasets/birdy654/human-and-llm-mental-health-conversations
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 5, 2024
    Dataset provided by
    Kaggle
    Authors
    Jordan J. Bird
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset comprises both human expert and Large Language Model responses to queries about mental health.

    Please note: patient context and psychologist responses found within this dataset are all collected from Kaggle, from the NLP Mental Health Conversations repository.

    The additional "LLM" column within this dataset has been generated by the MISTRAL-7B instruct v0.2 model, via the prompt:

    You are a psychologist speaking to a patient. The patient will speak to you and you will then answer their query. [/INST] Okay. Go ahead, patient. I will answer you as a psychologist. [INST] Patient: QUERY_GOES_HERE Psychologist: [/INST]
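
    To make the role of the QUERY_GOES_HERE placeholder concrete, the sketch below fills the template quoted above with a patient query. The helper name `build_prompt` is hypothetical, and the description does not say how the prompt was actually passed to the model (tokenizer, any leading markers), so this only shows the string substitution.

```python
# Template copied from the description above; QUERY_GOES_HERE is its placeholder.
PROMPT_TEMPLATE = (
    "You are a psychologist speaking to a patient. The patient will speak to you "
    "and you will then answer their query. [/INST] Okay. Go ahead, patient. "
    "I will answer you as a psychologist. [INST] Patient: QUERY_GOES_HERE "
    "Psychologist: [/INST]"
)


def build_prompt(query: str) -> str:
    """Insert a patient query into the template (hypothetical helper)."""
    return PROMPT_TEMPLATE.replace("QUERY_GOES_HERE", query.strip())


prompt = build_prompt("  I have trouble sleeping.  ")
```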

    This data was generated for, and analysed within the following study:

    Bird, J.J., Wright, D., Sumich, A., and Lotfi, A., 2024, June. Generative AI in Psychological Therapy: Perspectives on Computational Linguistics and Large Language Models in Written Behaviour Monitoring. In Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments.

  7. Psychosocial Mental Health Analysis

    • kaggle.com
    Updated Feb 9, 2024
    Cite
    Md. Ismiel Hossen Abir (2024). Psychosocial Mental Health Analysis [Dataset]. https://www.kaggle.com/datasets/mdismielhossenabir/psychosocial-mental-health-analysis
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    Kaggle
    Authors
    Md. Ismiel Hossen Abir
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    We collected text directly from the narratives of people who faced psychological problems, and built this dataset from that text. The dataset has 6 columns: Age, Gender, Problem description, problem summary, problem category and problem psychological category.

  8. ‘Mental Health Patients 2021-2022 ’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Aug 4, 2020
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2020). ‘Mental Health Patients 2021-2022 ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-mental-health-patients-2021-2022-8003/ac8e4be4/?iid=001-512&v=presentation
    Explore at:
    Dataset updated
    Aug 4, 2020
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Mental Health Patients 2021-2022 ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/meetnagadia/district-wise-mental-health-patients-20212022 on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Title

    District Wise Number of Mental Health Patients in year 2021-2020 in Country India State Karnataka

    Description

    District Wise number of mental health patients such as severe mental illness, common mental disorder, alcohol, and substance abuse, cases referred to higher centers, suicide attempt cases

    Contributor

    Karnataka, Health and Family Welfare Department, Karnataka

    Sectors

    Health and Family welfare › Health

    Source

    Karnataka data government Click Here to visit the website

    Group Name

    Department of Health and Family Welfare

    --- Original source retains full ownership of the source dataset ---

  9. Healthcare Workforce Mental Health Dataset

    • kaggle.com
    Updated Feb 16, 2025
    Cite
    Rivalytics (2025). Healthcare Workforce Mental Health Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/10768196
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 16, 2025
    Dataset provided by
    Kaggle
    Authors
    Rivalytics
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    📌**Context**

    The Healthcare Workforce Mental Health Dataset is designed to explore workplace mental health challenges in the healthcare industry, an environment known for high stress and burnout rates.

    This dataset enables users to analyze key trends related to:

    💠 Workplace Stressors: Examining the impact of heavy workloads, poor work environments, and emotional demands.

    💠 Mental Health Outcomes: Understanding how stress and burnout influence job satisfaction, absenteeism, and turnover intention.

    💠 Educational & Analytical Applications: A valuable resource for data analysts, students, and career changers looking to practice skills in data exploration and data visualization.

    To help users gain deeper insights, this dataset is fully compatible with a Power BI Dashboard, available as part of a complete analytics bundle for enhanced visualization and reporting.

    📌**Source**

    This dataset was synthetically generated using the following methods:

    💠 Python & Data Science Techniques: Probabilistic modeling to simulate realistic data distributions. Industry-informed variable relationships based on healthcare workforce studies.

    💠 Guidance & Validation Using AI (ChatGPT): Assisted in refining dataset realism and logical mappings.

    💠 Industry Research & Reports: Based on insights from WHO, CDC, OSHA, and academic studies on workplace stress and mental health in healthcare settings.

    📌**Inspiration**

    This dataset was inspired by ongoing discussions in healthcare regarding burnout, mental health, and staff retention. The goal is to bridge the gap between raw data and actionable insights by providing a structured, analyst-friendly dataset.

    For those who want a ready-to-use reporting solution, a Power BI Dashboard Template is available, designed for interactive data exploration, workforce insights, and stress factor analysis.

    📌**Important Note** This dataset is synthetic and intended for educational purposes only. It is not real-world employee data and should not be used for actual decision-making or policy implementation.

  10. SMHD Dataset

    • paperswithcode.com
    Updated May 27, 2024
    Cite
    Arman Cohan; Bart Desmet; Andrew Yates; Luca Soldaini; Sean MacAvaney; Nazli Goharian (2024). SMHD Dataset [Dataset]. https://paperswithcode.com/dataset/smhd
    Explore at:
    Dataset updated
    May 27, 2024
    Authors
    Arman Cohan; Bart Desmet; Andrew Yates; Luca Soldaini; Sean MacAvaney; Nazli Goharian
    Description

    A novel large dataset of social media posts from users with one or multiple mental health conditions along with matched control users.

  11. ‘OSMI Mental Health In Tech Survey 2020’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    + more versions
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘OSMI Mental Health In Tech Survey 2020’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-osmi-mental-health-in-tech-survey-2020-7711/d1697696/?iid=014-385&v=presentation
    Explore at:
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘OSMI Mental Health In Tech Survey 2020’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/osmihelp/osmi-2020-mental-health-in-tech-survey-results on 30 September 2021.

    --- No further description of dataset provided by original source ---

    --- Original source retains full ownership of the source dataset ---

  12. Mental Health Discourse Toxicity Dataset

    • opendatabay.com
    Updated Jul 3, 2025
    Cite
    Datasimple (2025). Mental Health Discourse Toxicity Dataset [Dataset]. https://www.opendatabay.com/data/healthcare/19feba79-b4de-42d1-9ea5-39992f7ec572
    Explore at:
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Mental Health & Wellness
    Description

    This dataset is a collection of texts primarily focused on individuals experiencing anxiety, depression, and other mental health challenges. Its purpose is to facilitate understanding of language and sentiment related to mental health issues. The corpus can be applied to diverse tasks such as sentiment analysis, toxic language detection, and general mental health language analysis. The dataset is notably balanced, meaning it contains an equitable distribution of comments considered "poisonous" and those not.

    Columns

    • text: This column contains the raw text of the comments.
    • label: This column provides a numerical classification for each comment. A value of '1' indicates the comment is considered poisonous with mental health issues, while '0' indicates it is not considered poisonous.

    Distribution

    The dataset is typically structured for distribution in a CSV file format. It contains a total of 27,972 unique records. The distribution of labels shows 14,139 records are classified with a label of '0' (not poisonous), and 13,838 records are classified with a label of '1' (poisonous), indicating its balanced nature.
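
    A quick way to check the balance claim is to compute the share of each label. The sketch below does that on the counts quoted above rather than on the actual file; the helper name `label_shares` is hypothetical. Note that the two quoted label counts add up to 27,977 rather than the 27,972 given as the total, so recomputing from the CSV itself is advisable.

```python
from collections import Counter


def label_shares(labels):
    """Return the fraction of records carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Counts as quoted in the description above (0 = not poisonous, 1 = poisonous).
quoted = [0] * 14139 + [1] * 13838
shares = label_shares(quoted)
```

    With a real file, the same function could be applied to the values of the `label` column.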

    Usage

    This dataset is an ideal resource for developing and refining machine learning models for sentiment analysis, particularly within mental health contexts. It is also highly suitable for creating toxic language detection systems and for conducting linguistic research aimed at understanding patterns in mental health discourse.

    Coverage

    The geographic scope of this dataset is global. It encompasses a wide range of text comments associated with mental health conditions such as anxiety and depression. The provided sources do not specify a particular time range for the data or specific demographic availability beyond the nature of the comments themselves.

    License

    CC-BY

    Who Can Use It

    The dataset is especially beneficial for researchers studying mental health language, mental health professionals seeking insights into online discourse, and developers creating AI models for content moderation, sentiment analysis tools, or support applications related to mental well-being.

    Dataset Name Suggestions

    • Mental Health Discourse Toxicity Dataset
    • Mental Health Comments Corpus
    • Toxic Mental Health Language Data
    • Anxiety Depression Text Corpus
    • Mental Wellness Language Dataset

    Attributes

    Original Data Source: Mental Health Corpus

  13. mental-health-posts-dataset

    • huggingface.co
    Updated May 2, 2025
    Cite
    KingPat (2025). mental-health-posts-dataset [Dataset]. https://huggingface.co/datasets/Noobie314/mental-health-posts-dataset
    Explore at:
    Dataset updated
    May 2, 2025
    Authors
    KingPat
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    🧠 Mental Health Posts Dataset

    This dataset is curated for mental health emotion classification tasks. It originates from the Counsel Chat Dataset available on Kaggle and has been preprocessed and restructured to suit NLP-based classification models.

      📄 Overview
    

    The dataset is designed to support the training and evaluation of models that classify user-generated mental health posts into one of the following categories:

    depression, anxiety, suicidal, addiction… See the full description on the dataset page: https://huggingface.co/datasets/Noobie314/mental-health-posts-dataset.

  14. mental_disorder_symptoms

    • huggingface.co
    Updated Jul 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Noah Syed (2025). mental_disorder_symptoms [Dataset]. https://huggingface.co/datasets/faisalsns/mental_disorder_symptoms
    Explore at:
    Dataset updated
    Jul 15, 2025
    Authors
    Noah Syed
    Description

    Mental Disorders Symptom Dataset

    This dataset is adapted from Basel Bakeer's Kaggle dataset, prepared for use in agentic LLM pipelines and co-occurrence symptom analysis.

    Tags: mental health, symptoms, medical, psychiatry, healthcare, mental-health, tabular, csv, co-occurrence, ai-health

    Features

    • Binary labels for 20+ mental health symptoms
    • Demographic info (age)
    • Diagnosed disorder column
    • Cleaned and ready for ML/NLP… See the full description on the dataset page: https://huggingface.co/datasets/faisalsns/mental_disorder_symptoms.

  15. Twitter Mental Health Classification Data

    • opendatabay.com
    Updated Jul 2, 2025
    Cite
    Datasimple (2025). Twitter Mental Health Classification Data [Dataset]. https://www.opendatabay.com/data/ai-ml/528d3302-f98e-4a27-a218-51d2816cabe7
    Explore at:
    Dataset updated
    Jul 2, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Mental Health & Wellness
    Description

    This dataset provides uncleaned Twitter data, specifically filtered for English content, designed for mental health classification at the Tweet-level. It serves as a valuable resource for developing and evaluating models that identify mental health indicators from social media text. The dataset includes raw tweet text and associated user metrics. Additionally, it can be used to explore and apply data cleaning and feature extraction techniques, such as Topic Modelling Features using Latent Dirichlet Allocation (LDA) to summarise tweets into top k topics, and Emoji Sentiment Features to count positive, negative, and neutral expression emojis present in tweets.

    Columns

    • post_id: The unique identification number for each Twitter post.
    • post_created: The timestamp indicating when the post was created.
    • post_text: The raw, uncleaned text content of the tweet.
    • user_id: The unique identification number for the user who posted the tweet.
    • followers: The number of followers the user had at the time of the post.
    • friends: The number of friends (accounts the user is following) the user had at the time of the post.
    • favourites: The total number of likes (favourites) the user's account has received across all their tweets.
    • statuses: The total count of statuses (tweets) posted by the user.
    • retweets: The total number of retweets received by the current tweet.
    • Label: The classification label for mental health, intended for binary classification tasks.

    Distribution

    The data files are typically provided in CSV format and are in an uncleaned state. While a specific total number of rows or records is not explicitly stated, the dataset contains approximately 19,102 unique post IDs and 19,488 unique user IDs. Further details on the distribution of specific metrics like followers, friends, favourites, statuses, and retweets are available within the dataset's meta-information, showing various ranges and their corresponding counts.

    Usage

    This dataset is ideal for:

    • Developing and testing mental health classification models using social media data.
    • Practising and demonstrating Natural Language Processing (NLP) techniques, including text analysis and feature engineering.
    • Exploring and applying data cleaning methodologies on raw social media text.
    • Implementing and evaluating Topic Modelling using algorithms like LDA.
    • Conducting sentiment analysis based on emoji usage in tweets.
    • Research in social media analytics, public health, and digital epidemiology.

    Coverage

    The dataset's coverage is global, with tweets specifically filtered to contain English context only. There is no specific time range for the collection period of the tweets provided, but the dataset was listed on 05/06/2025.

    License

    CC0

    Who Can Use It

    This dataset is suitable for:

    • Data scientists and machine learning engineers working on text classification and NLP projects.
    • Researchers in mental health, social sciences, and computational linguistics.
    • Students and academics learning about social media data analysis, feature engineering, and model development for health applications.
    • Healthcare professionals interested in leveraging social media for insights into mental wellness trends.

    Dataset Name Suggestions

    • Twitter Mental Health Classification Data
    • English Tweets Depression Classifier
    • Social Media Mental Health Indicators
    • Tweet-Level Mental Well-being Dataset
    • Depression Prediction from Twitter

    Attributes

    Original Data Source: Depression: Twitter Dataset + Feature Extraction

  16. Depression: Twitter Dataset + Feature Extraction

    • opendatabay.com
    .csv
    Updated Jun 7, 2025
    Cite
    Datasimple (2025). Depression: Twitter Dataset + Feature Extraction [Dataset]. https://www.opendatabay.com/data/dataset/528d3302-f98e-4a27-a218-51d2816cabe7
    Explore at:
    Available download formats: .csv
    Dataset updated
    Jun 7, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Mental Health & Wellness
    Description

    The data is in uncleaned format and was collected using the Twitter API. The Tweets have been filtered to keep only English content. It targets mental health classification of the user at Tweet-level. Also check out the notebooks I have provided, which demonstrate data cleaning and feature extraction techniques on the given dataset:

    • Topic Modelling Features using LDA (Latent Dirichlet Allocation), i.e. summarizing a tweet into one of the top k topics
    • Emoji Sentiment Features, i.e. counts of Positive, Negative and Neutral expression emojis present in the tweet
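
    The emoji sentiment feature mentioned above could be computed along these lines. This is a minimal sketch, not the notebook author's code: the lexicon is a tiny hypothetical one, the function name is made up, and it only handles emoji that are a single code point.

```python
from collections import Counter

# Tiny hypothetical lexicon; a real one would cover far more emoji.
EMOJI_SENTIMENT = {
    "😊": "positive",
    "😂": "positive",
    "😢": "negative",
    "😡": "negative",
    "😐": "neutral",
}


def emoji_sentiment_counts(text: str) -> dict:
    """Count positive, negative and neutral emoji in one tweet."""
    counts = Counter(EMOJI_SENTIMENT[ch] for ch in text if ch in EMOJI_SENTIMENT)
    return {k: counts.get(k, 0) for k in ("positive", "negative", "neutral")}


features = emoji_sentiment_counts("rough day 😢😢 but dinner helped 😊")
```

    The three counts can then be appended as numeric columns alongside the other per-tweet features.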

    Original Data Source: Depression: Twitter Dataset + Feature Extraction

  17. Mental Health Chatbot Pairs

    • kaggle.com
    Updated Nov 27, 2023
    Cite
    The Devastator (2023). Mental Health Chatbot Pairs [Dataset]. https://www.kaggle.com/datasets/thedevastator/mental-health-chatbot-pairs
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Mental Health Chatbot Pairs

    AI-based Tailored Support for Mental Health Conversation

    By Huggingface Hub [source]

    About this dataset

    This dataset contains a compilation of carefully crafted Q&A pairs designed to provide AI-based tailored support for mental health. These questions and answers offer an avenue for those looking for help to gain the assistance they need. With these pre-processed conversations, Artificial Intelligence (AI) solutions can be developed and deployed to better understand and respond appropriately to individual needs based on their input. This dataset is crafted by experts in the mental health field, providing insightful content that will further research in this growing area. These data points will be invaluable for developing the next generation of personalized AI-based mental health chatbots capable of truly understanding what people need.

    How to use the dataset

    This dataset contains pre-processed Q&A pairs for AI-based tailored support for mental health. As such, it represents an excellent starting point in building a conversational model which can handle conversations about mental health issues. Here are some tips on how to use this dataset to its fullest potential:

    • Understand your data: Spend time getting to know the text of the conversation between the user and the chatbot and familiarize yourself with what type of questions and answers are included in this specific dataset. This will help you better formulate queries for your own conversational model or develop new ones you can add yourself.

    • Refine your language processing models: By studying the patterns in syntax, grammar, tone, and voice within this conversational dataset, you can hone your natural language processing capabilities, such as keyword extraction or entity extraction, before integrating them into a larger bot system.

    • Test assumptions: Have an idea of what may work best with a particular audience or context? Check whether the assumption holds by trying different text variations against this dataset before rolling out changes across other channels or programs that use AI or chatbot services.

    • Research and analyze results: After testing different Q&A variations from this chatbot pair dataset with real-world users, record and analyze any results that help you understand how users behave after tailored text conversations about mental health topics, whether they engage passively or actively. The more information you collect, the closer you get to AI-powered conversations that produce the outcomes you want for your customer base.
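
    The keyword-extraction tip above can be sketched as a plain frequency count. This is only a minimal illustration using the Python standard library; the function name `top_keywords`, the stopword list, and the sample sentences are hypothetical and are not taken from the dataset. A real pipeline would likely use a dedicated NLP library instead.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "and", "i", "is", "it", "my", "of", "the", "to", "you"}

def top_keywords(texts, k=5):
    """Return the k most frequent non-stopword tokens across the texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        # Drop stopwords and very short tokens before counting.
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(k)

# Hypothetical sample sentences standing in for rows of the dataset.
sample_texts = ["I feel anxious", "Anxious and tired"]
print(top_keywords(sample_texts, k=1))
```

    Counting tokens this way only surfaces frequent words; it does not perform entity extraction.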

    Research Ideas

    • Developing a chatbot for personalized mental health advice and guidance tailored to individuals' unique needs, experiences, and struggles.
    • Creating an AI-driven diagnostic system that can interpret mental health conversations and provide targeted recommendations for interventions or treatments based on clinical expertise.
    • Designing an AI-powered recommendation engine to suggest relevant content such as articles, videos, or podcasts based on users’ questions or topics of discussion during their conversation with the chatbot.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission.

    Columns

    File: train.csv

    | Column name | Description |
    |:------------|:------------|
    | text | The text of the conversation between the user and the chatbot. (String) |
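
    A minimal sketch of reading the single `text` column described above, using only the Python standard library. It assumes train.csv has a header row named `text`, as the column listing states; the helper name `load_texts` and the in-memory sample rows are hypothetical stand-ins for the real file.

```python
import csv
import io

def load_texts(csv_file):
    """Return the non-empty values of the `text` column from an open CSV file."""
    reader = csv.DictReader(csv_file)
    # Skip rows whose `text` field is missing or only whitespace.
    return [
        (row.get("text") or "").strip()
        for row in reader
        if (row.get("text") or "").strip()
    ]

# Hypothetical two-row sample standing in for train.csv.
sample = io.StringIO(
    'text\n'
    '"Hypothetical user question, followed by a hypothetical reply."\n'
    '"A second hypothetical pair."\n'
)
texts = load_texts(sample)
print(len(texts))
```

    To read the real file, one would pass `open("train.csv", newline="", encoding="utf-8")` instead of the in-memory sample, assuming the file is UTF-8 encoded.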

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Huggingface Hub.

  18. WebMD Psychiatric Medication Feedback

    • opendatabay.com
    Updated Jul 6, 2025
    Cite
    Datasimple (2025). WebMD Psychiatric Medication Feedback [Dataset]. https://www.opendatabay.com/data/ai-ml/9b04667a-e038-477d-aae7-e94412a4483d
    Explore at:
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Reviews & Ratings
    Description

    This dataset provides insights into patient and caregiver experiences with various psychiatric medications. It features unstructured text reviews alongside categorical ratings and demographic information. The primary aim is to capture real-world feedback on drug effectiveness, side effects, and overall satisfaction. The dataset currently includes over 61,000 reviews for hundreds of drugs used to treat conditions such as depression, anxiety, bipolar disorder, and schizophrenia. Future updates are planned to include more recent reviews.

    Columns

    • drug_name: The name of the psychiatric medication.
    • date: The date the review was submitted.
    • age: The age range of the reviewer (e.g., "45-54", "35-44", "13-18").
    • gender: The gender of the reviewer.
    • time_on_drug: The duration the reviewer has been taking the medication (e.g., "1 to less than 2 years", "less than 1 month").
    • reviewer_type: Indicates whether the reviewer is a "Patient" or a "Caregiver".
    • condition: The medical condition for which the drug was prescribed (e.g., "Posttraumatic Stress Syndrome", "Depression", "Panic Disorder").
    • rating_overall: The overall rating given by the reviewer, typically on a scale of 1 to 5.
    • rating_effectiveness: The reviewer's rating of the drug's effectiveness.
    • rating_ease_of_use: The reviewer's rating for how easy the drug was to use.
    • rating_satisfaction: The reviewer's rating of their satisfaction with the drug.
    • text: The detailed, unstructured text review provided by the patient or caregiver, describing their experiences.
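
    As a rough sketch of how the columns above might be combined, the following averages `rating_overall` per `condition` using only the Python standard library. The column names come from the list above; the function name `mean_rating_by_condition` and the sample rows (including their rating values) are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

def mean_rating_by_condition(csv_file):
    """Average the `rating_overall` column for each value of `condition`."""
    totals = defaultdict(lambda: [0.0, 0])  # condition -> [sum, count]
    for row in csv.DictReader(csv_file):
        try:
            rating = float(row["rating_overall"])
        except (KeyError, TypeError, ValueError):
            continue  # Skip rows with a missing or non-numeric rating.
        entry = totals[row.get("condition") or "Unknown"]
        entry[0] += rating
        entry[1] += 1
    return {cond: total / count for cond, (total, count) in totals.items()}

# Hypothetical rows; the last one has an empty rating and is skipped.
sample = io.StringIO(
    "condition,rating_overall\n"
    "Depression,4\n"
    "Depression,2\n"
    "Panic Disorder,5\n"
    "Panic Disorder,\n"
)
print(mean_rating_by_condition(sample))
```

    Rows with unparseable ratings are dropped rather than counted as zero, so the averages are not pulled down by missing values.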

    Distribution

    The dataset is typically provided as a data file, often in CSV format. It contains over 61,000 individual reviews. The exact number of rows or records for a specific sample may vary, but the overall dataset is substantial. Each record is structured with distinct columns as detailed above, allowing for both quantitative and qualitative analysis.

    Usage

    This dataset is ideal for a variety of applications focusing on real-world drug outcomes and patient sentiment. It can be used for:

    • Natural Language Processing (NLP): Training models for sentiment analysis, topic modelling, and entity recognition on healthcare-related text.
    • Social Science Research: Studying patient perceptions, drug adherence, and the psychosocial impact of psychiatric medications.
    • Healthcare Analytics: Identifying trends in drug effectiveness and side effects across different demographics and conditions.
    • Pharmaceutical Research: Understanding patient feedback to inform drug development and post-market surveillance.
    • Machine Learning: Developing predictive models for drug response or side effect occurrence based on review data.

    Coverage

    The dataset's coverage is global, collecting reviews from diverse geographical regions. The time range for the reviews is ongoing, with updates expanding the dataset to include recent submissions to WebMD. Demographically, it includes feedback from patients and caregivers across various age groups and genders. The primary focus for conditions includes depression, anxiety (including anxiety with depression), bipolar disorder, and schizophrenia, with potential for expansion to other psychiatric disorders in future versions.

    License

    CC-BY-NC

    Who Can Use It

    This dataset is suitable for:

    • Academic Researchers: For studies on pharmacovigilance, mental health, and patient-reported outcomes.
    • Data Scientists and Analysts: To build and refine models for text analysis and predictive analytics in the healthcare domain.
    • Healthcare Providers: To gain a broader understanding of patient experiences beyond clinical trials.
    • AI and LLM Developers: For training and fine-tuning language models on domain-specific healthcare text.
    • Pharmaceutical Companies: For market research, competitor analysis, and identifying unmet patient needs.

    Dataset Name Suggestions

    • Psychiatric Drug Patient Reviews
    • WebMD Psychiatric Medication Feedback
    • Mental Health Drug Experience Dataset
    • Patient Reported Psychiatric Outcomes
    • Sertraline Oral Patient Experiences

    Attributes

    Original Data Source: WebMD Reviews for Psychiatric Drugs

  19. OSMI Mental Health in Tech Survey

    • figshare.com
    txt
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Open Sourcing Mental Illness Ltd (2023). OSMI Mental Health in Tech Survey [Dataset]. http://doi.org/10.6084/m9.figshare.5579458.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Open Sourcing Mental Illness Ltd
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data sets are survey results collected by OSMI and are made available under the CC-BY-SA 4.0 license. One survey was performed in 2014 and the other in 2016. Both surveys seek to understand how people who work in technology view mental health issues and what support they receive from their employers. OSMI has made the data sets for both the 2014 survey and the 2016 survey available on Kaggle.

  20. Bangladeshi University Students' Mental Health Dataset

    • figshare.com
    txt
    Updated Mar 6, 2024
    + more versions
    Cite
    M M Mahbubul Syeed; Ashifur Rahman; Md. Rajaul Karim (2024). Bangladeshi University Students' Mental Health Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.25347691.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Mar 6, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    M M Mahbubul Syeed; Ashifur Rahman; Md. Rajaul Karim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    This dataset comprises mental health data from 1,977 Bangladeshi university students across 15 top universities, collected from November to December 2023 using Google Forms. It includes assessments of academic anxiety, stress, and depression using widely used psychometric scales. The structured questionnaire also covers sociodemographic variables, so their associations with these measures can be analyzed. Statistical analysis yielded satisfactory internal consistency (Cronbach’s alpha: 0.79). The participant data are anonymized and may be valuable for policymakers.
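
    The description reports Cronbach's alpha without showing how it is computed. The sketch below applies the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of questionnaire items. The function name `cronbach_alpha` and the toy responses are hypothetical; they are not the dataset's values, and the source does not say which software the authors used.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Toy data: three respondents answering two items (hypothetical scores).
toy = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy), 3))
```

    The function needs at least two items and two respondents, and the total scores must not all be equal, otherwise the variance in the denominator is zero.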

Cite
elvis (2022). Mental Health Conversational Data [Dataset]. https://www.kaggle.com/datasets/elvis23/mental-health-conversational-data
Mental Health Conversational Data

Dataset containing conversations regarding mental health

Explore at:
13 scholarly articles cite this dataset (View in Google Scholar)
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
