3 datasets found
  1. Mental Health Dataset

    • kaggle.com
    Updated Mar 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bhavik Jikadara (2024). Mental Health Dataset [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/mental-health-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 18, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Bhavik Jikadara
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset appears to contain a variety of features related to text analysis, sentiment analysis, and psychological indicators, likely derived from posts or text data. Some features include readability indices such as Automated Readability Index (ARI), Coleman Liau Index, and Flesch-Kincaid Grade Level, as well as sentiment analysis scores like sentiment compound, negative, neutral, and positive scores. Additionally, there are features related to psychological aspects such as economic stress, isolation, substance use, and domestic stress. The dataset seems to cover a wide range of linguistic, psychological, and behavioural attributes, potentially suitable for analyzing mental health-related topics in online communities or text data.

    Benefits of using this dataset:

    • Insight into Mental Health: The dataset provides valuable insights into mental health by analyzing linguistic patterns, sentiment, and psychological indicators in text data. Researchers and data scientists can gain a better understanding of how mental health issues manifest in online communication.
    • Predictive Modeling: With a wide range of features, including sentiment analysis scores and psychological indicators, the dataset offers opportunities for developing predictive models to identify or predict mental health outcomes based on textual data. This can be useful for early intervention and support.
    • Community Engagement: Mental health is a topic of increasing importance, and this dataset can foster community engagement on platforms like Kaggle. Data enthusiasts, researchers, and mental health professionals can collaborate to analyze the data and develop solutions to address mental health challenges.
    • Data-driven Insights: By analyzing the dataset, users can uncover correlations and patterns between linguistic features, sentiment, and mental health indicators. These insights can inform interventions, policies, and support systems aimed at promoting mental well-being.
    • Educational Resource: The dataset can serve as a valuable educational resource for teaching and learning about mental health analytics, sentiment analysis, and text mining techniques. It provides a real-world dataset for students and practitioners to apply data science skills in a meaningful context.
  2. m

    PHQ-9 Student Depression Dataset

    • data.mendeley.com
    Updated Mar 10, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Md Abdullah Ibne Aziz Miraz (2025). PHQ-9 Student Depression Dataset [Dataset]. http://doi.org/10.17632/kkzjk253cy.3
    Explore at:
    Dataset updated
    Mar 10, 2025
    Authors
    Md Abdullah Ibne Aziz Miraz
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PHQ-9 Student Depression Dataset contains responses from 500 students to the PHQ-9 questionnaire, a well-established tool for diagnosing depression. This dataset is designed to support the development of machine learning models aimed at automated depression detection by analyzing text responses to common depression-related questions.

    The PHQ-9 questionnaire includes 9 questions that assess symptoms of depression over the past two weeks, covering areas like mood, energy levels, sleep, appetite, and thoughts of self-harm. The responses are scored on a scale from 0 (Not at all) to 3 (Nearly every day), with the total score ranging from 0 to 27. Based on this score, the depression severity is classified into one of the following categories: Minimal (0-4) Mild (5-9) Moderate (10-14) Moderately Severe (15-19) Severe (20-27)

    This dataset is primarily designed for building models that can assist in automated depression detection. Some potential use cases include: Sentiment Analysis: Analyzing emotional tones in text responses to assess depression. Text Classification: Classifying responses into different depression severity levels. Predictive Modeling: Predicting depression severity based on textual responses. Feature Engineering: Extracting linguistic features (e.g., sentiment, keywords) to predict depression. The dataset is diverse, with synthetic responses across different levels of depression, providing a versatile foundation for machine learning applications. While the dataset does not contain personally identifiable information (PII), real-world applications should follow ethical guidelines regarding privacy, consent, and mental health resources. When working with real data or applying this dataset in clinical research, it is essential to adhere to ethical standards, including:

    Data Privacy: Anonymizing personal information. Informed Consent: Ensuring participants give consent before data collection. Support Resources: Providing support for individuals who may exhibit serious mental health concerns.

    Applications: Clinical Research: This dataset is valuable for studying depression detection using natural language processing and machine learning techniques. AI in Healthcare: It can be used in the development of tools for automated mental health assessment. Education: Training students or professionals in recognizing depression symptoms and analyzing responses.

  3. f

    Sample Posts from the ADHD dataset.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Feb 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ahmed Akib Jawad Karim; Kazi Hafiz Md. Asad; Md. Golam Rabiul Alam (2025). Sample Posts from the ADHD dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0315829.t002
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Feb 6, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ahmed Akib Jawad Karim; Kazi Hafiz Md. Asad; Md. Golam Rabiul Alam
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This work focuses on the efficiency of the knowledge distillation approach in generating a lightweight yet powerful BERT-based model for natural language processing (NLP) applications. After the model creation, we applied the resulting model, LastBERT, to a real-world task—classifying severity levels of Attention Deficit Hyperactivity Disorder (ADHD)-related concerns from social media text data. Referring to LastBERT, a customized student BERT model, we significantly lowered model parameters from 110 million BERT base to 29 million-resulting in a model approximately 73.64% smaller. On the General Language Understanding Evaluation (GLUE) benchmark, comprising paraphrase identification, sentiment analysis, and text classification, the student model maintained strong performance across many tasks despite this reduction. The model was also used on a real-world ADHD dataset with an accuracy of 85%, F1 score of 85%, precision of 85%, and recall of 85%. When compared to DistilBERT (66 million parameters) and ClinicalBERT (110 million parameters), LastBERT demonstrated comparable performance, with DistilBERT slightly outperforming it at 87%, and ClinicalBERT achieving 86% across the same metrics. These findings highlight the LastBERT model’s capacity to classify degrees of ADHD severity properly, so it offers a useful tool for mental health professionals to assess and comprehend material produced by users on social networking platforms. The study emphasizes the possibilities of knowledge distillation to produce effective models fit for use in resource-limited conditions, hence advancing NLP and mental health diagnosis. Furthermore underlined by the considerable decrease in model size without appreciable performance loss is the lower computational resources needed for training and deployment, hence facilitating greater applicability. Especially using readily available computational tools like Google Colab and Kaggle Notebooks. This study shows the accessibility and usefulness of advanced NLP methods in pragmatic world applications.

  4. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Bhavik Jikadara (2024). Mental Health Dataset [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/mental-health-dataset
Organization logo

Mental Health Dataset

Exploring the Landscape of Mental Well-being: A Comprehensive Dataset Analysis

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 18, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Bhavik Jikadara
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset appears to contain a variety of features related to text analysis, sentiment analysis, and psychological indicators, likely derived from posts or text data. Some features include readability indices such as Automated Readability Index (ARI), Coleman Liau Index, and Flesch-Kincaid Grade Level, as well as sentiment analysis scores like sentiment compound, negative, neutral, and positive scores. Additionally, there are features related to psychological aspects such as economic stress, isolation, substance use, and domestic stress. The dataset seems to cover a wide range of linguistic, psychological, and behavioural attributes, potentially suitable for analyzing mental health-related topics in online communities or text data.

Benefits of using this dataset:

  • Insight into Mental Health: The dataset provides valuable insights into mental health by analyzing linguistic patterns, sentiment, and psychological indicators in text data. Researchers and data scientists can gain a better understanding of how mental health issues manifest in online communication.
  • Predictive Modeling: With a wide range of features, including sentiment analysis scores and psychological indicators, the dataset offers opportunities for developing predictive models to identify or predict mental health outcomes based on textual data. This can be useful for early intervention and support.
  • Community Engagement: Mental health is a topic of increasing importance, and this dataset can foster community engagement on platforms like Kaggle. Data enthusiasts, researchers, and mental health professionals can collaborate to analyze the data and develop solutions to address mental health challenges.
  • Data-driven Insights: By analyzing the dataset, users can uncover correlations and patterns between linguistic features, sentiment, and mental health indicators. These insights can inform interventions, policies, and support systems aimed at promoting mental well-being.
  • Educational Resource: The dataset can serve as a valuable educational resource for teaching and learning about mental health analytics, sentiment analysis, and text mining techniques. It provides a real-world dataset for students and practitioners to apply data science skills in a meaningful context.
Search
Clear search
Close search
Google apps
Main menu