8 datasets found
  1. Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) Dataset

    • paperswithcode.com
    Updated Dec 28, 2022
    Cite
    (2022). Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) Dataset [Dataset]. https://paperswithcode.com/dataset/distress-analysis-interview-corpus-wizard-of
    Description

    The Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) dataset [24, 25] comprises voice and text samples from 189 interviewed healthy and control persons, together with their responses to the PHQ-8 depression questionnaire. The dataset is commonly used in research on text-based detection, voice-based detection, and multi-modal architectures.

  2. Table 1_Sentence-level multi-modal feature learning for depression...

    • frontiersin.figshare.com
    docx
    Updated Mar 21, 2025
    Cite
    Guanghua Zhang; Guangping Zhuo; Yang Yang; Guohua Xu; Shukui Ma; Hao Liu; Zhiyong Ren (2025). Table 1_Sentence-level multi-modal feature learning for depression recognition.docx [Dataset]. http://doi.org/10.3389/fpsyt.2025.1439577.s001
    Dataset provided by
    Frontiers
    Authors
    Guanghua Zhang; Guangping Zhuo; Yang Yang; Guohua Xu; Shukui Ma; Hao Liu; Zhiyong Ren
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The global prevalence of depression has escalated, exacerbated by societal and economic pressures. Current diagnostic methodologies predominantly use single-modality data, and the multi-modal strategies that do exist often fail to harness the distinct contribution of each modality to depression detection. Methods: This study collected multi-modal features from 100 participants (67 depressed patients and 33 non-depressed individuals) to construct the MMD2023 dataset, and introduces the Sentence-level Multi-modal Feature Learning (SMFL) approach, an automated system designed to enhance depression recognition. SMFL analyzes synchronized sentence-level segments of facial expressions, vocal features, and transcribed texts obtained from patient-doctor interactions. It incorporates Temporal Convolutional Networks (TCN) and Long Short-Term Memory (LSTM) networks to extract features from each modality, aligned with the structured temporal flow of dialogues. Additionally, the novel Cross-Modal Joint Attention (CMJAT) mechanism is developed to reconcile variances in feature representation across modalities, adjusting the influence of each modality and amplifying weaker signals to match more pronounced features. Results: Validated on the collected MMD2023 dataset and on the publicly available DAIC-WOZ dataset containing 192 patients, SMFL achieves accuracies of 91% and 89% respectively, demonstrating superior performance in binary depression classification. The approach not only achieves higher precision in identifying depression but also ensures a balanced and unified multi-modal feature representation. Conclusion: The SMFL methodology represents a significant advancement in the diagnostic processes of depression, promising a cost-effective, private, and accessible diagnostic tool that aligns with the PHQ-8 clinical standard. By broadening the accessibility of mental health resources, this methodology has the potential to revolutionize the landscape of psychiatric evaluation, augmenting the precision of depression identification and enhancing the overall mental health management infrastructure.

  3. Confusion matrix.

    • figshare.com
    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). Confusion matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t004
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context and background. Depression has affected millions of people worldwide and has become one of the most common mental disorders. Early mental disorder detection can reduce costs for public health agencies and prevent other major comorbidities. Additionally, the shortage of specialized personnel is very concerning since depression diagnosis is highly dependent on expert professionals and is time-consuming. Research problems. Recent research has evidenced that machine learning (ML) and natural language processing (NLP) tools and techniques have significantly benefited the diagnosis of depression. However, there are still several challenges in the assessment of depression detection approaches in which other conditions such as post-traumatic stress disorder (PTSD) are present. These challenges include assessing alternatives in terms of data cleaning and pre-processing techniques, feature selection, and appropriate ML classification algorithms. Purpose of the study. This paper tackles such an assessment based on a case study that compares different ML classifiers, specifically in terms of data cleaning and pre-processing, feature selection, parameter setting, and model choices. Methodology. The experimental case study is based on the Distress Analysis Interview Corpus - Wizard-of-Oz (DAIC-WOZ) dataset, which is designed to support the diagnosis of mental disorders such as depression, anxiety, and PTSD. Major findings. Besides the assessment of alternative techniques, we were able to build models with accuracy levels around 84% with Random Forest and XGBoost models, which is significantly higher than the results from the comparable literature which presented the level of accuracy of 72% from the SVM model. Conclusions. More comprehensive assessments of ML classification algorithms and NLP techniques for depression detection can advance the state of the art in terms of improved experimental settings and performance.

  4. Explanation of features.

    • plos.figshare.com
    • figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). Explanation of features. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t001
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  5. PHQ8_Score File.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). PHQ8_Score File. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t002
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  6. Random forest model details.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). Random forest model details. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t003
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  7. XGBoost model details.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). XGBoost model details. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t006
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  8. SVM model details.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). SVM model details. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t009
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) Dataset

6 scholarly articles cite this dataset.