12 datasets found
  1. Daicwoz

    • kaggle.com
    zip
    Updated Aug 12, 2024
    Cite
    Saif Zaman (2024). Daicwoz [Dataset]. https://www.kaggle.com/datasets/saifzaman123445/daicwoz
    Available download formats: zip (93620735718 bytes)
    Dataset updated
    Aug 12, 2024
    Authors
    Saif Zaman
    Description

    Dataset

    This dataset was created by Saif Zaman

    Contents

  2. Daic-Woz Transcripts

    • kaggle.com
    Updated May 28, 2025
    Cite
    wahyu indra wardana (2025). Daic-Woz Transcripts [Dataset]. https://www.kaggle.com/datasets/indrasz/daic-woz-transcripts
    Dataset updated
    May 28, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    wahyu indra wardana
    Description

    Dataset

    This dataset was created by wahyu indra wardana

    Contents

  3. Distribution of male and female participants across train, validation and...

    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    Cite
    Sandeep Kumar Pandey; Hanumant Singh Shekhawat; S. R. M. Prasanna; Shalendar Bhasin; Ravi Jasuja (2023). Distribution of male and female participants across train, validation and test partitions of the DAIC-WOZ depression dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0272659.t001
    Available download formats: xls
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Sandeep Kumar Pandey; Hanumant Singh Shekhawat; S. R. M. Prasanna; Shalendar Bhasin; Ravi Jasuja
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Distribution of male and female participants across train, validation and test partitions of the DAIC-WOZ depression dataset.

  4. Recognition accuracies in terms of Weighted Accuracy (WA) and Unweighted...

    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    Cite
    Sandeep Kumar Pandey; Hanumant Singh Shekhawat; S. R. M. Prasanna; Shalendar Bhasin; Ravi Jasuja (2023). Recognition accuracies in terms of Weighted Accuracy (WA) and Unweighted Accuracy (UA) and F1-scores for different tensor based techniques for test set of Daic-Woz dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0272659.t002
    Available download formats: xls
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Sandeep Kumar Pandey; Hanumant Singh Shekhawat; S. R. M. Prasanna; Shalendar Bhasin; Ravi Jasuja
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recognition accuracies in terms of Weighted Accuracy (WA), Unweighted Accuracy (UA), and F1-scores for different tensor-based techniques on the test set of the DAIC-WOZ dataset.

  5. Comparison with the state-of-the-art techniques on the test partition of...

    • plos.figshare.com
    xls
    Updated Jun 14, 2023
    Cite
    Sandeep Kumar Pandey; Hanumant Singh Shekhawat; S. R. M. Prasanna; Shalendar Bhasin; Ravi Jasuja (2023). Comparison with the state-of-the-art techniques on the test partition of DAIC-WOZ dataset in terms of Weighted Accuracy(WA), Unweighted Accuracy(UA) and F1-scores. [Dataset]. http://doi.org/10.1371/journal.pone.0272659.t003
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Sandeep Kumar Pandey; Hanumant Singh Shekhawat; S. R. M. Prasanna; Shalendar Bhasin; Ravi Jasuja
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison with state-of-the-art techniques on the test partition of the DAIC-WOZ dataset in terms of Weighted Accuracy (WA), Unweighted Accuracy (UA), and F1-scores.

  6. Confusion matrix.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). Confusion matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t004
    Available download formats: xls
    Dataset updated
    May 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context and background. Depression has affected millions of people worldwide and has become one of the most common mental disorders. Early mental disorder detection can reduce costs for public health agencies and prevent other major comorbidities. Additionally, the shortage of specialized personnel is very concerning since depression diagnosis is highly dependent on expert professionals and is time-consuming. Research problems. Recent research has evidenced that machine learning (ML) and natural language processing (NLP) tools and techniques have significantly benefited the diagnosis of depression. However, there are still several challenges in the assessment of depression detection approaches in which other conditions such as post-traumatic stress disorder (PTSD) are present. These challenges include assessing alternatives in terms of data cleaning and pre-processing techniques, feature selection, and appropriate ML classification algorithms. Purpose of the study. This paper tackles such an assessment based on a case study that compares different ML classifiers, specifically in terms of data cleaning and pre-processing, feature selection, parameter setting, and model choices. Methodology. The experimental case study is based on the Distress Analysis Interview Corpus - Wizard-of-Oz (DAIC-WOZ) dataset, which is designed to support the diagnosis of mental disorders such as depression, anxiety, and PTSD. Major findings. Besides the assessment of alternative techniques, we were able to build models with accuracy levels around 84% with Random Forest and XGBoost models, which is significantly higher than the results from the comparable literature which presented the level of accuracy of 72% from the SVM model. Conclusions. More comprehensive assessments of ML classification algorithms and NLP techniques for depression detection can advance the state of the art in terms of improved experimental settings and performance.

  7. XGBoost model details.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). XGBoost model details. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t006
    Available download formats: xls
    Dataset updated
    May 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  8. PHQ8_Score File.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). PHQ8_Score File. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t002
    Available download formats: xls
    Dataset updated
    May 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  9. SVM model details.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). SVM model details. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t009
    Available download formats: xls
    Dataset updated
    May 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  10. Random forest model details.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). Random forest model details. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t003
    Available download formats: xls
    Dataset updated
    May 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  11. Explanation of features.

    • plos.figshare.com
    xls
    Updated May 28, 2025
    Cite
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan (2025). Explanation of features. [Dataset]. http://doi.org/10.1371/journal.pone.0322299.t001
    Available download formats: xls
    Dataset updated
    May 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Giuliano Lorenzoni; Cristina Tavares; Nathalia Nascimento; Paulo Alencar; Donald Cowan
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  12. Table 1_Sex differences in PTSD speech biomarkers assessed by virtual...

    • frontiersin.figshare.com
    docx
    Updated Apr 28, 2025
    Cite
    Felix Menne; Louisa Schwed; Felix Dörr; Nicklas Linz; Johannes Tröger; Alexandra König (2025). Table 1_Sex differences in PTSD speech biomarkers assessed by virtual agent-induced conversations.docx [Dataset]. http://doi.org/10.3389/fpsyg.2025.1509206.s001
    Available download formats: docx
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    Frontiers
    Authors
    Felix Menne; Louisa Schwed; Felix Dörr; Nicklas Linz; Johannes Tröger; Alexandra König
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Women face a substantially elevated risk of developing PTSD compared to men. With the emergence of automated digital biomarkers for assessing complex psychiatric disorders, it becomes imperative to take into account possible sex differences. Objectives: Our objective was to explore sex-related speech differences in individuals with PTSD. Methods: We utilized data from the DAIC-WOZ dataset, consisting of dialogs between participants with PTSD (n = 31) and a virtual avatar. Throughout these dialogs, the avatar utilized diverse prompts to maintain a conversation. Features were extracted from the transcripts, and acoustic features were obtained from the recorded audio files. Group comparisons, correlations, and linear models were calculated to assess sex-related differences in these features between male and female individuals with PTSD. Results: Group comparisons yielded significant differences between male and female patients in acoustic features such as the F2 frequency Standard Deviation (higher in males) and Harmonics to Noise Ratio (lower in males). Correlations revealed that Loudness Standard Deviation was significantly associated with PCL-C scores in males, but not in females. Additionally, we found interaction effects for linguistic and temporal features such as verb phrase usage, adposition rate, mean utterance duration, and speech ratio, with males showing positive associations and females showing inverse associations. Conclusion: Sex-related variations in the expression of PTSD severity through speech suggest contrasting effects in acoustic and linguistic features. These results underscore the importance of considering sex-specific expressions of behavioral symptoms in developing digital speech biomarkers for diagnostic and monitoring purposes in PTSD.


Cite
Saif Zaman (2024). Daicwoz [Dataset]. https://www.kaggle.com/datasets/saifzaman123445/daicwoz

Daicwoz

25 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (93620735718 bytes)