DementiaBank is a medical-domain task. It contains recordings of 117 people diagnosed with Alzheimer's disease and 93 healthy people, each reading a description of an image; the task is to classify these two groups. This release contains only the audio part of the dataset, without the text features.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('dementiabank', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This study explores the effectiveness of Automatic Speech Recognition (ASR) in building end-to-end automatic speech diagnosis and prediction models. We used three publicly available ASR engines (Xunfei, Tencent, and Aliyun) and compared their classifiability on the ADReSS-IS2020 public dataset (https://dementia.talkbank.org/). The dataset is a balanced subset selected from the Pitt corpus in the DementiaBank database, with the effects of gender and age bias removed. Each provided feature file name is composed of the ASR engine name and the data collection category. Our feature data file covers 156 native English-speaking participants: 78 AD patients and 78 healthy individuals. The train/test split for classification was officially provided: the training set contains 108 participants and the test set contains 48 participants. The data columns contain the sex and label of each participant, followed by the names of the extracted acoustic and textual features. We used only the textual features in all experiments.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
% error and perplexity on AD-type dementia held-out test set (h = Hidden layer size; Bz = Batch size).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Performance comparison with the LPOCV AUC on the AD-type dementia dataset, N = 198.
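For readers unfamiliar with the metric named in this caption, the following is a minimal sketch of leave-pair-out cross-validation (LPOCV) AUC. It assumes the usual definition: for every (positive, negative) pair, the pair is held out, a model is fit on the remaining examples, and the estimate is the fraction of pairs in which the positive example receives the higher score (ties count as one half). The nearest-centroid scorer and the function names are illustrative assumptions, not the models compared in the source.

```python
from itertools import product


def centroid(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def score(x, pos_c, neg_c):
    """Higher when x is closer to the positive centroid (toy scorer, an assumption)."""
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos_c))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg_c))
    return d_neg - d_pos


def lpocv_auc(X, y):
    """Leave-pair-out AUC estimate; needs at least two examples per class."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    wins = 0.0
    for i, j in product(pos, neg):
        # Hold out one positive and one negative, fit on everything else.
        pos_c = centroid([X[k] for k in pos if k != i])
        neg_c = centroid([X[k] for k in neg if k != j])
        s_pos, s_neg = score(X[i], pos_c, neg_c), score(X[j], pos_c, neg_c)
        wins += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
    return wins / (len(pos) * len(neg))


# Toy, made-up data: one feature, two well-separated classes.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]
auc = lpocv_auc(X, y)
```

The model is refit once per pair, so the cost grows with the product of the two class sizes. That stays manageable at the sample sizes quoted in these captions.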
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Details of n-gram vocabularies from the MCI and AD-type dementia datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Performance comparison with the LPOCV AUC on the MCI dataset, N = 38.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
(h = Hidden layer size; Bz = Batch size).
Dataset Card for Conversational AI Model
Datasets with multi-language support and dementia targets, covering training, test, and validation data for day-to-day tasks.
Source Data
Dementia Bank
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw transformed data. These files contain the transformed linguistic features from the DementiaBank dataset and appear in the Comma Separated Values file format. (ZIP 18.4 kb)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Animal fluency is a widely used task to assess people with Alzheimer’s disease (AD) and other neurological disorders. The mechanisms that drive performance in this task are argued to rely on language and executive functions. However, there is little information regarding what specific aspects of these cognitive processes drive performance on this task.
Objective: To understand which aspects of language (i.e., semantics, phonological output lexicon, phonological assembly) and executive function (i.e., mental set shifting; information updating and monitoring; inhibition of possible responses) are involved in the performance of animal fluency in people with AD.
Methods: Animal fluency data from 58 people with probable AD from the DementiaBank Pittsburgh Corpus were analyzed. Number of clusters and switches were measured and nine word properties (e.g., frequency, familiarity) for each of the correct words (i.e., each word counting toward the total score, disregarding non-animals and repetitions) were determined. Random forests were used to understand which variables predicted the total number of correct words, and conditional inference trees were used to search for interactions between the variables. Finally, Wilcoxon tests were implemented to cross-validate the results, by comparing the performance of participants with scores below the norm in animal fluency against participants with scores within the norm based on a large normative sample.
Results: Switches and age of acquisition emerged as the most important variables to predict total number of correct words in animal fluency in people with AD. Cross-validating the results, people with AD whose animal fluency scores fell below the norm produced fewer switches and words with lower age of acquisition than people with AD with scores in the normal range.
Conclusion: The results indicate that people with AD rely on executive functioning (information updating and monitoring) and language (phonological output lexicon, not necessarily semantics) to produce words on animal fluency.
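The abstract above does not spell out how clusters and switches were scored, so the following is only a minimal sketch of one common simplification: a cluster is a run of two or more consecutive words from the same semantic subcategory, and a switch is a transition between consecutive runs. The function name, the subcategory mapping, and the word list are hypothetical and not taken from the study.

```python
def clusters_and_switches(words, subcategory_of):
    """Count clusters and switches in an ordered list of fluency responses.

    Simplifying assumptions (not from the source): a run is a maximal sequence
    of consecutive words sharing a known subcategory; words with no known
    subcategory always form their own run.
    """
    runs = []
    prev = None
    for word in words:
        cat = subcategory_of.get(word)
        if runs and cat is not None and cat == prev:
            runs[-1].append(word)   # same subcategory: extend the current run
        else:
            runs.append([word])     # new subcategory: start a new run
        prev = cat
    clusters = sum(1 for run in runs if len(run) >= 2)
    switches = max(len(runs) - 1, 0)
    return clusters, switches


# Hypothetical subcategories and responses, for illustration only.
subcategory_of = {
    "dog": "pets", "cat": "pets",
    "lion": "wild", "tiger": "wild",
    "cow": "farm",
}
result = clusters_and_switches(["dog", "cat", "lion", "tiger", "cow"], subcategory_of)
```

In this toy list the responses form three runs (pets, wild, farm), giving two clusters and two switches. Published scoring schemes differ in details such as minimum cluster size and how repetitions are handled, so counts from this sketch are not comparable to the study's.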