AbstractTTS/IEMOCAP dataset hosted on Hugging Face and contributed by the HF Datasets community
Processed IEMOCAP dataset released by Dai, W., Zheng, D., Yu, F., Zhang, Y., & Hou, Y. (2025, February 12). A Novel Approach for Multimodal Emotion Recognition: Multimodal Semantic Information Fusion. arXiv. https://arxiv.org/abs/2502.08573
Dataset Card for "IEMOCAP_Text"
This dataset was derived from the IEMOCAP dataset; for more information, see the IEMOCAP webpage. It contains the five most common classes: angry, happy, excited, neutral, and sad. Following common practice in the literature, the excited and happy classes are merged. The dataset contains 5,531 utterances and is split by session.
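For illustration, here is a minimal sketch of loading these session-based splits with the datasets library and merging the excited class into happy. The repo id and the label column name are placeholders, not taken from the card; check the actual dataset card for the real ones.

```python
# Minimal sketch: load the card's splits and merge "excited" into "happy".
# "<user>/IEMOCAP_Text" and the "emotion" column are hypothetical placeholders.
from datasets import load_dataset

ds = load_dataset("<user>/IEMOCAP_Text")

def merge_excited(example):
    # Collapse excitement into happiness, as the card describes.
    if example["emotion"] == "excited":
        example["emotion"] = "happy"
    return example

ds = ds.map(merge_excited)
print(ds)
```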
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of previous research on video-based emotion recognition using IEMOCAP database.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Test splits for the categorical emotion datasets CREMA-D, emoDB, IEMOCAP, MELD, RAVDESS used inside audEERING.
For each dataset, a CSV file is provided listing the file names included in the test split.
The test splits were designed to balance gender and emotional categories as well as possible.
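For illustration, a minimal sketch of reading one of these split CSVs with pandas; the file name and column name are assumptions, since the description only says that each CSV lists the file names in the test split.

```python
# Minimal sketch: restrict evaluation to a published test split.
# "iemocap_test_split.csv" and the "file" column are assumed names.
import pandas as pd

split = pd.read_csv("iemocap_test_split.csv")
test_files = set(split["file"])
print(f"{len(test_files)} files in the IEMOCAP test split")
```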
https://sail.usc.edu/iemocap/Data_Release_Form_IEMOCAP.pdf
The Interactive Emotional Dyadic Motion Capture (IEMOCAP) database is an acted, multimodal, and multispeaker database collected at the SAIL lab at USC. It contains approximately 12 hours of audiovisual data, including video, speech, motion capture of the face, and text transcriptions. It consists of dyadic sessions in which actors perform improvisations or scripted scenarios, specifically selected to elicit emotional expressions. The IEMOCAP database is annotated by multiple annotators into categorical labels, such as anger, happiness, sadness, and neutrality, as well as dimensional labels such as valence, activation, and dominance. The detailed motion-capture information, the interactive setting used to elicit authentic emotions, and the size of the database make this corpus a valuable addition to the existing databases in the community for the study and modeling of multimodal and expressive human communication.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistics of the MELD, EmoryNLP, DailyDialog, and IEMOCAP datasets.
This dataset was created by Rumaiya
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Training and test instances for the IEMOCAP corpus.
This dataset was created by AyaOsama21
Audio statistics.
mteb/iemocap dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recalls for speech emotion recognition using IEMOCAP and a DNN.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Precision of speech emotion recognition using IEMOCAP and a CNN.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
F1-scores for speech emotion recognition using a common model set and a CNN.
cairocode/IEMOCAP dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Confusion matrix [%] using IEMOCAP and a DNN with MFCC/SDC features.
WiktorJakubowski/IEMOCAP dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Bảo Xuyên Nguyễn Lê
Released under Apache 2.0
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
VAD Scoring: Voice Activity Detection and emotion dimensionality (Valence, Arousal, Dominance) computed using the audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim model; see the sketch after the metadata example below.
Metadata: Includes audio duration, transcript length, character counts, and VAD scores
An example of an entry in the metadata file is as follows:

{
  "file_id": "emovdb_amused_1-15_0001_1933",
  "original_path": "..\\data_collection\\tts_data\\processed\\emovdb\\amused_1-15_0001.wav",
  "dataset": "emovdb",
  "status": "success",
  "error": null,
  "processed_audio_path": "None",
  "transcript_path": "processed_datasets\\transcripts\\emovdb_amused_1-15_0001_1933.json",
  "vad_path": "processed_datasets\\vad_scores\\emovdb_amused_1-15_0001_1933.json",
  "text": "Author of the Danger Trail, Phillips Deals, etc.",
  "language": "en",
  "audio_duration": 4.384671201814059,
  "text_length": 48,
  "valence": 0.7305971384048462,
  "arousal": 0.704948365688324,
  "dominance": 0.6887099146842957,
  "vad_confidence": 0.9830486676764241
}
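The valence/arousal/dominance fields above come from the audeering model named in the VAD Scoring note. Below is a minimal sketch of scoring one 16 kHz waveform with that model, adapted from the wrapper pattern shown on its Hugging Face model card; the custom EmotionModel class and the output order (arousal, dominance, valence) follow that card and should be verified against it.

```python
# Minimal sketch: dimensional emotion scoring with
# audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim.
import numpy as np
import torch
import torch.nn as nn
from transformers import Wav2Vec2Processor
from transformers.models.wav2vec2.modeling_wav2vec2 import (
    Wav2Vec2Model,
    Wav2Vec2PreTrainedModel,
)

class RegressionHead(nn.Module):
    """Feed-forward head that regresses pooled features to three dimensions."""
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.dropout = nn.Dropout(config.final_dropout)
        self.out_proj = nn.Linear(config.hidden_size, config.num_labels)

    def forward(self, features):
        x = self.dropout(features)
        x = torch.tanh(self.dense(x))
        return self.out_proj(self.dropout(x))

class EmotionModel(Wav2Vec2PreTrainedModel):
    """wav2vec 2.0 encoder, mean-pooled over time, plus the regression head."""
    def __init__(self, config):
        super().__init__(config)
        self.wav2vec2 = Wav2Vec2Model(config)
        self.classifier = RegressionHead(config)
        self.init_weights()

    def forward(self, input_values):
        hidden_states = self.wav2vec2(input_values)[0]  # (batch, time, dim)
        pooled = hidden_states.mean(dim=1)              # (batch, dim)
        return self.classifier(pooled)                  # (batch, 3)

name = "audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim"
processor = Wav2Vec2Processor.from_pretrained(name)
model = EmotionModel.from_pretrained(name).eval()

signal = np.zeros(16000, dtype=np.float32)  # stand-in: 1 s of silence at 16 kHz
inputs = processor(signal, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    arousal, dominance, valence = model(inputs.input_values)[0].tolist()
print(f"A={arousal:.3f} D={dominance:.3f} V={valence:.3f}")
```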
This work is licensed under CC BY-NC-SA 4.0.
Required Citations: - CREMA-D: Cao, H., Cooper, D. G., Keutmann, M. K., Gur, R. C., Nenkova, A., & Verma, R. (2014). CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset. IEEE Transactions on Affective Computing, 5(4), 377-390.
EmoV-DB: Adigwe, A., Tits, N., Haddad, K. E., Ostadabbas, S., & Dutoit, T. (2018). The emotional voices database: Towards controlling the emotion dimension in voice generation systems. arXiv preprint arXiv:1806.09514.
IEMOCAP: Busso, C., Bulut, M., Lee, C. C., Kazemzadeh, A., Mower, E., Kim, S., Chang, J. N., Lee, S., & Narayanan, S. S. (2008). IEMOCAP: Interactive emotional dyadic motion capture database. Language Resources and Evaluation, 42(4), 335-359.
RAVDESS: Livingstone, S. R., & Russo, F. A. (2018). The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE, 13(5), e0196391.