BanglaSER: Bangla Audio for Emotion Recognition
kaggle.com
Updated Aug 27, 2024
Cite
Evil Spirit05 (2024). BanglaSER: Bangla Audio for Emotion Recognition [Dataset]. https://www.kaggle.com/datasets/evilspirit05/emotion
Dataset provided by
Kaggle
Authors
Evil Spirit05
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
BanglaSER is a specialized dataset designed for the task of Bangla speech emotion recognition. This dataset includes a rich collection of speech-audio recordings that capture a variety of fundamental human emotions. It is curated to support research and development in the field of speech emotion recognition, particularly for the Bangla language, and is suitable for various deep learning architectures.

Dataset Composition:

Total Number of Recordings: 1,467

Number of Speakers: 34 (17 male and 17 female)

Age Range of Speakers: 19 to 47 years

Recording Devices: Smartphones and laptops

Emotional States Covered:

Angry

Happy

Neutral

Sad

Surprise

Recording Structure:

Each emotional state is represented by:

3 statements, each spoken 3 times by each participant.

For Angry, Happy, Sad, and Surprise: 3 statements × 3 repetitions × 34 speakers × 4 emotions = 1,224 recordings (306 per emotion).

For Neutral: 3 statements × 3 repetitions × 27 speakers = 243 recordings.
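The counts in the recording structure can be re-derived with a few lines of Python. This is only an arithmetic check of the figures quoted in the description; the variable names are illustrative and not part of the dataset.

```python
# Recording counts as stated in the dataset description.
STATEMENTS = 3
REPETITIONS = 3

# Number of speakers behind each emotion, per the description:
# 27 for Neutral, 34 for every other emotion.
speakers_per_emotion = {
    "Angry": 34,
    "Happy": 34,
    "Neutral": 27,
    "Sad": 34,
    "Surprise": 34,
}

# statements x repetitions x speakers, computed per emotion
recordings_per_emotion = {
    emotion: STATEMENTS * REPETITIONS * speakers
    for emotion, speakers in speakers_per_emotion.items()
}
total_recordings = sum(recordings_per_emotion.values())

for emotion, count in recordings_per_emotion.items():
    print(f"{emotion}: {count}")
print(f"Total: {total_recordings}")
```

The per-emotion figures sum to the stated total of 1,467 recordings.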

Key Features:

Balanced Representation:

The dataset is carefully balanced with an equal number of male and female participants, ensuring that the recordings reflect diverse voices and emotional expressions.

The four non-neutral emotions are evenly represented (306 recordings each), while Neutral has slightly fewer (243), providing a near-balanced basis for training and evaluating emotion recognition models.
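Going by the per-emotion counts implied by the recording structure (306 each for Angry, Happy, Sad, and Surprise; 243 for Neutral), the Neutral class is somewhat smaller than the others. A minimal sketch of one common way to compensate during training is inverse-frequency class weighting. The weighting scheme here is an assumption for illustration, not something the dataset description prescribes.

```python
# Per-emotion recording counts derived from the dataset description.
counts = {
    "Angry": 306,
    "Happy": 306,
    "Neutral": 243,
    "Sad": 306,
    "Surprise": 306,
}

total = sum(counts.values())
n_classes = len(counts)

# Inverse-frequency weights (an assumed, commonly used scheme):
# weight = total / (n_classes * class_count), so smaller classes weigh more.
class_weights = {
    emotion: total / (n_classes * count)
    for emotion, count in counts.items()
}

for emotion, weight in class_weights.items():
    print(f"{emotion}: {weight:.3f}")
```

With these weights, each class contributes equally to a weighted loss in aggregate; whether that helps on this dataset would need to be checked empirically.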

Realistic Recording Conditions:

Recordings were made with commonly available devices, such as smartphones and laptops, which helps preserve the natural quality of the audio.

The dataset reflects real-life acoustic environments, making it more applicable to real-world applications.

Deep Learning Compatibility:

BanglaSER is designed to be compatible with various deep learning architectures, including Convolutional Neural Networks (CNNs), Long Short-Term Memory Networks (LSTMs), and Bidirectional LSTMs (BiLSTMs).

The dataset can be used for emotion classification and for related tasks such as sentiment analysis.

Usage and Applications:

Emotion Recognition Models: BanglaSER provides a diverse set of recordings that are ideal for training models to recognize and classify emotions in Bangla speech.

Benchmarking and Evaluation: The dataset serves as a benchmark for evaluating the performance of emotion recognition systems and can help in comparing different model architectures and techniques.

Research and Development: Researchers can use BanglaSER to explore new methods in speech emotion recognition, develop novel algorithms, and enhance the understanding of emotion in speech.
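For the benchmarking use mentioned above, one widely used practice (not stated in the dataset description) is a speaker-independent split, in which no speaker appears in both the training and test sets. The sketch below assumes hypothetical speaker IDs 1 to 34; the function name, the 20% test fraction, and the seed are all illustrative choices.

```python
import random


def split_speakers(speaker_ids, test_fraction=0.2, seed=0):
    """Split speaker IDs into disjoint train and test groups.

    Hypothetical helper: shuffles the IDs with a fixed seed and holds
    out roughly `test_fraction` of the speakers for testing.
    """
    rng = random.Random(seed)
    shuffled = list(speaker_ids)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    test = sorted(shuffled[:n_test])
    train = sorted(shuffled[n_test:])
    return train, test


# Assumed IDs 1..34 for the 34 speakers; the real files may label speakers differently.
speaker_ids = range(1, 35)
train_speakers, test_speakers = split_speakers(speaker_ids)

print("Train speakers:", train_speakers)
print("Test speakers:", test_speakers)
```

Recordings would then be assigned to a set according to their speaker, so test accuracy reflects generalization to unseen voices rather than memorized ones.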

Dataset Access:

Download Link: https://data.mendeley.com/datasets/t9h6p943xy/5

Documentation: Detailed documentation and guidelines for using the dataset are provided to assist users in effectively leveraging the data.

Acknowledgments:

We extend our gratitude to the contributors and participants who made this dataset possible. Their efforts have greatly enriched the field of speech emotion recognition and provided valuable resources for the community. Feel free to explore the dataset and use it in your research and projects. We look forward to seeing the innovative applications and advancements that will emerge from the use of BanglaSER.