MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The Prasanna18/HAS-Corpus dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
WiserBrand offers a unique dataset of real consumer-to-business phone conversations. These high-quality audio recordings capture authentic interactions between consumers and support agents across industries. Unlike synthetic data or scripted samples, our dataset reflects natural speech patterns, emotion, intent, and real-world phrasing — making it ideal for:
Training ASR (Automatic Speech Recognition) systems
Improving voice assistants and LLM audio understanding
Enhancing call center AI tools (e.g., sentiment analysis, intent detection)
Benchmarking conversational AI performance with real-world noise and context
We ensure strict data privacy: all personally identifiable information (PII) is removed before delivery. Recordings are produced on demand and can be tailored by vertical (e.g., telecom, finance, e-commerce) or use case.
Whether you're building next-gen voice technology or need realistic conversational datasets to test models, this dataset provides what synthetic corpora lack — realism, variation, and authenticity.
General: The purpose of the Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop (MuSe) is to bring together communities from different disciplines, mainly the audio-visual emotion recognition community (signal-based) and the sentiment analysis community (symbol-based).
We introduce the novel dataset MuSe-CAR, which covers the range of aforementioned desiderata. MuSe-CAR is a large (>36 h) multimodal dataset gathered in-the-wild with the intention of furthering the understanding of multimodal sentiment analysis in-the-wild, e.g., the emotional engagement that takes place during product reviews (here, automobile reviews), where a sentiment is linked to a topic or entity.
We have designed MuSe-CAR to be of high audio and video quality, as both informative social-media video content and everyday recording devices have improved in recent years. This enables robust learning even with a high degree of novel, in-the-wild characteristics, for example: i) Video: shot size (a mix of close-up, medium, and long shots), face angle (side, eye, low, high), camera motion (free; free but stable; free but unstable; switching, e.g., zoom; fixed), reviewer visibility (full body, half body, face only, and hands only), highly varying backgrounds, and people interacting with objects (car parts). ii) Audio: ambient noises (car noises, music), narrator and host diarisation, diverse microphone types, and speaker locations. iii) Text: colloquialisms and domain-specific terms.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Speech Emotion Recognition
Dataset comprises 30,000+ audio recordings featuring 4 distinct emotions: euphoria, joy, sadness, and surprise. This extensive collection is designed for research in emotion recognition, focusing on the nuances of emotional speech and the subtleties of speech signals as individuals vocally express their feelings. By utilizing this dataset, researchers and developers can enhance their understanding of sentiment analysis and improve automatic speech… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/speech-emotion-recognition.
https://creativecommons.org/publicdomain/zero/1.0/
Twitter is one of those social media platforms where people are free to share their opinions on any topic. Sometimes a strong discussion on Twitter about someone's opinion results in a collection of negative tweets.
This dataset is intended for sentiment analysis on tweets; it contains the tweet messages.
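As a minimal illustration of how such a tweet corpus might be used, the sketch below trains a simple bag-of-words sentiment classifier with scikit-learn; the file name and the text/label column names are hypothetical and would need to match the actual dataset.

```python
# Minimal tweet sentiment-classification sketch (illustrative only).
# The file name and the "text"/"label" column names are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("tweets.csv")  # hypothetical path
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```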
https://www.datainsightsmarket.com/privacy-policy
The speech and voice analytics market is experiencing robust growth, driven by the increasing adoption of AI-powered solutions across various sectors. The market's expansion is fueled by several key factors: the rising need for improved customer experience through sentiment analysis and call center optimization, the proliferation of voice-enabled devices and virtual assistants generating vast amounts of voice data, and the growing demand for enhanced security and fraud detection through voice authentication and anomaly detection. Businesses are increasingly leveraging speech and voice analytics to gain valuable insights from customer interactions, optimize operational efficiency, and enhance decision-making processes. This has led to significant investments in research and development, resulting in advanced analytical capabilities and improved accuracy. The market is segmented by application (large enterprises and SMEs) and type (real-time and non-real-time), with large enterprises currently dominating due to their higher budgets and sophisticated analytical needs. However, the SME segment is expected to witness significant growth in the coming years due to the increasing affordability and accessibility of cloud-based solutions. Geographic distribution shows a strong presence in North America and Europe, but significant growth opportunities exist in the Asia-Pacific region due to its burgeoning tech sector and expanding digital economy.

Competition in the speech and voice analytics market is intense, with a mix of established players like Nuance Communications, Verint Systems, and Nice Systems, alongside rapidly growing technology companies such as Speechmatics, and cloud giants like Google, Amazon, and Microsoft offering integrated speech-to-text capabilities. The market is characterized by continuous innovation, with new features and functionalities emerging regularly. These include advancements in natural language processing (NLP), machine learning (ML), and deep learning algorithms that enhance the accuracy and efficiency of speech and voice analytics. Challenges remain, however, including data privacy concerns, the need for robust data security measures, and the ongoing effort to address language diversity and dialectal variations for broader applicability. Despite these challenges, the market is projected to maintain a strong growth trajectory throughout the forecast period (2025-2033), driven by continuous technological advancements and expanding adoption across diverse industries. The integration of speech and voice analytics with other emerging technologies like the Internet of Things (IoT) and big data analytics will further propel market growth.
MuSe-Wild of MuSe2020: Predicting the level of emotional dimensions (arousal, valence) in a time-continuous manner from audio-visual recordings. This package includes only the MuSe-Wild features (all partitions) and the annotations of the training and development sets (test scoring is done via the MuSe website).
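For orientation, time-continuous arousal/valence predictions in MuSe-style challenges are commonly scored with the Concordance Correlation Coefficient (CCC); the snippet below is a minimal NumPy sketch of that metric, not the challenge's official scoring code.

```python
# Minimal sketch of the Concordance Correlation Coefficient (CCC),
# a common metric for time-continuous arousal/valence prediction.
# Illustrative reimplementation, not the official MuSe scoring code.
import numpy as np

def ccc(y_true, y_pred):
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

t = np.linspace(-1, 1, 100)
print(ccc(t, t))        # perfect agreement: 1.0
print(ccc(t, 0.5 * t))  # attenuated predictions are penalised: < 1.0
```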
General: The purpose of the Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop (MuSe) is to bring together communities from different disciplines, mainly the audio-visual emotion recognition community (signal-based) and the sentiment analysis community (symbol-based).
We introduce the novel dataset MuSe-CAR, which covers the range of aforementioned desiderata. MuSe-CAR is a large (>36 h) multimodal dataset gathered in-the-wild with the intention of furthering the understanding of multimodal sentiment analysis in-the-wild, e.g., the emotional engagement that takes place during product reviews (here, automobile reviews), where a sentiment is linked to a topic or entity.
We have designed MuSe-CAR to be of high audio and video quality, as both informative social-media video content and everyday recording devices have improved in recent years. This enables robust learning even with a high degree of novel, in-the-wild characteristics, for example: i) Video: shot size (a mix of close-up, medium, and long shots), face angle (side, eye, low, high), camera motion (free; free but stable; free but unstable; switching, e.g., zoom; fixed), reviewer visibility (full body, half body, face only, and hands only), highly varying backgrounds, and people interacting with objects (car parts). ii) Audio: ambient noises (car noises, music), narrator and host diarisation, diverse microphone types, and speaker locations. iii) Text: colloquialisms and domain-specific terms.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MuSe (Music Sentiment) dataset contains sentiment information for 90,408 songs. We computed scores for the affective dimensions of valence, dominance and arousal, based on the user-generated tags that are available for each song via Last.fm. In addition, we provide artist and title metadata as well as a Spotify ID and a MusicBrainz ID, which allow researchers to extend the dataset with further metadata, such as genre or year.
Though the tags themselves cannot be included in the dataset, we include a Jupyter notebook in our accompanying GitHub repository that demonstrates how to fetch the tags of a given song from the Last.fm API (Last.fm_API.ipynb).
We further include a Jupyter notebook in the same repository that demonstrates how one might enrich the dataset with audio features from different endpoints of the Spotify API, using the included Spotify IDs (spotify_API.ipynb). Please note that in its current form, the dataset only contains tentative Spotify IDs for a subset (around 68%) of the songs.
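For readers without access to the notebook, here is a hedged sketch of what fetching a song's tags from the Last.fm API can look like; the repository's Last.fm_API.ipynb may differ in detail, and an API key from https://www.last.fm/api is required.

```python
# Fetch a track's top tags from the Last.fm API (illustrative sketch;
# the repository's Last.fm_API.ipynb notebook may differ in detail).
import requests

LASTFM_API_KEY = "YOUR_API_KEY"  # placeholder; obtain from https://www.last.fm/api

def get_top_tags(artist, track):
    resp = requests.get(
        "http://ws.audioscrobbler.com/2.0/",
        params={
            "method": "track.getTopTags",
            "artist": artist,
            "track": track,
            "api_key": LASTFM_API_KEY,
            "format": "json",
        },
        timeout=10,
    )
    resp.raise_for_status()
    tags = resp.json().get("toptags", {}).get("tag", [])
    return [(t["name"], int(t["count"])) for t in tags]

print(get_top_tags("Radiohead", "Creep")[:5])
```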
https://www.marketresearchforecast.com/privacy-policy
The global speech and voice analytics technology market, valued at $1420 million in 2025, is projected to experience robust growth, driven by the increasing adoption of AI-powered solutions across diverse sectors. A compound annual growth rate (CAGR) of 4.4% from 2025 to 2033 signifies a steady expansion, fueled by several key factors. The rising need for efficient customer service, improved operational efficiency, and enhanced security across industries like BFSI (Banking, Financial Services, and Insurance), IT and Telecom, and Healthcare is significantly boosting market demand. Furthermore, advancements in natural language processing (NLP) and machine learning (ML) are enabling more sophisticated analytics, leading to better insights from voice and speech data. The market is segmented by technology (Speech Engine, Indexing and Query Tools, Reporting and Visualization Tools, Quality Management, Root Cause Analysis, Others) and application (BFSI, IT and Telecom, Media and Entertainment, Healthcare and Life Sciences, Retail and E-commerce, Travel Industry, Government and Defense, Others), reflecting the widespread applicability of this technology. The presence of established players like Google, Amazon, and Nuance, alongside emerging innovative companies, indicates a competitive and dynamic market landscape.

The market's growth trajectory is expected to be influenced by factors such as increasing data volumes, the rising adoption of cloud-based solutions, and the growing need for real-time analytics. However, challenges such as data privacy concerns, the complexity of implementing these systems, and the high initial investment costs could potentially restrain market growth to some extent. Nevertheless, the overall outlook remains positive, with a significant potential for growth driven by the continued technological advancements and the expanding adoption across diverse industry verticals. Regional analysis suggests strong growth across North America and Europe initially, with emerging markets in Asia-Pacific expected to contribute significantly in the later forecast period. The competitive landscape is characterized by both established players and innovative startups, leading to continuous product innovation and market consolidation.
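As a quick arithmetic check on those headline figures, compounding the stated $1,420 million base for the eight years from 2025 to 2033 at 4.4% per year implies a 2033 market size of roughly $2.0 billion:

```python
# Project the reported 2025 base forward at the reported CAGR.
base_2025 = 1420.0   # USD millions, as stated above
cagr = 0.044         # 4.4% per year
years = 2033 - 2025
print(base_2025 * (1 + cagr) ** years)  # ~2004 (USD millions)
```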
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please cite this paper when using this dataset: N. Thakur, "Mpox narrative on Instagram: A labeled multilingual dataset of Instagram posts on mpox for sentiment, hate speech, and anxiety analysis," arXiv [cs.LG], 2024, URL: https://arxiv.org/abs/2409.05292

Abstract: The world is currently experiencing an outbreak of mpox, which has been declared a Public Health Emergency of International Concern by WHO. During recent virus outbreaks, social media platforms have played a crucial role in keeping the global population informed and updated regarding various aspects of the outbreaks. As a result, in the last few years, researchers from different disciplines have focused on the development of social media datasets for different virus outbreaks. No prior work in this field has focused on the development of a dataset of Instagram posts about the mpox outbreak. The work presented in this paper (cited above) aims to address this research gap. It presents a multilingual dataset of 60,127 Instagram posts about mpox, published in 52 languages between July 23, 2022, and September 5, 2024. For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset.

After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were also performed. This process included classifying each post into:
one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral
hate or not hate
anxiety/stress detected or no anxiety/stress detected
These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for sentiment, hate speech, and anxiety or stress detection, as well as for other applications.

The 52 distinct languages in which Instagram posts are present in the dataset are English, Portuguese, Indonesian, Spanish, Korean, French, Hindi, Finnish, Turkish, Italian, German, Tamil, Urdu, Thai, Arabic, Persian, Tagalog, Dutch, Catalan, Bengali, Marathi, Malayalam, Swahili, Afrikaans, Panjabi, Gujarati, Somali, Lithuanian, Norwegian, Estonian, Swedish, Telugu, Russian, Danish, Slovak, Japanese, Kannada, Polish, Vietnamese, Hebrew, Romanian, Nepali, Czech, Modern Greek, Albanian, Croatian, Slovenian, Bulgarian, Ukrainian, Welsh, Hungarian, and Latvian.

The following is a description of the attributes present in this dataset:
Post ID: Unique ID of each Instagram post
Post Description: Complete description of each post in the language in which it was originally published
Date: Date of publication in MM/DD/YYYY format
Language: Language of the post as detected using the Google Translate API
Translated Post Description: Translated version of the post description. All posts that were not in English were translated into English using the Google Translate API; no translation was performed for English posts.
Sentiment: Result of sentiment analysis (using the preprocessed version of the translated Post Description), where each post was classified into one of the sentiment classes: fear, surprise, joy, sadness, anger, disgust, or neutral
Hate: Result of hate speech detection (using the preprocessed version of the translated Post Description), where each post was classified as hate or not hate
Anxiety or Stress: Result of anxiety or stress detection (using the preprocessed version of the translated Post Description), where each post was classified as stress/anxiety detected or no stress/anxiety detected

All the Instagram posts collected during this data mining process were publicly available on Instagram and did not require a user to log in to view them (at the time of writing the paper).
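Below is a minimal sketch of loading the dataset and slicing it by these attributes; the file name is hypothetical, the column names follow the attribute list above, and the exact label strings should be checked against the data.

```python
# Load the mpox Instagram dataset and slice it by the labelled attributes.
# File name is hypothetical; column names follow the attribute list above,
# and the label strings ("hate", "stress/anxiety detected") are assumptions.
import pandas as pd

df = pd.read_csv("mpox_instagram_posts.csv")

# Distribution of posts across the fine-grain sentiment classes
print(df["Sentiment"].value_counts())

# English-language posts flagged as both hateful and anxiety/stress-positive
subset = df[
    (df["Language"] == "English")
    & (df["Hate"] == "hate")
    & (df["Anxiety or Stress"] == "stress/anxiety detected")
]
print(len(subset))
```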
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the first Arabic Natural Audio Dataset (ANAD), developed to recognize three discrete emotions: happy, angry, and surprised.
Eight videos of live calls between an anchor and a human outside the studio were downloaded from online Arabic talk shows. Each video was then divided into turns: callers and receivers. To label each video, eighteen listeners were asked to listen to each video and select whether they perceived a happy, angry, or surprised emotion. Silence, laughs, and noisy chunks were removed. Every chunk was then automatically divided into 1-second speech units, forming our final corpus of 1384 records.
Twenty-five acoustic features, also known as low-level descriptors (LLDs), were extracted: intensity, zero-crossing rate, MFCCs 1-12 (Mel-frequency cepstral coefficients), F0 (fundamental frequency) and F0 envelope, probability of voicing, and LSP frequencies 0-7. Nineteen statistical functions were then applied to every feature: maximum, minimum, range, absolute position of maximum, absolute position of minimum, arithmetic mean, linear regression coefficients 1 and 2, linear regression errors A and Q, standard deviation, kurtosis, skewness, quartiles 1, 2, and 3, and inter-quartile ranges 1-2, 2-3, and 1-3. The delta coefficient of every LLD is also computed as an estimate of the first derivative, so the total is (25 LLDs + 25 deltas) x 19 functions = 950 features.
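The features were presumably extracted with a standard toolkit such as openSMILE; the sketch below only illustrates the LLD-plus-functionals recipe on a couple of descriptors using librosa, not the paper's exact 950-dimensional set.

```python
# Illustrative LLD + statistical-functionals recipe (a subset of the
# descriptors and functions above, not the paper's exact 950 features).
import librosa
import numpy as np
from scipy import stats

y, sr = librosa.load("chunk.wav", sr=None)  # hypothetical 1-second speech unit

# Two example LLD groups: 12 MFCCs and the zero-crossing rate
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12)     # shape (12, frames)
zcr = librosa.feature.zero_crossing_rate(y)            # shape (1, frames)
llds = np.vstack([mfcc, zcr])
llds = np.vstack([llds, librosa.feature.delta(llds)])  # append delta coefficients

def functionals(x):
    """13 of the 19 statistical functions named above."""
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    return np.array([x.max(), x.min(), x.max() - x.min(), x.mean(), x.std(),
                     stats.kurtosis(x), stats.skew(x),
                     q1, q2, q3, q2 - q1, q3 - q2, q3 - q1])

features = np.concatenate([functionals(row) for row in llds])
print(features.shape)  # 26 LLD rows x 13 functionals = (338,)
```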
https://www.marketreportanalytics.com/privacy-policy
The global voice analytics market is experiencing robust growth, projected to reach a significant size within the forecast period (2025-2033). A compound annual growth rate (CAGR) of 15% indicates substantial market expansion driven by several key factors. The increasing adoption of cloud-based solutions simplifies deployment and reduces infrastructure costs, fueling market expansion across diverse sectors. Furthermore, the rising need for enhanced customer experience management and operational efficiency is driving demand for sophisticated voice analytics tools. Businesses across various verticals, including retail, telecommunications, BFSI, healthcare, and government, are leveraging these tools to gain actionable insights from customer interactions, improve service quality, and detect fraudulent activities. The segmentations within the market reflect this broad adoption. Cloud deployment models dominate, while large enterprises lead in adoption due to their greater resources and complex needs. Applications like sentiment analysis, sales & marketing optimization, and risk & fraud detection are experiencing especially rapid growth, showcasing the versatility and impact of voice analytics.

The market's growth is further propelled by advancements in artificial intelligence (AI) and machine learning (ML), enabling more accurate and insightful analysis of voice data. This includes improved sentiment analysis, speaker identification, and topic extraction. However, data privacy concerns and the need for robust security measures pose challenges to the market's expansion. The high cost of implementation and integration of advanced voice analytics systems can also limit adoption, particularly among smaller enterprises. Despite these challenges, the ongoing technological advancements and the increasing volume of voice data generated daily promise continued market growth. Future market growth will likely be influenced by the development of more sophisticated analytics capabilities, broader industry adoption across niche applications, and the resolution of privacy and security concerns.

Recent developments include:
September 2022: Contact center AI platform Observe.AI launched a new set of tools for determining what the AI's data analysis signifies. The new Conversation Intelligence Consulting Services provides a mechanism for contact centers to better integrate and analyze how user interactions with human and virtual agents are progressing and what can be done to improve them.
June 2022: QuadraByte, LLC announced its collaboration with Vonage as a Vonage Voice API integration partner. QuadraByte, a provider of an Intelligence Economy, integrates its experience into the Vonage relationship, allowing Vonage customers to quickly interface with the Voice API and implement improved calling experiences.
Notable trends are: applications driving the growth of the market.
https://www.mordorintelligence.com/privacy-policy
Voice Analytics Market is Segmented by Component (Solution and Services), Deployment Mode (Cloud and On-Premise), Organization Size (Small and Medium-Sized Enterprises and Large Enterprises), Application (Health Monitoring, Sentiment Analysis, and More), End-User Vertical (Retail and E-Commerce, Telecom and IT, BFSI, Healthcare, and More) and by Geography. The Market Forecasts are Provided in Terms of Value (USD).
https://www.marketreportanalytics.com/privacy-policy
The Emotion Recognition and Sentiment Analysis Software Market is experiencing robust growth, projected to reach $849.76 million in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 14.15% from 2025 to 2033. This expansion is fueled by several key drivers. Increasing adoption of AI-powered solutions across diverse sectors, including customer service, market research, and healthcare (patient diagnosis), is a primary factor. Businesses leverage these tools to gain valuable insights into customer preferences, improve product development, and personalize user experiences. The rise of cloud-based deployment models further accelerates market growth, offering scalability, cost-effectiveness, and enhanced accessibility. Furthermore, the growing need for effective brand monitoring and reputation management, particularly on social media, is driving demand for sentiment analysis tools. While data privacy concerns and ethical considerations surrounding emotion recognition technology pose certain restraints, the overall market outlook remains exceptionally positive. The market is segmented by application (customer service/experience, product/market research, patient diagnosis, others) and deployment (on-premises, cloud-based), reflecting the diverse use cases and deployment preferences of different industries. North America currently holds a significant market share, driven by early adoption and technological advancements. However, APAC is expected to exhibit substantial growth in the coming years, fueled by increasing digitalization and a burgeoning tech industry in countries like China and Japan. Leading companies are focusing on strategic partnerships, acquisitions, and the development of innovative solutions to maintain a competitive edge in this rapidly evolving landscape.

The competitive landscape is characterized by a mix of established tech giants like Microsoft and IBM alongside specialized emotion AI companies. The market's success hinges on the continuous improvement of algorithm accuracy, addressing ethical concerns, and ensuring responsible data handling. Future growth will depend on advancements in deep learning and computer vision, enabling more nuanced and accurate emotion recognition across various modalities, including facial expressions, voice tone, and text analysis. Addressing data bias and ensuring compliance with data privacy regulations are crucial for sustainable growth. The market's segmentation reflects its adaptability across various industries, underscoring its potential for widespread application and sustained expansion throughout the forecast period.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Techsalerator’s Location Sentiment Data for Mauritania
Techsalerator’s Location Sentiment Data for Mauritania provides deep insights into how people perceive different locations across urban, rural, and industrial areas. This dataset is crucial for businesses, researchers, and policymakers aiming to understand sentiment trends across various regions in Mauritania.
For access to the full dataset, contact us at info@techsalerator.com or visit the Techsalerator Contact Us page.
Techsalerator’s Location Sentiment Data for Mauritania offers a structured analysis of public sentiment across cities, towns, and remote areas. This dataset is essential for market research, urban development, AI sentiment analysis, and regional planning.
To obtain Techsalerator’s Location Sentiment Data for Mauritania, contact info@techsalerator.com with your specific requirements. Techsalerator provides customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.
For in-depth insights into public sentiment and regional perception in Mauritania, Techsalerator’s dataset is an invaluable resource for businesses, researchers, policymakers, and urban planners.
https://www.datainsightsmarket.com/privacy-policy
The global voice analytics tools market is experiencing robust growth, driven by the increasing adoption of AI-powered solutions across diverse sectors. The market's expansion is fueled by several key factors: the rising demand for improved customer experience through sentiment analysis and real-time feedback mechanisms, the increasing need for enhanced security and fraud detection using voice biometrics, and the growing adoption of conversational AI in various applications such as chatbots and virtual assistants. Furthermore, the proliferation of data from various sources, including call centers, social media, and IoT devices, provides ample raw material for sophisticated voice analytics. While challenges remain, such as data privacy concerns and the need for sophisticated algorithms to interpret complex vocal nuances, the market is poised for sustained growth. We estimate the 2025 market size to be approximately $8 billion, based on industry reports indicating rapid expansion in adjacent AI and analytics markets. Considering a reasonable CAGR of 15% (a conservative estimate given the technology's rapid evolution), we project substantial growth over the forecast period, with significant potential in emerging markets like Asia-Pacific fueled by increasing digitalization and adoption of voice-based technologies. The segmentation by application (large enterprises and SMEs) and type (solution and services) allows for a nuanced understanding of specific market needs and opportunities. The competitive landscape is diverse, with both established tech giants and specialized voice analytics companies vying for market share, leading to innovation and affordability.

The growth in the voice analytics market is expected to continue across all major geographical regions, particularly in North America and Europe, driven by early adoption and robust technological infrastructure. However, Asia-Pacific is anticipated to witness the fastest growth due to the burgeoning economies and expanding digital infrastructure within this region. Market penetration is further bolstered by government initiatives promoting digital transformation and the development of AI-related technologies, which facilitate a more rapid adoption of advanced analytical tools. The segment focusing on services is likely to witness significant growth, driven by the increasing need for customized solutions and ongoing support for complex deployments. Large enterprises remain the primary drivers of current market revenue, but SMEs are anticipated to increasingly contribute to the growth trajectory, particularly with the development of user-friendly and cost-effective tools catering to their specific requirements.
https://www.marketreportanalytics.com/privacy-policy
The Emotion Analytics Market is experiencing robust growth, projected to reach $3.40 billion in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 18.2% from 2025 to 2033. This expansion is driven by several key factors. The increasing adoption of AI and machine learning technologies allows for more accurate and nuanced emotion detection from various data sources, including facial expressions, voice tone, and text analysis. Furthermore, the rising demand for personalized customer experiences across industries like retail, healthcare, and finance is fueling the market. Businesses are leveraging emotion analytics to understand customer sentiment, optimize marketing campaigns, and improve product development. The growing need for enhanced security and public safety, particularly in law enforcement, is another significant driver, enabling more effective crime prevention and investigation through real-time emotion analysis. While data privacy concerns represent a potential restraint, the development of robust ethical guidelines and anonymization techniques is mitigating this risk. The market is segmented by application, with Customer Experience Management, Sales & Marketing Management, and Competitive Intelligence currently holding the largest shares, followed by rapidly growing segments such as Public Safety and Law Enforcement. North America and Europe are currently leading the market, but the Asia-Pacific region, particularly China and India, is expected to show significant growth in the coming years, driven by increasing digitalization and technological advancements.

The competitive landscape is dynamic, with a mix of established technology companies and emerging specialized vendors. Successful players are focusing on developing sophisticated and versatile platforms, integrating multiple data sources, and offering customizable solutions to meet specific client needs. Strategic partnerships and mergers and acquisitions are common strategies to expand market reach and enhance technological capabilities. While market entry barriers are relatively high due to the need for advanced technical expertise and data infrastructure, the overall market is poised for considerable expansion, driven by continued technological innovation and increasing demand for data-driven decision-making across various sectors. The long-term forecast indicates sustained high growth, with market segmentation expected to become more granular as specific application areas mature and new functionalities emerge.
https://www.archivemarketresearch.com/privacy-policy
The global Speech and Voice Analytics Technology market is experiencing robust growth, projected to reach $1420 million in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 4.4% from 2025 to 2033. This expansion is driven by several key factors. The increasing adoption of cloud-based solutions and the rise of artificial intelligence (AI) are significantly impacting market growth, enabling more sophisticated analysis and easier integration with existing systems. Furthermore, the growing need for enhanced customer experience across various industries, from BFSI (Banking, Financial Services, and Insurance) to healthcare and retail, is fueling demand for solutions that can accurately analyze customer interactions to identify areas for improvement and enhance service delivery. The market's segmentation reflects this diverse application, with strong growth expected across sectors prioritizing customer engagement and operational efficiency. The availability of advanced speech-to-text technologies, improved natural language processing (NLP) capabilities, and the increasing affordability of these solutions also contribute to market expansion.

The competitive landscape is characterized by a mix of established players and emerging technology companies. Major players such as Google, Amazon Web Services (AWS), and Nuance Communications are leveraging their existing infrastructure and expertise to offer comprehensive speech and voice analytics solutions. However, smaller, more agile companies are also making significant inroads by focusing on niche applications or offering innovative features. The market's geographical distribution reveals strong growth in North America and Europe, driven by early adoption and established technological infrastructure. However, emerging economies in Asia-Pacific and other regions are demonstrating significant growth potential, as businesses increasingly recognize the value of speech analytics for improved operational efficiency and enhanced customer understanding. Continued advancements in AI and machine learning will further drive market growth in the coming years, shaping the future of customer interaction and business intelligence.
https://data.aussda.at/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11587/EOPCOB
Full edition for scientific use. The dataset contains 125,871 sentences extracted from Austrian parliamentary debates and party press releases. Press releases were collected under the auspices of the Austrian National Election Study (AUTNES) and cover the six weeks prior to each national election from 1995 to 2013. Data from parliamentary debates stem from a random sample of sentences drawn from sessions of the Austrian National Council (1995-2013). The sentiment of the sentences was crowd-coded on a scale ranging from 0 ("Not negative") to 5 ("Very strongly negative"). As each sentence was coded by ten coders, there are multiple codingid values for each unitid (sentence).
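Since each sentence carries ten independent codings, most analyses will first aggregate them. Here is a minimal sketch, assuming a table with a unitid column and a numeric sentiment score per coding; the actual variable names should be checked against the codebook.

```python
# Aggregate the ten crowd codings per sentence into summary scores.
# Column names ("unitid", "sentiment") are assumptions; check the codebook.
import pandas as pd

df = pd.read_csv("autnes_sentences.csv")  # hypothetical file name

per_sentence = (
    df.groupby("unitid")["sentiment"]
      .agg(["mean", "std", "count"])
      .rename(columns={"mean": "negativity_mean",
                       "std": "coder_disagreement",
                       "count": "n_codings"})
)
print(per_sentence.head())
```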
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This folder contains the dataset features used and described in the research paper entitled:
García-Cuesta, E., Barba, A., Gachet, D. "EmoMatchSpanishDB: Study of Speech Emotion Recognition Machine Learning Models in a New Spanish Elicited Database" , Multimedia Tools and Applications, Ed. Springer, 2023
In this paper we address the task of real-time emotion recognition for elicited emotions. For this purpose we have created a publicly accessible dataset composed of fifty subjects expressing the emotions of anger, disgust, fear, happiness, sadness, and surprise in the Spanish language. In addition, a neutral tone of each subject has been added. This article describes how this database has been created, including the recording and the crowdsourcing perception test performed to statistically validate the emotion of each sample and remove noisy data samples. Moreover, we present a baseline comparative study between different machine learning techniques in terms of accuracy, specificity, precision, and recall. Prosodic and spectral features are extracted and used for this classification purpose. We expect that this database will be useful for gaining new insights in this area of study.
The first dataset is "EmoSpanishDB", which contains sets of 13 and 140 spectral and prosodic features for a total of 3550 audios of 50 individuals reproducing the 12 sentences for the six different emotions: anger, disgust, fear, happiness, sadness, and surprise (Ekman's basic emotions), plus neutral.
The second dataset is "EmoMatchSpanishDB", which contains sets of 13 and 140 spectral and prosodic features for a total of 2050 audios of 50 individuals reproducing the 12 sentences for the six different emotions: anger, disgust, fear, happiness, sadness, and surprise (Ekman's basic emotions), plus neutral. These 2050 audios' features are the subset of EmoSpanishDB that remained after a crowdsourcing process validated that the elicited emotion corresponds with the expressed one.
The third dataset is "EmoMatchSpanishDB-Compare-features.zip", which contains the COMPARE features for the speaker-dependent and LOSO (leave-one-speaker-out) experiments. These datasets were used in the paper "EmoMatchSpanishDB: Study of Machine Learning Models in a New Spanish Elicited Dataset", which fully describes their creation, their contents, and a set of baseline machine learning experiments and results.
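Here is a minimal sketch of the LOSO protocol on such feature tables, assuming a CSV with a speaker column, an emotion label column, and numeric feature columns; the file and column names are hypothetical.

```python
# Leave-one-speaker-out (LOSO) evaluation sketch for the feature tables.
# The file name and the "speaker"/"emotion" columns are assumptions.
import pandas as pd
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("EmoMatchSpanishDB_features.csv")  # hypothetical file name
X = df.drop(columns=["speaker", "emotion"]).to_numpy()
y = df["emotion"].to_numpy()
groups = df["speaker"].to_numpy()

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut())
print(f"LOSO accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```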
The features are available under the MIT license. If you want access to the original raw audio files, to create your own features for research purposes, you can obtain them under a CC BY-NC license by completing and signing the agreement file (EMOMATCHAgreement.docx) and sending it via email to esteban.garcia@upm.es.