100+ datasets found
  1. emotion analysis based on text Dataset

    • cubig.ai
    Updated Feb 25, 2025
    Cite
    CUBIG (2025). emotion analysis based on text Dataset [Dataset]. https://cubig.ai/store/products/139/emotion-analysis-based-on-text-dataset
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data introduction • The emotion-analysis dataset is designed for analyzing the emotions expressed in text.

    2) Data utilization (1) Characteristics of the emotion-analysis data: • It contains a variety of texts that convey emotions ranging from happiness to anger to sadness. The goal is to build an efficient model for detecting emotions in text. (2) The emotion-analysis data can be used for: • Sentiment classification models: The dataset can be used to train machine learning models that classify text by sentiment, which helps companies and researchers understand public opinion and sentiment trends. • Market research: Researchers can analyze sentiment data to understand consumer preferences and market trends and to support data-driven decision making.

  2. BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis

    • data.mendeley.com
    Updated Nov 20, 2020
    + more versions
    Cite
    Md Ataur Rahman (2020). BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis [Dataset]. http://doi.org/10.17632/24xd7w7dhp.1
    Dataset updated
    Nov 20, 2020
    Authors
    Md Ataur Rahman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present a manually annotated Bangla emotion corpus that captures the diversity of fine-grained emotion expressions in social-media text. We consider the fine-grained emotion labels Sadness, Happiness, Disgust, Surprise, Fear and Anger, which are, according to Paul Ekman (1999), the six basic emotion categories. For this task, we collected a large amount of raw text from user comments on two different Facebook groups (Ekattor TV and Airport Magistrates) and from the public posts of a popular blogger and activist, Dr. Imran H Sarker. These comments are mostly reactions to ongoing socio-political issues and to the economic successes and failures of Bangladesh. We scraped a total of 32,923 comments from the three sources mentioned above. Of these, 6,314 comments were annotated into the six categories. The distribution of the annotated corpus is as follows:

    • sad = 1341
    • happy = 1908
    • disgust = 703
    • surprise = 562
    • fear = 384
    • angry = 1416

    We have also provided a balanced set from the above data and split the dataset into training and test sets of equal ratio. We used a proportion of 5:1 for training and evaluation. More information on the dataset and the experiments on it can be found in our paper (related links below).
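
    As a quick sanity check on the figures above, the class counts can be summed and converted to shares. This is only an illustrative sketch: the variable names are made up here, and applying the 5:1 proportion per class to the full annotated set is an assumption, not the authors' documented procedure.

```python
# Class counts as reported in the dataset description.
counts = {
    "sad": 1341,
    "happy": 1908,
    "disgust": 703,
    "surprise": 562,
    "fear": 384,
    "angry": 1416,
}

total = sum(counts.values())  # should match the 6314 annotated comments

# Share of each emotion in the annotated corpus, in percent.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Hypothetical 5:1 train/evaluation split applied per class (assumption).
split = {label: (n * 5 // 6, n - n * 5 // 6) for label, n in counts.items()}

for label in counts:
    train_n, test_n = split[label]
    print(f"{label:9s} {counts[label]:5d} {shares[label]:5.1f}%  train={train_n} test={test_n}")
print("total", total)
```

    The counts add up to the stated number of annotated comments, and the shares make the class imbalance visible (for example, fear is far rarer than happy).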

  3. emotion

    • huggingface.co
    Updated Feb 16, 2023
    + more versions
    Cite
    DAIR.AI (2023). emotion [Dataset]. https://huggingface.co/datasets/dair-ai/emotion
    Dataset updated
    Feb 16, 2023
    Dataset provided by
    DAIR.AI
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for "emotion"

      Dataset Summary
    

    Emotion is a dataset of English Twitter messages with six basic emotions: anger, fear, joy, love, sadness, and surprise. For more detailed information please refer to the paper.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    More Information Needed

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    An example looks as follows. { "text": "im feeling quite sad and sorry for myself but… See the full description on the dataset page: https://huggingface.co/datasets/dair-ai/emotion.

  4. Sentiment Analysis Dataset

    • cubig.ai
    Updated May 20, 2025
    Cite
    CUBIG (2025). Sentiment Analysis Dataset [Dataset]. https://cubig.ai/store/products/270/sentiment-analysis-dataset
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction • The Sentiment Analysis Dataset is a dataset for sentiment analysis. It includes large-scale tweet text collected from Twitter and a sentiment polarity label (0=negative, 2=neutral, 4=positive) for each tweet, assigned automatically based on emoticons.

    2) Data Utilization (1) Characteristics of the Sentiment Analysis Dataset: • Each sample consists of six columns (sentiment polarity, tweet ID, date of writing, search word, author, and tweet body) and is suitable for training natural language processing and classification models on tweet text and sentiment labels. (2) The Sentiment Analysis Dataset can be used for: • Sentiment classification model development: Using tweet text and polarity labels, you can build automatic classifiers for positive, negative, and neutral sentiment with machine learning and deep learning models such as logistic regression, SVM, RNN, and LSTM. • Analysis of SNS public opinion and trends: By analyzing the distribution of sentiment over time and by keyword, you can explore changes in public opinion on specific issues or brands, positive and negative trends, and key sentiment-bearing keywords.
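
    The six-column layout and the 0/2/4 polarity codes described above could be read roughly as follows. This is a minimal sketch: the field names, the CSV quoting, and the two sample rows are all made up for illustration and are not taken from the dataset itself.

```python
import csv
import io

# Column order follows the description: polarity, tweet ID, date, search word,
# author, tweet body. The field names themselves are hypothetical.
FIELDS = ["polarity", "tweet_id", "date", "query", "author", "text"]

# Polarity codes as stated in the description.
POLARITY = {0: "negative", 2: "neutral", 4: "positive"}

# Two made-up rows used only to exercise the parser.
SAMPLE = (
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","my phone died again"\n'
    '"4","2","Mon Apr 06 22:20:01 PDT 2009","NO_QUERY","user_b","great coffee this morning"\n'
)


def parse_rows(handle):
    """Yield (label, text) pairs from a six-column CSV stream."""
    reader = csv.DictReader(handle, fieldnames=FIELDS)
    for row in reader:
        yield POLARITY[int(row["polarity"])], row["text"]


rows = list(parse_rows(io.StringIO(SAMPLE)))
print(rows)
```

    Mapping the numeric codes to names up front keeps later training code independent of the 0/2/4 encoding.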

  5. A Sentiment Analysis Dataset for Code-Mixed Malayalam-English

    • live.european-language-grid.eu
    • zenodo.org
    tsv
    Updated Dec 13, 2021
    Cite
    (2021). A Sentiment Analysis Dataset for Code-Mixed Malayalam-English [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7634
    Available download formats: tsv
    Dataset updated
    Dec 13, 2021
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    There is an increasing demand for sentiment analysis of text from social media, which is mostly code-mixed. Systems trained on monolingual data fail on code-mixed data due to the complexity of mixing at different levels of the text. However, very few resources are available for creating models specific to code-mixed data. Although much research in multilingual and cross-lingual sentiment analysis has used semi-supervised or unsupervised methods, supervised methods still perform better. Only a few datasets for popular language pairs such as English-Spanish, English-Hindi, and English-Chinese are available. There are no resources available for Malayalam-English code-mixed data. This paper presents a new gold standard corpus for sentiment analysis of code-mixed text in Malayalam-English, annotated by voluntary annotators. This gold standard corpus obtained a Krippendorff’s alpha above 0.8. We use this new corpus to provide a benchmark for sentiment analysis in Malayalam-English code-mixed texts.

  6. Text Emotion Recognition

    • kaggle.com
    Updated Mar 6, 2023
    Cite
    Shreejit Cheela (2023). Text Emotion Recognition [Dataset]. https://www.kaggle.com/shreejitcheela/text-emotion-recognition/discussion
    Dataset updated
    Mar 6, 2023
    Dataset provided by
    Kaggle
    Authors
    Shreejit Cheela
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Emotions play a vital role in human communication, and detecting emotions from text data is a challenging task. The ability to automatically recognize emotions from text has many practical applications, such as in sentiment analysis, social media monitoring, and customer feedback analysis.

    In this project, we will discuss the working principle of a text emotion recognition model and its important terminologies. We will also provide a detailed description of the model architecture used and its training process. Finally, we will conclude by evaluating the model using a confusion matrix and a classification report. In the "emotions" column, 0 denotes sad and 1 denotes happy.
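
    To make the evaluation step concrete, a confusion matrix and a small per-class report for the two labels can be computed as sketched below. This is not the project's own code: the function names are hypothetical and the labels and predictions are made up for illustration.

```python
from collections import Counter

LABELS = {0: "sad", 1: "happy"}  # mapping stated in the description


def confusion_matrix(y_true, y_pred):
    """Return a 2x2 matrix m[true][pred] for labels 0 and 1."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in (0, 1)] for t in (0, 1)]


def report(y_true, y_pred):
    """Per-class precision and recall, plus overall accuracy."""
    m = confusion_matrix(y_true, y_pred)
    out = {}
    for c in (0, 1):
        predicted = m[0][c] + m[1][c]  # column sum: everything predicted as c
        actual = m[c][0] + m[c][1]     # row sum: everything truly c
        out[LABELS[c]] = {
            "precision": m[c][c] / predicted if predicted else 0.0,
            "recall": m[c][c] / actual if actual else 0.0,
        }
    out["accuracy"] = (m[0][0] + m[1][1]) / len(y_true)
    return out


# Made-up labels and predictions, for illustration only.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
matrix = confusion_matrix(y_true, y_pred)
summary = report(y_true, y_pred)
print(matrix)
print(summary)
```

    Rows of the matrix are true labels and columns are predictions, so the diagonal holds the correctly classified examples.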

    slang.txt in Abbreviations step can be taken from: https://www.kaggle.com/datasets/mansis97/slangs

  7. Emotions Analytics (EA) Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 13, 2025
    Cite
    Data Insights Market (2025). Emotions Analytics (EA) Software Report [Dataset]. https://www.datainsightsmarket.com/reports/emotions-analytics-ea-software-1973364
    Available download formats: ppt, doc, pdf
    Dataset updated
    Jul 13, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Emotions Analytics (EA) Software market is experiencing robust growth, driven by increasing demand for personalized customer experiences, advancements in artificial intelligence (AI) and machine learning (ML), and the rising adoption of digital channels across various industries. The market, estimated at $2 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $7 billion by 2033. This expansion is fueled by several key factors. Firstly, businesses are leveraging EA to gain deeper insights into consumer behavior, enabling more effective marketing strategies, product development, and customer service improvements. Secondly, the sophistication of EA technology continues to improve, with more accurate emotion detection capabilities and the integration of diverse data sources (facial expressions, voice tone, text analysis) resulting in more comprehensive and reliable insights. Finally, growing regulatory requirements concerning data privacy and ethical considerations are driving demand for robust and compliant EA solutions. However, the market's growth is not without its challenges. High initial investment costs for implementing EA systems and the need for specialized expertise to interpret and analyze the collected data can act as significant barriers to entry for smaller businesses. Moreover, concerns surrounding data privacy and the potential for misuse of emotionally sensitive information remain important hurdles that need to be addressed through transparent data handling practices and robust ethical guidelines. The competitive landscape is characterized by a mix of large established technology firms like Microsoft and IBM, alongside innovative specialized companies like iMotions and Affectiva, fostering a dynamic market environment with varied technological approaches and service offerings. 
Future growth will depend on continued technological advancements, the development of robust ethical frameworks, and increased awareness of the value proposition of EA across diverse sectors.
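
    The projection quoted above is compound-growth arithmetic, which can be checked with a few lines. This is a sketch only: the report does not state its base year or compounding convention, so the two period counts below are assumptions, and the function name is made up.

```python
def project(value, cagr, years):
    """Compound a starting value at a fixed annual growth rate."""
    return value * (1 + cagr) ** years


start_busd = 2.0  # estimated 2025 market size, in billions of USD
rate = 0.15       # stated CAGR

# 2025 -> 2033 is eight compounding steps if 2025 is the base year.
eight_steps = project(start_busd, rate, 8)
# Nine steps would apply if growth is counted from a 2024 base (assumption).
nine_steps = project(start_busd, rate, 9)

print(f"8 periods: {eight_steps:.2f}B USD")
print(f"9 periods: {nine_steps:.2f}B USD")
```

    Under these assumptions, the quoted figure of approximately $7 billion sits closer to the nine-period result than to the eight-period one.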

  8. Data for manuscript: "Longitudinal Analysis of Sentiment and Emotion in News...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 13, 2022
    Cite
    Anonymized (2022). Data for manuscript: "Longitudinal Analysis of Sentiment and Emotion in News Media Headlines Using Automated Labelling with Transformer Language Models" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5144112
    Dataset updated
    Sep 13, 2022
    Dataset authored and provided by
    Anonymized
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set contains automated sentiment and emotionality annotations of 23 million headlines from 47 news media outlets popular in the United States.

    The set of 47 news media outlets analysed (listed in Figure 1 of the main manuscript) was derived from the AllSides organization 2019 Media Bias Chart v1.1. The human ratings of outlets’ ideological leanings were also taken from this chart and are listed in Figure 2 of the main manuscript.

    News article headlines from the set of outlets analyzed in the manuscript are available in the outlets’ online domains and/or public cache repositories such as The Internet Wayback Machine, Google cache and Common Crawl. Article headlines were located in the articles’ raw HTML using outlet-specific XPath expressions.
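
    The idea of outlet-specific XPath expressions can be sketched with the Python standard library. Everything specific here is hypothetical: the markup, the outlet name, and the expression are invented for illustration, and `xml.etree.ElementTree` supports only a subset of XPath and requires well-formed markup, so real-world HTML would typically need a dedicated HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for an article page.
PAGE = """
<html>
  <body>
    <div class="article">
      <h1 class="headline">Example headline text</h1>
      <p>Body text.</p>
    </div>
  </body>
</html>
"""

# Hypothetical outlet-specific expressions; real ones would differ per outlet.
OUTLET_XPATHS = {
    "example-outlet": ".//h1[@class='headline']",
}


def extract_headline(markup, outlet):
    """Return the headline text for an outlet, or None if the XPath misses."""
    root = ET.fromstring(markup)
    node = root.find(OUTLET_XPATHS[outlet])
    if node is None or node.text is None:
        return None
    return node.text.strip()


headline = extract_headline(PAGE, "example-outlet")
print(headline)
```

    Returning None when the expression does not match makes missed headlines easy to count.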

    The temporal coverage of headlines across news outlets is not uniform. For some media organizations, news articles availability in online domains or Internet cache repositories becomes sparse for earlier years. Furthermore, some news outlets popular in 2019, such as The Huffington Post or Breitbart, did not exist in the early 2000’s. Hence, our data set is sparser in headlines sample size and representativeness for earlier years in the 2000-2019 timeline. Nevertheless, 20 outlets in our data set have chronologically continuous partial or full headline data availability since the year 2000. Figure S 1 in the SI reports the number of headlines per outlet and per year in our analysis.

    In a small percentage of articles, outlet-specific XPath expressions might fail to properly capture the content of the headline due to the heterogeneity of HTML elements and CSS styling combinations with which article text is arranged in the outlets’ online domains. After manual testing, we determined that the percentage of headlines falling into this category is very small. Additionally, our method might fail to detect some articles in the online domains of news outlets. To conclude, in a data analysis of over 23 million headlines, we cannot manually check the correctness of every single data instance, and one hundred percent accuracy at capturing headlines’ content is elusive due to the small number of difficult-to-detect boundary cases, such as incorrect HTML markup syntax in online domains. Overall, however, we are confident that our headline set is representative of headlines in print news media content for the studied time period and outlets analyzed.

    The compressed files in this data set are listed next:

    -analysisScripts.rar contains the analysis scripts used in the main manuscript as well as aggregated data of sentiment and emotionality automated annotations of the headlines and human annotations of a subset of headlines sentiment and emotionality used as ground truth.

    -models.rar contains the Transformer sentiment and emotion annotation models used in the analysis. Namely:

    Siebert/sentiment-roberta-large-english from https://huggingface.co/siebert/sentiment-roberta-large-english. This model is a fine-tuned checkpoint of RoBERTa-large (Liu et al. 2019). It enables reliable binary sentiment analysis for various types of English-language text. For each instance, it predicts either positive (1) or negative (0) sentiment. The model was fine-tuned and evaluated on 15 data sets from diverse text sources to enhance generalization across different types of texts (reviews, tweets, etc.). See more information from the original authors at https://huggingface.co/siebert/sentiment-roberta-large-english

    DistilbertSST2.rar is the default sentiment classification model of the HuggingFace Transformers library (https://huggingface.co/). This model is only used to replicate the results of the sentiment analysis with sentiment-roberta-large-english.

    DistilRoberta j-hartmann/emotion-english-distilroberta-base from https://huggingface.co/j-hartmann/emotion-english-distilroberta-base. The model is a fine-tuned checkpoint of DistilRoBERTa-base. It allows annotation of English text with Ekman's 6 basic emotions, plus a neutral class. The model was trained on 6 diverse datasets. Please refer to the original author at https://huggingface.co/j-hartmann/emotion-english-distilroberta-base for an overview of the data sets used for fine-tuning.

    -headlinesDataWithSentimentLabelsAnnotationsFromSentimentRobertaLargeModel.rar URLs of headlines analyzed and the sentiment annotations of the siebert/sentiment-roberta-large-english Transformer model. https://huggingface.co/siebert/sentiment-roberta-large-english

    -headlinesDataWithSentimentLabelsAnnotationsFromDistilbertSST2.rar URLs of headlines analyzed and the sentiment annotations of the default HuggingFace sentiment analysis model fine-tuned on the SST-2 dataset. https://huggingface.co/

    -headlinesDataWithEmotionLabelsAnnotationsFromDistilRoberta.rar URLs of headlines analyzed and the emotion categories annotations of the j-hartmann/emotion-english-distilroberta-base Transformer model. https://huggingface.co/j-hartmann/emotion-english-distilroberta-base

  9. Data from: An emotion analysis dataset of course comment texts in massive...

    • dataverse.harvard.edu
    Updated Sep 26, 2022
    Cite
    Xiang Feng; Keyi Yuan; Xiu Guan; Longhui Qiu (2022). An emotion analysis dataset of course comment texts in massive online learning course platforms [Dataset]. http://doi.org/10.7910/DVN/LC6GHO
    Dataset updated
    Sep 26, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Xiang Feng; Keyi Yuan; Xiu Guan; Longhui Qiu
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Datasets are critical for emotion analysis in the machine learning field. This study aims to explore emotion analysis datasets and related benchmarks in online learning, since very few studies currently address them. We have labeled the topic and nine-category emotion of 4715 comment texts from online learning platforms using the “three-person voting label method”, based on “sentence-level” and multi-category labeling dimensions, with our self-developed system. After testing the consistency of the labeling results using the Fleiss Kappa method, we found that the consistency of the dataset was about 0.51, representing a moderate strength of agreement. Based on the dataset, the prediction accuracy of the Long Short-Term Memory (LSTM) method is about 0.68. This dataset provides a benchmark for multi-category emotion datasets in the Chinese online learning field. It can provide a basis for subsequent work on emotion analysis, monitoring, and intervention in the education field. It can also provide a reference for constructing subsequent datasets in the education field. Please note that this is a Chinese dataset. If you want to use it, please contact the author and request the dataset below.
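
    The two labeling ideas mentioned above, majority voting among three annotators and Fleiss' kappa as the agreement measure, can be sketched as follows. This is not the authors' self-developed system: the function names are hypothetical, the tie-handling rule is an assumption, and the votes are made up for illustration.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most annotators, or None on a tie for first."""
    (top, n), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == n:
        return None
    return top


def fleiss_kappa(ratings):
    """Fleiss' kappa for items each rated by the same number of annotators.

    `ratings` is a list of per-item label lists, e.g. [["joy", "joy", "fear"], ...].
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])

    totals = Counter()  # how often each category was assigned overall
    p_items = []        # per-item observed agreement
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        agree = sum(c * (c - 1) for c in counts.values())
        p_items.append(agree / (n_raters * (n_raters - 1)))

    p_bar = sum(p_items) / n_items
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    if p_e == 1:
        return 1.0  # every rating fell in one category; agreement is trivial
    return (p_bar - p_e) / (1 - p_e)


# Made-up votes from three annotators on four comments.
votes = [
    ["joy", "joy", "joy"],
    ["joy", "joy", "fear"],
    ["fear", "fear", "fear"],
    ["joy", "fear", "fear"],
]
labels = [majority_label(v) for v in votes]
kappa = fleiss_kappa(votes)
print(labels, round(kappa, 3))
```

    Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it can be modest even when annotators often agree.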

  10. multiclass-sentiment-analysis-dataset

    • huggingface.co
    Updated Jul 14, 2023
    Cite
    Shahriar Parvez (2023). multiclass-sentiment-analysis-dataset [Dataset]. https://huggingface.co/datasets/Sp1786/multiclass-sentiment-analysis-dataset
    Dataset updated
    Jul 14, 2023
    Authors
    Shahriar Parvez
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Card for Dataset Name

      Dataset Summary
    

    This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

      Supported Tasks and Leaderboards
    

    [More Information Needed]

      Languages
    

    [More Information Needed]

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    [More Information Needed]

      Data Fields
    

    [More Information Needed]

      Data Splits
    

    [More Information Needed]

      Dataset Creation… See the full description on the dataset page: https://huggingface.co/datasets/Sp1786/multiclass-sentiment-analysis-dataset.
    
  11. Emotion Recognition and Sentiment Analysis Software Market Analysis North...

    • technavio.com
    Updated Jan 19, 2024
    Cite
    Technavio (2024). Emotion Recognition and Sentiment Analysis Software Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, China, Japan, UK, Germany - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/emotion-recognition-and-sentiment-analysis-software-market-industry-analysis
    Dataset updated
    Jan 19, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    United States, Global
    Description

    Emotion Recognition and Sentiment Analysis Software Market Size 2024-2028

    The emotion recognition and sentiment analysis software market size is forecast to increase by USD 797.17 million at a CAGR of 14.15% between 2023 and 2028.

    The market is experiencing significant growth, driven by the increasing popularity of wearable devices and the adoption of real-time sensing analysis. These technologies enable more accurate and timely emotion recognition, providing valuable insights for various applications, including healthcare, marketing, and customer service. However, the market faces challenges, most notably the issue of low-quality video content hampering emotional interpretation. Regulatory hurdles also impact adoption, as organizations navigate complex data privacy and security regulations.
    To capitalize on market opportunities and navigate challenges effectively, companies must focus on improving data quality, investing in advanced algorithms, and addressing regulatory requirements. By doing so, they can differentiate themselves in a competitive landscape and drive innovation in the market.
    

    What will be the Size of the Emotion Recognition and Sentiment Analysis Software Market during the forecast period?

    The market is experiencing significant growth, driven by the increasing adoption of conversational AI and virtual assistants. This technology enables the analysis of both textual and multimedia data, including audio and video, to extract emotional insights from user interactions. Data mining techniques, such as predictive modeling and model deployment, play a crucial role in processing and interpreting this data. Sentiment analysis dashboards and emotion recognition dashboards provide valuable insights into user experience, allowing businesses to map and optimize both the employee and customer journey. Cognitive computing and cognitive AI technologies are also integral to this market, enabling real-time analysis of user behavior and feedback.
    Data ethics and responsible AI are becoming increasingly important considerations in this market, with a focus on data governance and model training to ensure accurate and explainable AI. Biometric data and behavioral data are also being leveraged to enhance the capabilities of emotion recognition systems, further expanding their applications. Model evaluation and model training are essential components of this market, ensuring the accuracy and effectiveness of AI models. Interpretable AI and explainable AI are also gaining traction, enabling businesses to understand the reasoning behind AI decisions and build trust in the technology. Data annotation and data annotation tools are critical for training AI models, ensuring high-quality data and accurate sentiment analysis.
    Overall, the market is poised for continued growth, offering businesses valuable insights into user emotions and improving the user experience.
    

    How is this Emotion Recognition and Sentiment Analysis Software Industry segmented?

    The emotion recognition and sentiment analysis software industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Application
      Customer service/experience
      Product/market research
      Patient diagnosis
      Others

    Deployment
      On-premises
      Cloud-based

    Geography
      North America
        US
      Europe
        Germany
        UK
      APAC
        China
        Japan
      Rest of World (ROW)

    By Application Insights

    The customer service/experience segment is estimated to witness significant growth during the forecast period.

    Emotion AI technology, integrated with sentiment analysis tools, is revolutionizing business operations by enabling real-time understanding of customer emotions and feedback. These solutions utilize machine learning, natural language processing, and computer vision to analyze text, voice, and facial expressions for sentiment scoring, emotion classification, and polarity analysis. Emotion lexicons and sentiment lexicons are used to identify and categorize emotions, while deep learning and predictive analytics provide insights into historical trends. Sentiment analysis plays a crucial role in various industries, including human resources for employee engagement and feedback analysis, fraud detection, and brand reputation management. It is also used in customer service to enhance customer experience through personalized communication and proactive issue resolution.

    Social media monitoring and text analysis help businesses stay updated on brand mentions and customer sentiments, while voice analysis and tone analysis provide valuable insights from customer interactions. Integration with APIs, cloud computing, and data visualization tools streamlines the process, allowing for seamless im

  12. Emotion Recognition and Sentiment Analysis Software Market Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Mar 19, 2025
    Cite
    Market Report Analytics (2025). Emotion Recognition and Sentiment Analysis Software Market Report [Dataset]. https://www.marketreportanalytics.com/reports/emotion-recognition-and-sentiment-analysis-software-market-11382
    Available download formats: ppt, pdf, doc
    Dataset updated
    Mar 19, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Emotion Recognition and Sentiment Analysis Software Market is experiencing robust growth, projected to reach $849.76 million in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 14.15% from 2025 to 2033. This expansion is fueled by several key drivers. Increasing adoption of AI-powered solutions across diverse sectors, including customer service, market research, and healthcare (patient diagnosis), is a primary factor. Businesses leverage these tools to gain valuable insights into customer preferences, improve product development, and personalize user experiences. The rise of cloud-based deployment models further accelerates market growth, offering scalability, cost-effectiveness, and enhanced accessibility. Furthermore, the growing need for effective brand monitoring and reputation management, particularly on social media, is driving demand for sentiment analysis tools. While data privacy concerns and ethical considerations surrounding emotion recognition technology pose certain restraints, the overall market outlook remains exceptionally positive. The market is segmented by application (customer service/experience, product/market research, patient diagnosis, others) and deployment (on-premises, cloud-based), reflecting the diverse use cases and deployment preferences of different industries. North America currently holds a significant market share, driven by early adoption and technological advancements. However, APAC is expected to exhibit substantial growth in the coming years, fueled by increasing digitalization and a burgeoning tech industry in countries like China and Japan. Leading companies are focusing on strategic partnerships, acquisitions, and the development of innovative solutions to maintain a competitive edge in this rapidly evolving landscape. The competitive landscape is characterized by a mix of established tech giants like Microsoft and IBM alongside specialized emotion AI companies. 
The market’s success hinges on the continuous improvement of algorithm accuracy, addressing ethical concerns, and ensuring responsible data handling. Future growth will depend on advancements in deep learning and computer vision, enabling more nuanced and accurate emotion recognition across various modalities, including facial expressions, voice tone, and text analysis. Addressing data bias and ensuring compliance with data privacy regulations are crucial for sustainable growth. The market's segmentation reflects its adaptability across various industries, underscoring its potential for widespread application and sustained expansion throughout the forecast period.

  13. Datasets for Sentiment Analysis

    • zenodo.org
    csv
    Updated Dec 10, 2023
    Cite
    Julie R. Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
    Explore at:
    Available download formats: csv
    Dataset updated
    Dec 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Julie R. Repository creator - Campos Arias
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.

    Below are the datasets specified, along with the details of their references, authors, and download sources.

    ----------- STS-Gold Dataset ----------------

    The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.

    Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.

    File name: sts_gold_tweet.csv

    ----------- Amazon Sales Dataset ----------------

    This dataset contains the ratings and reviews of more than 1,000 Amazon products, as listed on the official Amazon website. The data was scraped from the official Amazon website in January 2023.

    Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)

    Features:

    • product_id - Product ID
    • product_name - Name of the Product
    • category - Category of the Product
    • discounted_price - Discounted Price of the Product
    • actual_price - Actual Price of the Product
    • discount_percentage - Percentage of Discount for the Product
    • rating - Rating of the Product
    • rating_count - Number of people who voted for the Amazon rating
    • about_product - Description about the Product
    • user_id - ID of the user who wrote review for the Product
    • user_name - Name of the user who wrote review for the Product
    • review_id - ID of the user review
    • review_title - Short review
    • review_content - Long review
    • img_link - Image Link of the Product
    • product_link - Official Website Link of the Product

    License: CC BY-NC-SA 4.0

    File name: amazon.csv

    ----------- Rotten Tomatoes Reviews Dataset ----------------

    This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.

    This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).

    Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics

    File name: data_rt.csv
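    Because the rows are ordered by class, a naive train/test split by position would put a single label in each part. A minimal sketch of a reproducible shuffle follows; the in-memory rows are invented stand-ins for the contents of data_rt.csv, and the function name and seed are illustrative choices.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[str, int]  # (review text, label), where 1 = fresh and 0 = rotten


def shuffle_rows(rows: Sequence[Row], seed: int = 42) -> List[Row]:
    """Return a shuffled copy of the rows, leaving the input untouched."""
    shuffled = list(rows)
    # A dedicated Random instance keeps the shuffle reproducible.
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Toy stand-in for the file layout: negatives first, positives last.
rows = [(f"negative review {i}", 0) for i in range(5)]
rows += [(f"positive review {i}", 1) for i in range(5)]

shuffled = shuffle_rows(rows)
print([label for _, label in shuffled])
```

    Fixing the seed makes experiments repeatable; pass a different seed to obtain a different order.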

    ----------- Preprocessed Dataset Sentiment Analysis ----------------

    Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in
    Stemmed and lemmatized using nltk.
    Sentiment labels are generated using TextBlob polarity scores.

    The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).

    DOI: 10.34740/kaggle/dsv/3877817

    Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }

    This dataset was used in the experimental phase of my research.

    File name: EcoPreprocessed.csv
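    The description says that the division column is derived from the polarity score but does not give the rule. The sketch below shows one plausible mapping, assuming a score in the range [-1, 1] (the range TextBlob uses for polarity) and a hypothetical threshold at zero; the label names and the cut-off are assumptions, not the dataset author's documented method.

```python
def polarity_to_division(polarity: float, neutral_band: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a categorical label.

    Scores within +/- neutral_band of zero are treated as neutral.
    The band width and the label names are illustrative assumptions.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


for score in (0.6, 0.0, -0.35):
    print(score, polarity_to_division(score))
```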

    ----------- Amazon Earphones Reviews ----------------

    This dataset consists of 9,930 Amazon reviews, with star ratings, for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine for sentiment analysis.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)

    License: U.S. Government Works

    Source: www.amazon.in

    File name (original): AllProductReviews.csv (contains 14337 reviews)

    File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)

    ----------- Amazon Musical Instruments Reviews ----------------

    This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw), and division (manually added - categorical label generated using overall score).

    Source: http://jmcauley.ucsd.edu/data/amazon/

    File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)

    File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)

  14. Four Text Datasets Used For Comparison Between Hedonometer and Azure...

    • figshare.com
    txt
    Updated Nov 9, 2023
    Cite
    Siddhant Jaydeep Mahajani; Shashank Srivastava; Alan Smeaton (2023). Four Text Datasets Used For Comparison Between Hedonometer and Azure Sentiment Analysis Tools [Dataset]. http://doi.org/10.6084/m9.figshare.24539410.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Nov 9, 2023
    Dataset provided by
    figshare
    Authors
    Siddhant Jaydeep Mahajani; Shashank Srivastava; Alan Smeaton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Lexicon-based approaches to sentiment analysis of text are based on each word or lexical entry having a pre-defined weight indicating its sentiment polarity. We compute sentiment for more than 150,000 English language texts drawn from 4 domains using the Hedonometer, a lexicon-based technique, and Azure, a contemporary machine-learning based approach. We model differences in sentiment scores between approaches for documents in each domain using a regression and analyse the independent variables (Hedonometer lexical entries) as indicators of each word's importance and contribution to the score differences.

    1. Finance Data: This dataset contains 5,000 records of different financial news texts from company press reviews and news headlines.
    2. News Headlines Data: This dataset consists of 50,000 news headlines for the period of 8 months (November 2015 to July 2016) on four different topics: Economy, Microsoft, Obama, and Palestine.
    3. IMDb Dataset: This dataset consists of 50,000 reviews posted by customers on the online IMDb platform, which is an International Movie Database platform.
    4. Twitter Dataset: This dataset consists of almost 40,000 tweets from users around the globe on everything.
    5. Hedonometer Bag of Words: This is the bag of words used to perform sentiment analysis using the traditional lexicon approach, which consists of 10,223 words with their respective happiness scores. The actual file can be downloaded from here: https://hedonometer.org/words/labMT-en-v2/
    6. Combined p-values results: This is the result file which was generated once we performed sentiment analysis on all the above domains and only identified words that are present in the Hedonometer sheet. The sheet consists of the words, their respective happiness scores, and their p-values on all different domains.
    7. Data visualisations: This is the visualisation code base in Tableau which was used to generate visualisations.
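    As a rough illustration of the lexicon-based idea described above, the sketch below scores a text by averaging the pre-defined weights of the words it contains. The tiny lexicon and its values are invented for the example and are not taken from the labMT word list, and the plain mean is a simplification that is not claimed to match the exact Hedonometer procedure.

```python
import re
from typing import Dict, Optional


def lexicon_score(text: str, lexicon: Dict[str, float]) -> Optional[float]:
    """Average the lexicon weights of the words found in the text.

    Returns None when no word of the text appears in the lexicon.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[word] for word in words if word in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


# Hypothetical happiness weights, invented for illustration only.
toy_lexicon = {"happy": 8.0, "love": 8.5, "sad": 2.5, "terrible": 2.0}

print(lexicon_score("I love this happy ending", toy_lexicon))
print(lexicon_score("A sad and terrible day", toy_lexicon))
```

    Words missing from the lexicon are simply ignored, which is why a text with no known words yields no score at all.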

  15. Seven-element emotion classification algorithm on event-related microblog...

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Mingyang Wang; Huan Wu; Tianyu Zhang; Shengqing Zhu (2023). Seven-element emotion classification algorithm on event-related microblog texts. [Dataset]. http://doi.org/10.1371/journal.pone.0241355.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mingyang Wang; Huan Wu; Tianyu Zhang; Shengqing Zhu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Seven-element emotion classification algorithm on event-related microblog texts.

  16. Multimodal Sentiment Analysis Dataset

    • gts.ai
    json
    Updated Jun 28, 2024
    + more versions
    Cite
    GTS (2024). Multimodal Sentiment Analysis Dataset [Dataset]. https://gts.ai/dataset-download/multimodal-sentiment-analysis-dataset/
    Explore at:
    Available download formats: json
    Dataset updated
    Jun 28, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore our unique Multimodal Sentiment Analysis Dataset, featuring high-quality images and corresponding text descriptions with sentiment labels.

  17. Twitter Tweets Sentiment Dataset

    • kaggle.com
    Updated Apr 8, 2022
    Cite
    M Yasser H (2022). Twitter Tweets Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/twitter-tweets-sentiment-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 8, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    M Yasser H
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description:

    Twitter is an online social media platform where people share their thoughts as tweets. It has been observed that some people misuse it to tweet hateful content. Twitter is trying to tackle this problem, and we shall help by creating a strong NLP-based classifier model to distinguish negative tweets and block them. Can you build a strong classifier model to predict the same?

    Each row contains the text of a tweet and a sentiment label. In the training set you are provided with a word or phrase drawn from the tweet (selected_text) that encapsulates the provided sentiment.

    Make sure, when parsing the CSV, to remove the beginning / ending quotes from the text field, to ensure that you don't include them in your training.
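    The quote-handling advice above can be sketched with Python's standard csv module, which already drops the quotes around properly quoted fields; an extra strip covers quotes that survive as literal characters. The sample rows are invented, and only the three column names listed in this entry are assumed.

```python
import csv
import io
from typing import Dict, List


def read_tweets(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text and remove stray leading/trailing quotes from `text`."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Strip whitespace first, then any quote characters left at the edges.
        row["text"] = row["text"].strip().strip('"')
        rows.append(row)
    return rows


# Invented sample: the second row keeps literal quotes after a leading space.
sample = (
    "textID,text,sentiment\n"
    'abc123,"I love this",positive\n'
    'abc124, "so sad" ,negative\n'
)

for tweet in read_tweets(sample):
    print(tweet["textID"], repr(tweet["text"]), tweet["sentiment"])
```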

    You're attempting to predict the word or phrase from the tweet that exemplifies the provided sentiment. The word or phrase should include all characters within that span (i.e. including commas, spaces, etc.)

    Columns:

    1. textID - unique ID for each piece of text
    2. text - the text of the tweet
    3. sentiment - the general sentiment of the tweet

    Acknowledgement:

    The dataset was downloaded from Kaggle Competitions:
    https://www.kaggle.com/c/tweet-sentiment-extraction/data?select=train.csv

    Objective:

    • Understand the Dataset & cleanup (if required).
    • Build classification models to predict the twitter sentiments.
    • Compare the evaluation metrics of various classification algorithms.

  18. Product Reviews Dataset for Emotions Classification Tasks - Indonesian...

    • data.mendeley.com
    Updated May 19, 2022
    + more versions
    Cite
    Rhio Sutoyo (2022). Product Reviews Dataset for Emotions Classification Tasks - Indonesian (PRDECT-ID) Dataset [Dataset]. http://doi.org/10.17632/574v66hf2v.1
    Explore at:
    Dataset updated
    May 19, 2022
    Authors
    Rhio Sutoyo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PRDECT-ID Dataset is a collection of Indonesian product review data annotated with emotion and sentiment labels. The data were collected from Tokopedia, one of the largest e-commerce platforms in Indonesia. The dataset contains Indonesian-language product reviews from 29 product categories on Tokopedia. Each product review is annotated with a single emotion, i.e., love, happiness, anger, fear, or sadness. A group of annotators assigned the emotion labels by following annotation criteria created by an expert in clinical psychology. Other attributes related to the product review are also extracted, such as Location, Price, Overall Rating, Number Sold, Total Review, and Customer Rating, to support further research.

  19. turkish-sentiment-analysis-dataset

    • huggingface.co
    Updated Jun 21, 2022
    Cite
    Batuhan (2022). turkish-sentiment-analysis-dataset [Dataset]. https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 21, 2022
    Authors
    Batuhan
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset contains positive, negative, and neutral (notr) sentences from several data sources given in the references. Most sentiment models have only two labels: positive and negative. However, user input can be an entirely neutral sentence, and I could find no data for such cases. I therefore created this dataset with 3 classes. Positive and negative sentences are listed below. Neutral examples are extracted from the Turkish wiki dump. In addition, added some random text… See the full description on the dataset page: https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset.

  20. EmoLit

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 27, 2023
    Cite
    Rei, Luis (2023). EmoLit [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7883953
    Explore at:
    Dataset updated
    Jun 27, 2023
    Dataset authored and provided by
    Rei, Luis
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Emotions in Literature

    Description

    Literature sentences from Project Gutenberg. 38 emotion labels (+neutral examples). Semi-Supervised dataset.

    Article

    Detecting Fine-Grained Emotions in Literature

    Please cite:

    @Article{app13137502, AUTHOR = {Rei, Luis and Mladenić, Dunja}, TITLE = {Detecting Fine-Grained Emotions in Literature}, JOURNAL = {Applied Sciences}, VOLUME = {13}, YEAR = {2023}, NUMBER = {13}, ARTICLE-NUMBER = {7502}, URL = {https://www.mdpi.com/2076-3417/13/13/7502}, ISSN = {2076-3417}, DOI = {10.3390/app13137502} }

    Abstract

    Emotion detection in text is a fundamental aspect of affective computing and is closely linked to natural language processing. Its applications span various domains, from interactive chatbots to marketing and customer service. This research specifically focuses on its significance in literature analysis and understanding. To facilitate this, we present a novel approach that involves creating a multi-label fine-grained emotion detection dataset, derived from literary sources. Our methodology employs a simple yet effective semi-supervised technique. We leverage textual entailment classification to perform emotion-specific weak-labeling, selecting examples with the highest and lowest scores from a large corpus. Utilizing these emotion-specific datasets, we train binary pseudo-labeling classifiers for each individual emotion. By applying this process to the selected examples, we construct a multi-label dataset. Using this dataset, we train models and evaluate their performance within a traditional supervised setting. Our model achieves an F1 score of 0.59 on our labeled gold set, showcasing its ability to effectively detect fine-grained emotions. Furthermore, we conduct evaluations of the model's performance in zero- and few-shot transfer scenarios using benchmark datasets. Notably, our results indicate that the knowledge learned from our dataset exhibits transferability across diverse data domains, demonstrating its potential for broader applications beyond emotion detection in literature. Our contribution thus includes a multi-label fine-grained emotion detection dataset built from literature, the semi-supervised approach used to create it, as well as the models trained on it. This work provides a solid foundation for advancing emotion detection techniques and their utilization in various scenarios, especially within the cultural heritage analysis.
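    The weak-labeling step in the abstract, selecting the examples with the highest and lowest scores for one emotion, can be sketched as below. The entailment scores are invented placeholders (in the paper they come from a textual entailment classifier), and the function name and the choice of k are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def select_weak_labels(scores: Dict[str, float], k: int) -> Tuple[List[str], List[str]]:
    """Pick the k highest-scoring texts as positives and the k lowest as negatives.

    k is capped at half the corpus size so the two sets never overlap.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(0, min(k, len(ranked) // 2))
    positives = ranked[:k]
    negatives = ranked[len(ranked) - k:]
    return positives, negatives


# Hypothetical entailment scores for the emotion "joy".
joy_scores = {
    "She laughed with delight.": 0.94,
    "He opened the door.": 0.31,
    "They wept at the grave.": 0.03,
    "The feast was a happy one.": 0.88,
}

positives, negatives = select_weak_labels(joy_scores, k=1)
print("weak positives:", positives)
print("weak negatives:", negatives)
```

    Repeating this per emotion gives the emotion-specific sets on which, according to the abstract, binary pseudo-labeling classifiers are then trained.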

    Labels

    • admiration: finds something admirable, impressive or worthy of respect

    • amusement: finds something funny, entertaining or amusing

    • anger: is angry, furious, or strongly displeased; displays ire, rage, or wrath

    • annoyance: is annoyed or irritated

    • approval: expresses a favorable opinion, approves, endorses or agrees with something or someone

    • boredom: feels bored, uninterested, monotony, tedium

    • calmness: is calm, serene, free from agitation or disturbance, experiences emotional tranquility

    • caring: cares about the well-being of someone else, feels sympathy, compassion, affectionate concern towards someone, displays kindness or generosity

    • courage: feels courage or the ability to do something that frightens one, displays fearlessness or bravery

    • curiosity: is interested, curious, or has strong desire to learn something

    • desire: has a desire or ambition, wants something, wishes for something to happen

    • despair: feels despair, helpless, powerless, loss or absence of hope, desperation, despondency

    • disappointment: feels sadness or displeasure caused by the non-fulfillment of hopes or expectations, or being let down, expresses regret due to the unfavorable outcome of a decision

    • disapproval: expresses an unfavorable opinion, disagrees or disapproves of something or someone

    • disgust: feels disgust, revulsion, finds something or someone unpleasant, offensive or hateful

    • doubt: has doubt or is uncertain about something, bewildered, confused, or shows lack of understanding

    • embarrassment: feels embarrassed, awkward, self-conscious, shame, or humiliation

    • envy: is covetous, feels envy or jealousy; begrudges or resents someone for their achievements, possessions, or qualities

    • excitement: feels excitement or great enthusiasm and eagerness

    • faith: expresses religious faith, has a strong belief in the doctrines of a religion, or trust in god

    • fear: is afraid or scared due to a threat, danger, or harm

    • frustration: feels frustrated: upset or annoyed because of inability to change or achieve something

    • gratitude: is thankful or grateful for something

    • greed: is greedy, rapacious, avaricious, or has selfish desire to acquire or possess more than what one needs

    • grief: feels grief or intense sorrow, or grieves for someone who has died

    • guilt: feels guilt, remorse, or regret to have committed wrong or failed in an obligation

    • indifference: is uncaring, unsympathetic, uncharitable, or callous, shows indifference, lack of concern, coldness towards someone

    • joy: is happy, feels joy, great pleasure, elation, satisfaction, contentment, or delight

    • love: feels love, strong affection, passion, or deep romantic attachment for someone

    • nervousness: feels nervous, anxious, worried, uneasy, apprehensive, stressed, troubled or tense

    • nostalgia: feels nostalgia, longing or wistful affection for the past, something lost, or for a period in one's life, feels homesickness, a longing for one's home, city, or country while being away; longing for a familiar place

    • optimism: feels optimism or hope, is hopeful or confident about the future, that something good may happen, or the success of something

    • pain: feels physical pain or experiences physical suffering

    • pride: is proud, feels pride from one's own achievements, self-fulfillment, or from the achievements of those with whom one is closely associated, or from qualities or possessions that are widely admired

    • relief: feels relaxed, relief from tension or anxiety

    • sadness: feels sadness, sorrow, unhappiness, depression, dejection

    • surprise: is surprised, astonished or shocked by something unexpected

    • trust: trusts or has confidence in someone, or believes that someone is good, honest, or reliable

    Dataset

    EmoLit (Zenodo)

    Code

    EmoLit Train (Github)

    Models
