39 datasets found
  1. Datasets for Sentiment Analysis

    • zenodo.org
    csv
    Updated Dec 10, 2023
    Cite
    Julie R. Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
    Available download formats
    csv
    Dataset updated
    Dec 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Julie R. Campos Arias (repository creator)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. It stores the datasets used in some of the studies that served as research material for the thesis, together with the datasets used in its experimental part.

    Below are the datasets specified, along with the details of their references, authors, and download sources.

    ----------- STS-Gold Dataset ----------------

    The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.

    Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.

    File name: sts_gold_tweet.csv

    ----------- Amazon Sales Dataset ----------------

    This dataset contains ratings and reviews for more than 1,000 Amazon products, as listed on the official Amazon website. The data was scraped from that website in January 2023.

    Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)

    Features:

    • product_id - Product ID
    • product_name - Name of the Product
    • category - Category of the Product
    • discounted_price - Discounted Price of the Product
    • actual_price - Actual Price of the Product
    • discount_percentage - Percentage of Discount for the Product
    • rating - Rating of the Product
    • rating_count - Number of people who voted for the Amazon rating
    • about_product - Description about the Product
    • user_id - ID of the user who wrote review for the Product
    • user_name - Name of the user who wrote review for the Product
    • review_id - ID of the user review
    • review_title - Short review
    • review_content - Long review
    • img_link - Image Link of the Product
    • product_link - Official Website Link of the Product

    License: CC BY-NC-SA 4.0

    File name: amazon.csv

    ----------- Rotten Tomatoes Reviews Dataset ----------------

    This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.

    This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).

    Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics

    File name: data_rt.csv
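
    Because the rows are ordered by class, they need shuffling before any train/test split. A minimal sketch using only the Python standard library; it assumes the CSV header names are reviews and labels as described above, the helper name load_and_shuffle and the seed are illustrative, and a small in-memory sample stands in for data_rt.csv:

```python
import csv
import io
import random


def load_and_shuffle(csv_file, seed=42):
    """Read (reviews, labels) rows from an open CSV file and shuffle them."""
    rows = list(csv.DictReader(csv_file))
    random.Random(seed).shuffle(rows)  # seeded so the order is reproducible
    return rows


# Tiny stand-in for open("data_rt.csv", newline=""):
# negatives first, then positives, mirroring the ordering described above.
sample = io.StringIO(
    "reviews,labels\n"
    "dull and lifeless,0\n"
    "a tedious mess,0\n"
    "warm and funny,1\n"
    "a quiet triumph,1\n"
)
shuffled = load_and_shuffle(sample)
```

    Using a dedicated random.Random instance keeps the shuffle reproducible without touching the global random state.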

    ----------- Preprocessed Dataset Sentiment Analysis ----------------

    Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
    Stemmed and lemmatized using nltk.
    Sentiment labels are generated using TextBlob polarity scores.

    The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).

    DOI: 10.34740/kaggle/dsv/3877817

    Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }

    This dataset was used in the experimental phase of my research.

    File name: EcoPreprocessed.csv
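
    The description says the division column is derived from the polarity score but does not give the rule. A hypothetical sketch of such a mapping, assuming a cut-off at zero and the label names positive, neutral and negative (none of which are confirmed by the dataset author):

```python
def division_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a categorical label.

    The zero cut-off and the label names are assumptions, not the
    dataset author's documented rule.
    """
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


labels = [division_from_polarity(p) for p in (0.6, 0.0, -0.25)]
```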

    ----------- Amazon Earphones Reviews ----------------

    This dataset consists of 9,930 Amazon reviews and star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine for sentiment analysis.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)

    License: U.S. Government Works

    Source: www.amazon.in

    File name (original): AllProductReviews.csv (contains 14337 reviews)

    File name (edited - used for my research): AllProductReviews2.csv (contains 9930 reviews)
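
    The manually added division column is described as a categorical label derived from ReviewStar. One way this could look, as a sketch only: the thresholds (4 to 5 stars positive, 3 neutral, 1 to 2 negative), the label names and the helper names are assumptions, not the author's documented procedure.

```python
def division_from_stars(stars: int) -> str:
    """Map a 1-5 star rating to a label (thresholds are assumed)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


def add_division(rows):
    """Return copies of the rows with an extra 'division' key."""
    return [
        {**row, "division": division_from_stars(int(row["ReviewStar"]))}
        for row in rows
    ]


# Illustrative rows; the real file has ReviewTitle, ReviewBody,
# ReviewStar and Product columns.
reviews = [
    {"ReviewTitle": "Great sound", "ReviewStar": "5"},
    {"ReviewTitle": "Stopped working", "ReviewStar": "1"},
    {"ReviewTitle": "Okay for the price", "ReviewStar": "3"},
]
labelled = add_division(reviews)
```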

    ----------- Amazon Musical Instruments Reviews ----------------

    This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.

    This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

    The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review, Unix time), reviewTime (time of the review, raw) and division (manually added; categorical label generated using the overall score).

    Source: http://jmcauley.ucsd.edu/data/amazon/

    File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)

    File name (edited - used for my research): Musical_instruments_reviews2.csv (contains 7137 reviews)

  2. Web Analytics Market By Solution (Search Engine Tracking And Ranking, Heat...

    • verifiedmarketresearch.com
    Updated Nov 15, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Web Analytics Market By Solution (Search Engine Tracking And Ranking, Heat Map Analytics), By Application (Social Media Management, Display Advertising Optimization), By Vertical (Banking, Financial Services And Insurance (BFSI), Retail), And Region for 2026-2032 [Dataset]. https://www.verifiedmarketresearch.com/product/web-analytics-market/
    Dataset updated
    Nov 15, 2024
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2026 - 2032
    Area covered
    Global
    Description

    Web Analytics Market was valued at USD 6.16 Billion in 2024 and is projected to reach USD 13.6 Billion by 2032, growing at a CAGR of 18.58% from 2026 to 2032.

    Web Analytics Market Drivers

    Data-Driven Decision Making: Businesses increasingly rely on data-driven insights to optimize their online strategies. Web analytics provides valuable data on website traffic, user behavior, and conversion rates, enabling data-driven decision-making.

    E-commerce Growth: The rapid growth of e-commerce has fueled the demand for web analytics tools to track online sales, customer behavior, and marketing campaign effectiveness.

    Mobile Dominance: The increasing use of mobile devices for internet browsing has made mobile analytics a crucial aspect of web analytics. Businesses need to understand how users interact with their websites and apps on mobile devices.

    Analytics tools can be complex to implement and use, requiring technical expertise.

  3. Product Review Datasets for User Sentiment Analysis

    • datarade.ai
    Updated Sep 28, 2018
    Cite
    Oxylabs (2018). Product Review Datasets for User Sentiment Analysis [Dataset]. https://datarade.ai/data-products/product-review-datasets-for-user-sentiment-analysis-oxylabs
    Available download formats
    .json, .xml, .csv, .xls
    Dataset updated
    Sep 28, 2018
    Dataset authored and provided by
    Oxylabs
    Area covered
    Libya, Argentina, Antigua and Barbuda, Sudan, South Africa, Barbados, Canada, Hong Kong, Italy, Egypt
    Description

    Product Review Datasets: Uncover user sentiment

    Harness the power of Product Review Datasets to understand user sentiment and insights deeply. These datasets are designed to elevate your brand and product feature analysis, help you evaluate your competitive stance, and assess investment risks.

    Data sources:

    • Trustpilot: datasets encompassing general consumer reviews and ratings across various businesses, products, and services.

    Leave the data collection challenges to us and dive straight into market insights with clean, structured, and actionable data, including:

    • Product name;
    • Product category;
    • Number of ratings;
    • Ratings average;
    • Review title;
    • Review body;

    Choose from multiple data delivery options to suit your needs:

    1. Receive data in easy-to-read formats like spreadsheets or structured JSON files.
    2. Select your preferred data storage solutions, including SFTP, Webhooks, Google Cloud Storage, AWS S3, and Microsoft Azure Storage.
    3. Tailor data delivery frequencies, whether on-demand or per your agreed schedule.

    Why choose Oxylabs?

    1. Fresh and accurate data: Access organized, structured, and comprehensive data collected by our leading web scraping professionals.

    2. Time and resource savings: Concentrate on your core business goals while we efficiently handle the data extraction process at an affordable cost.

    3. Adaptable solutions: Share your specific data requirements, and we'll craft a customized data collection approach to meet your objectives.

    4. Legal compliance: Partner with a trusted leader in ethical data collection. Oxylabs is a founding member of the Ethical Web Data Collection Initiative, aligning with GDPR and CCPA standards.

    Pricing Options:

    Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Join the ranks of satisfied customers who appreciate our meticulous attention to detail and personalized support. Experience the power of Product Review Datasets today to uncover valuable insights and enhance decision-making.

  4. Company Review

    • kaggle.com
    Updated Jan 19, 2024
    Cite
    Rudra (2024). Company Review [Dataset]. https://www.kaggle.com/datasets/rudra2/company-review/versions/1
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Jan 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rudra
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Rudra Anand

    Released under CC0: Public Domain


  5. sites-reviews.com Website Traffic, Ranking, Analytics [July 2025]

    • stb2.digiseotools.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). sites-reviews.com Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://stb2.digiseotools.com/website/sites-reviews.com/overview/
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://sem1.theseowheel.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    sites-reviews.com is ranked #46762 in RU with 131.04K Traffic. Categories: Online Services. Learn more about website traffic, market share, and more!

  6. bestproducts.reviews Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). bestproducts.reviews Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/bestproducts.reviews/overview/
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    bestproducts.reviews is ranked #23248 in US with 422.96K Traffic. Learn more about website traffic, market share, and more!

  7. Web Analytics Software Market by Deployment (Cloud-based, On-premise),...

    • verifiedmarketresearch.com
    Updated Dec 2, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Web Analytics Software Market by Deployment (Cloud-based, On-premise), Application (Behavioral Analytics, Performance Monitoring, SEO Tracking), End-user (BFSI, Retail, Healthcare, IT & Telecom), & Region for 2024-2031 [Dataset]. https://www.verifiedmarketresearch.com/product/web-analytics-software-analysis/
    Dataset updated
    Dec 2, 2024
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2024 - 2031
    Area covered
    Global
    Description

    Web Analytics Software Market size was valued at USD 2.95 Billion in 2024 and is projected to reach USD 9.40 Billion by 2031, growing at a CAGR of 15.60% from 2024 to 2031.

    The Web Analytics Software Market is primarily driven by the increasing need for businesses to optimize their online presence and improve customer experience. As companies focus on data-driven decisions, the demand for advanced analytics tools to track user behavior, measure website performance, and improve digital marketing strategies is growing.

    Additionally, the rise of e-commerce and mobile internet usage is accelerating the adoption of web analytics software. Businesses seek to understand customer preferences, enhance personalization, and boost conversion rates, further propelling market growth. The integration of AI and machine learning into analytics platforms also plays a significant role in enhancing predictive capabilities and automation.

  8. reviews.io Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). reviews.io Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/reviews.io/overview/
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    reviews.io is ranked #11243 in US with 2.26M Traffic. Categories: Online Services. Learn more about website traffic, market share, and more!

  9. smarter-reviews.com Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). smarter-reviews.com Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/smarter-reviews.com/overview/?source=trending-websites
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    smarter-reviews.com is ranked #19206 in US with 464.32K Traffic. Categories: Retail, Wellness. Learn more about website traffic, market share, and more!

  10. Global Website Analytics Tool Market Historical Impact Review 2025-2032

    • statsndata.org
    excel, pdf
    Updated Aug 2025
    Cite
    Stats N Data (2025). Global Website Analytics Tool Market Historical Impact Review 2025-2032 [Dataset]. https://www.statsndata.org/report/website-analytics-tool-market-136119
    Available download formats
    excel, pdf
    Dataset updated
    Aug 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Website Analytics Tool market has become a cornerstone for businesses and organizations striving to strengthen their online presence and leverage data for informed decision-making. As digital landscapes continue to evolve, these tools offer essential insights into user behavior, website performance, and overall

  11. amazon-reviews-sentiment-analysis

    • huggingface.co
    Cite
    fastai X Hugging Face Group 2022, amazon-reviews-sentiment-analysis [Dataset]. https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    fastai X Hugging Face Group 2022
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for amazon reviews for sentiment analysis

      Dataset Summary
    

    One of the most important problems in e-commerce is the correct calculation of the points given to after-sales products. The solution to this problem is to provide greater customer satisfaction for the e-commerce site, product prominence for sellers, and a seamless shopping experience for buyers. Another problem is the correct ordering of the comments given to the products. The prominence of misleading… See the full description on the dataset page: https://huggingface.co/datasets/hugginglearners/amazon-reviews-sentiment-analysis.

  12. u-review.in.th Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). u-review.in.th Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/u-review.in.th/overview/
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    u-review.in.th is ranked #12478 in TH with 45.54K Traffic. Categories: Education. Learn more about website traffic, market share, and more!

  13. airspace-review.com Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). airspace-review.com Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/airspace-review.com/overview/
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    airspace-review.com is ranked #1817 in ID with 623.69K Traffic. Categories: Airlines. Learn more about website traffic, market share, and more!

  14. Amazon Reviews

    • data.mendeley.com
    Updated Sep 13, 2021
    Cite
    Chinmay Khandelwal (2021). Amazon Reviews [Dataset]. http://doi.org/10.17632/3f7ws2cm4y.1
    Dataset updated
    Sep 13, 2021
    Authors
    Chinmay Khandelwal
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Customer reviews of Amazon products on Amazon website

  15. Data_Sheet_1_Topic evolution and sentiment comparison of user reviews on an...

    • frontiersin.figshare.com
    docx
    Updated Jun 2, 2023
    + more versions
    Cite
    Chaoyang Li; Shengyu Li; Jianfeng Yang; Jingmei Wang; Yiqing Lv (2023). Data_Sheet_1_Topic evolution and sentiment comparison of user reviews on an online medical platform in response to COVID-19: taking review data of Haodf.com as an example.DOCX [Dataset]. http://doi.org/10.3389/fpubh.2023.1088119.s001
    Available download formats
    docx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Frontiers
    Authors
    Chaoyang Li; Shengyu Li; Jianfeng Yang; Jingmei Wang; Yiqing Lv
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Throughout the COVID-19 pandemic, many patients have sought medical advice on online medical platforms. Review data have become an essential reference point for supporting users in selecting doctors. As the research object, this study considered Haodf.com, a well-known e-consultation website in China.

    Methods: This study examines the topics and sentimental change rules of user review texts from a temporal perspective. We also compared the topics and sentimental change characteristics of user review texts before and after the COVID-19 pandemic. First, 323,519 review data points about 2,122 doctors on Haodf.com were crawled using Python from 2017 to 2022. Subsequently, we employed the latent Dirichlet allocation method to cluster topics and the ROST content mining software to analyze user sentiments. Second, according to the results of the perplexity calculation, we divided text data into five topics: diagnosis and treatment attitude, medical skills and ethics, treatment effect, treatment scheme, and treatment process. Finally, we identified the most important topics and their trends over time.

    Results: Users primarily focused on diagnosis and treatment attitude, with medical skills and ethics being the second-most important topic among users. As time progressed, the attention paid by users to diagnosis and treatment attitude increased, especially during the COVID-19 outbreak in 2020, when attention to diagnosis and treatment attitude increased significantly. User attention to the topic of medical skills and ethics began to decline during the COVID-19 outbreak, while attention to treatment effect and scheme generally showed a downward trend from 2017 to 2022. User attention to the treatment process exhibited a declining tendency before the COVID-19 outbreak, but increased after. Regarding sentiment analysis, most users exhibited a high degree of satisfaction for online medical services. However, positive user sentiments showed a downward trend over time, especially after the COVID-19 outbreak.

    Discussion: This study has reference value for assisting user choice regarding medical treatment, decision-making by doctors, and online medical platform design.

  16. Airbnb dataset of barcelona city

    • kaggle.com
    Updated Nov 30, 2017
    Cite
    Faguilar-V (2017). Airbnb dataset of barcelona city [Dataset]. https://www.kaggle.com/datasets/fermatsavant/airbnb-dataset-of-barcelona-city/versions/1
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Nov 30, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Faguilar-V
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Barcelona
    Description

    Context

    The data was taken from http://tomslee.net/airbnb-data-collection-get-the-data. It was collected from the public Airbnb web site, and the code used is available at https://github.com/tomslee/airbnb-data-collection.

    Content

    room_id: A unique number identifying an Airbnb listing. The listing has a URL on the Airbnb web site of http://airbnb.com/rooms/room_id
    host_id: A unique number identifying an Airbnb host. The host’s page has a URL on the Airbnb web site of http://airbnb.com/users/show/host_id
    room_type: One of “Entire home/apt”, “Private room”, or “Shared room”
    borough: A subregion of the city or search area for which the survey is carried out. The borough is taken from a shapefile of the city that is obtained independently of the Airbnb web site. For some cities, there is no borough information; for others the borough may be a number. If you have better shapefiles for a city of interest, please send them to me.
    neighborhood: As with borough: a subregion of the city or search area for which the survey is carried out. For cities that have both, a neighbourhood is smaller than a borough. For some cities there is no neighbourhood information.
    reviews: The number of reviews that a listing has received. Airbnb has said that 70% of visits end up with a review, so the number of reviews can be used to estimate the number of visits. Note that such an estimate will not be reliable for an individual listing (especially as reviews occasionally vanish from the site), but over a city as a whole it should be a useful metric of traffic.
    overall_satisfaction: The average rating (out of five) that the listing has received from those visitors who left a review.
    accommodates: The number of guests a listing can accommodate.
    bedrooms: The number of bedrooms a listing offers.
    price: The price (in $US) for a night stay. In early surveys, there may be some values that were recorded by month.
    minstay: The minimum stay for a visit, as posted by the host.
    latitude and longitude: The latitude and longitude of the listing as posted on the Airbnb site: this may be off by a few hundred metres. I do not have a way to track individual listing locations with
    last_modified: the date and time that the values were read from the Airbnb web site.
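
    The reviews field above suggests estimating visits from review counts using the 70% figure attributed to Airbnb. A minimal sketch of that arithmetic at city level; the function name and the sample counts are illustrative, and, as the field description warns, the estimate is not reliable for an individual listing:

```python
def estimate_visits(review_counts, review_rate=0.70):
    """Estimate total visits for a city from per-listing review counts.

    review_rate is the share of visits assumed to end in a review
    (0.70 per the figure quoted in the field description).
    """
    if not 0 < review_rate <= 1:
        raise ValueError("review_rate must be in (0, 1]")
    return sum(review_counts) / review_rate


# Hypothetical review counts for four listings in one city.
city_reviews = [14, 0, 35, 21]
visits = estimate_visits(city_reviews)  # 70 reviews / 0.70, about 100 visits
```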
    
  17. Airline Reviews Dataset

    • kaggle.com
    Updated Mar 6, 2024
    Cite
    Sujal Suthar (2024). Airline Reviews Dataset [Dataset]. https://www.kaggle.com/datasets/sujalsuthar/airlines-reviews
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Mar 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sujal Suthar
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains reviews of the top 10 rated airlines in 2023 sourced from the Airline Quality (https://www.airlinequality.com) website. The reviews cover various aspects of the flight experience, including seat comfort, staff service, food and beverages, inflight entertainment, value for money, and overall rating. The dataset is suitable for sentiment analysis, customer satisfaction analysis, and other similar tasks.

    Usage
      • Download the dataset file airlines_reviews.csv.
      • Use the dataset for analysis, visualization, and machine learning tasks.

    List of Airlines
      1. Singapore Airlines
      2. Qatar Airways
      3. All Nippon Airways
      4. Emirates
      5. Japan Airlines
      6. Turkish Airlines
      7. Air France
      8. Cathay Pacific Airways
      9. EVA Air
      10. Korean Air

    This dataset is provided under the MIT License.

  18. Product Comparison Website Market Report | Global Forecast From 2025 To 2033...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Product Comparison Website Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-product-comparison-website-market
    Explore at:
    Available download formats: pptx, pdf, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Product Comparison Website Market Outlook



    The global product comparison website market size was valued at approximately USD 4.5 billion in 2023 and is expected to reach USD 9.3 billion by 2032, growing at a compound annual growth rate (CAGR) of 8.5% during the forecast period. This remarkable growth can be attributed primarily to the increasing reliance of consumers and businesses on digital platforms for making informed purchasing decisions. With the rise of e-commerce, there is a growing need for platforms that can offer comprehensive comparisons of products and services, thus enabling consumers to make better choices. The market's expansion is further fueled by technological advancements, the proliferation of internet connectivity, and the rising trend of digitalization across various sectors.
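
    The growth figures quoted above can be sanity-checked with the standard compound-growth formula. This is only a sketch: it assumes nine compounding periods (the 2023 base year through 2032), which the report does not state explicitly, and the function names are illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` for `years` periods."""
    return start_value * (1 + rate) ** years


periods = 2032 - 2023  # assumed number of compounding periods

implied_rate = cagr(4.5, 9.3, periods)       # from the quoted USD billions
projected_2032 = project(4.5, 0.085, periods)  # from the quoted 8.5% CAGR

print(f"Implied CAGR: {implied_rate:.2%}")
print(f"Projected 2032 value: USD {projected_2032:.2f} billion")
```

    Under that assumption the quoted start value, end value, and rate agree to within rounding.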



    The surge in e-commerce activities worldwide is a major growth driver for the product comparison website market. As more consumers turn to online shopping for convenience, the role of comparison websites becomes crucial in helping them choose products that best fit their needs and budgets. These platforms not only aid in price comparison but also provide detailed insights into product features, reviews, and ratings, thereby empowering consumers with all the necessary information to make well-informed purchases. Additionally, businesses leverage these platforms to monitor competitive pricing strategies, optimize their offerings, and improve customer engagement, further driving the demand for comparison websites.



    Technological advancements, particularly in artificial intelligence and machine learning, are significantly contributing to the market's growth. AI-powered algorithms enable these platforms to offer personalized suggestions, improve accuracy in product matching, and enhance user experience through intuitive interfaces. Furthermore, the integration of advanced analytics allows these platforms to process vast amounts of data efficiently, providing users with real-time updates on price changes, product availability, and consumer trends. This technological edge ensures that comparison websites remain relevant and valuable tools for consumers and businesses alike.



    Another critical growth factor is the increasing internet penetration and smartphone adoption globally. With more people getting connected to the internet and using smartphones, access to comparison platforms has become seamless. Mobile applications, in particular, have made it easier for users to compare products on-the-go, thus driving the market's expansion. This trend is especially prominent in emerging markets where mobile internet usage is rapidly growing, offering significant opportunities for market players to tap into a vast pool of potential users who are increasingly relying on digital platforms for their shopping needs.



    Competitor Price Monitoring is a crucial aspect for businesses leveraging product comparison websites. By keeping a close watch on competitor pricing strategies, businesses can dynamically adjust their own prices to maintain competitiveness in the market. This not only helps in attracting price-sensitive consumers but also aids in retaining existing customers by offering them the best possible deals. Furthermore, competitor price monitoring allows businesses to identify pricing trends and patterns, enabling them to anticipate market shifts and respond proactively. In an era where consumers have access to a plethora of options, staying ahead of the competition through effective price monitoring can significantly enhance a company's market position and profitability.



    Regionally, North America holds a significant share in the product comparison website market due to the high adoption rate of digital technologies and the presence of several key players. However, the Asia Pacific region is expected to witness the fastest growth during the forecast period, driven by the rapidly expanding e-commerce sector and increasing internet user base. Additionally, Europe remains a lucrative market with a strong emphasis on online retail, while Latin America and the Middle East & Africa present emerging opportunities with growing digitalization trends and consumer awareness.



    Type Analysis



    In the product comparison website market, the type segment plays a pivotal role, encompassing price comparison, feature comparison, and review-based comparison. Price comparison remains the most popular type, as consumers are increasingly cost-conscious and see

  19. nationalreview.com Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). nationalreview.com Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/nationalreview.com/overview/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    nationalreview.com is ranked #4277 in US with 2.66M Traffic. Categories: Newspapers. Learn more about website traffic, market share, and more!

  20. downloadhub.review Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). downloadhub.review Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/downloadhub.review/overview/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    downloadhub.review is ranked #3253071 in IN with 55 Traffic. Categories: . Learn more about website traffic, market share, and more!

Cite
Julie R. Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504

Datasets for Sentiment Analysis

Explore at:
Available download formats: csv
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Julie R. Campos Arias (Repository creator)
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. It stores the datasets used in some of the studies that served as research material for the thesis, together with the datasets used in its experimental part.

The datasets are listed below, along with their references, authors, and download sources.

----------- STS-Gold Dataset ----------------

The dataset consists of 2026 tweets. The file has 3 columns: id (the unique id), polarity (the polarity index of the text), and tweet (the tweet text).

Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.

File name: sts_gold_tweet.csv
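
A minimal sketch of reading a file with this three-column layout and counting tweets per polarity value. The in-memory sample, the comma delimiter, and the polarity codes are assumptions made for illustration; the real sts_gold_tweet.csv may use a different delimiter or coding.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for sts_gold_tweet.csv (same column names only).
SAMPLE = '''id,polarity,tweet
1,4,"great day, loving it"
2,0,missed my flight again
3,4,new album is amazing
'''


def read_tweets(handle, delimiter=","):
    """Return the rows of a tweet file as dictionaries keyed by column name."""
    return list(csv.DictReader(handle, delimiter=delimiter))


rows = read_tweets(io.StringIO(SAMPLE))
counts = Counter(row["polarity"] for row in rows)
print(counts)
```

For the real file, pass an open file handle instead of the `io.StringIO` sample and adjust `delimiter` if needed.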

----------- Amazon Sales Dataset ----------------

This dataset contains ratings and reviews for more than 1,000 Amazon products, as listed on Amazon's official website. The data was scraped from the site in January 2023.

Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)

Features:

  • product_id - Product ID
  • product_name - Name of the Product
  • category - Category of the Product
  • discounted_price - Discounted Price of the Product
  • actual_price - Actual Price of the Product
  • discount_percentage - Percentage of Discount for the Product
  • rating - Rating of the Product
  • rating_count - Number of people who voted for the Amazon rating
  • about_product - Description about the Product
  • user_id - ID of the user who wrote review for the Product
  • user_name - Name of the user who wrote review for the Product
  • review_id - ID of the user review
  • review_title - Short review
  • review_content - Long review
  • img_link - Image Link of the Product
  • product_link - Official Website Link of the Product

License: CC BY-NC-SA 4.0

File name: amazon.csv
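
The three price features above are related by simple arithmetic, which can serve as a consistency check on a row. This sketch assumes the prices have already been parsed to plain numbers; how amazon.csv actually stores them is not stated here, so the function and sample values are illustrative only.

```python
def discount_percentage(actual_price: float, discounted_price: float) -> float:
    """Percentage discount implied by an actual and a discounted price."""
    if actual_price <= 0:
        raise ValueError("actual_price must be positive")
    return (actual_price - discounted_price) / actual_price * 100


# Hypothetical row: actual price 1000, discounted price 750.
pct = discount_percentage(1000.0, 750.0)
print(f"{pct:.0f}% off")
```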

----------- Rotten Tomatoes Reviews Dataset ----------------

This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.

This data was collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).

Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics

File name: data_rt.csv
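
Because the rows are ordered by label, a reproducible shuffle is a sensible first step. The sketch below uses a small stand-in list with the same negative-then-positive ordering; the seed value and helper name are arbitrary choices, not part of the dataset.

```python
import random


def shuffled(rows, seed=42):
    """Return a shuffled copy of rows; a fixed seed keeps runs reproducible."""
    rows = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(rows)
    return rows


# Stand-in for data_rt.csv: (review, label) pairs, all negatives first.
ordered = [(f"negative review {i}", 0) for i in range(5)]
ordered += [(f"positive review {i}", 1) for i in range(5)]

mixed = shuffled(ordered)
print([label for _, label in mixed])
```

Shuffling before any train/test split keeps both splits from ending up with a single class.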

----------- Preprocessed Dataset Sentiment Analysis ----------------

Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.

The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).

DOI: 10.34740/kaggle/dsv/3877817

Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }

This dataset was used in the experimental phase of my research.

File name: EcoPreprocessed.csv
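
One plausible way a categorical division could be derived from a polarity score is sketched below. TextBlob polarity ranges from -1 to 1, but the cut-offs and the label names used here are assumptions; the dataset description does not say which ones were actually applied.

```python
def division_from_polarity(polarity: float, neutral_band: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a categorical label.

    Scores within +/- neutral_band of zero are treated as neutral.
    The thresholds and label names are illustrative assumptions.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


for score in (0.5, 0.0, -0.2):
    print(score, division_from_polarity(score))
```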

----------- Amazon Earphones Reviews ----------------

This dataset consists of 9930 Amazon reviews, with star ratings, for the 10 latest (as of mid-2019) Bluetooth earphone devices. It is intended for learning how to train machine learning models for sentiment analysis.

This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)

License: U.S. Government Works

Source: www.amazon.in

File name (original): AllProductReviews.csv (contains 14337 reviews)

File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)

----------- Amazon Musical Instruments Reviews ----------------

This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.

This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.

The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review - unix time), reviewTime (time of the review - raw) and division (manually added - categorical label generated using overall score).

Source: http://jmcauley.ucsd.edu/data/amazon/

File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)

File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
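
The unixReviewTime column can be turned into a readable date with the standard library. This sketch assumes the values are seconds since the Unix epoch and interprets them in UTC; the helper name and the sample timestamp are illustrative.

```python
from datetime import datetime, timezone


def review_date(unix_review_time: int) -> str:
    """Convert a Unix timestamp in seconds to an ISO date string (UTC)."""
    moment = datetime.fromtimestamp(unix_review_time, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


# Hypothetical timestamp, not taken from the dataset.
print(review_date(1393545600))
```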
