https://www.gnu.org/licenses/gpl-3.0.html
This repository contains performance measures of dataset ranking models.
Usage: from Results/src run Python results m1 m2 ... where each mi can be omitted or be any element of the list of model labels ['bayesian-12C', 'bayesian-5L', 'bayesian-5L12C', 'cos-12C', 'cos-5L', 'cos-5L5C', 'j48-12C', 'j48-5L', 'j48-5L5C', 'jrip-12C', 'jrip-5L', 'jrip-5L5C', 'sn-12C', 'sn-5L', 'sn-5L12C']. Results of the selected models are plotted in a 2D line plot. If no model is provided, all models are listed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘QS World University Rankings 2017 - 2022’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/padhmam/qs-world-university-rankings-2017-2022 on 13 February 2022.
--- Dataset description provided by original source is as follows ---
QS World University Rankings is an annual publication of global university rankings by Quacquarelli Symonds. The QS ranking receives approval from the International Ranking Expert Group (IREG), and is viewed as one of the three most-widely read university rankings in the world. QS publishes its university rankings in partnership with Elsevier.
This dataset contains university data from the year 2017 to 2022. It has a total of 15 features.
- university: name of the university
- year: year of ranking
- rank_display: rank given to the university
- score: score of the university based on the six key metrics mentioned above
- link: link to the university profile page on the QS website
- country: country in which the university is located
- city: city in which the university is located
- region: continent in which the university is located
- logo: link to the logo of the university
- type: type of university (public or private)
- research_output: quality of research at the university
- student_faculty_ratio: number of students per faculty member
- international_students: number of international students enrolled at the university
- size: size of the university in terms of area
- faculty_count: number of faculty or academic staff at the university
This dataset was acquired by scraping the QS World University Rankings website with Python and Selenium.
Some of the questions that can be answered with this dataset:
1. What makes a top-ranked university?
2. Does the location of a university play a role in its ranking?
3. What do the best universities have in common?
4. How important is academic research for a university?
5. Which country is preferred by international students?
--- Original source retains full ownership of the source dataset ---
This dataset was created by DNS_dataset
https://brightdata.com/license
Unlock valuable insights with our comprehensive TripAdvisor Dataset, designed for businesses, analysts, and researchers to track customer reviews, ratings, and travel trends. This dataset provides structured and reliable data from TripAdvisor to enhance market research, competitive analysis, and customer satisfaction strategies.
Dataset Features
- Business Listings: Access detailed information on hotels, restaurants, attractions, and other businesses, including names, locations, categories, and contact details.
- Customer Reviews & Ratings: Extract user-generated reviews, star ratings, review dates, and sentiment analysis to understand customer experiences and preferences.
- Pricing & Booking Data: Track pricing trends, availability, and booking options for hotels, flights, and travel services.
- Location & Geographical Insights: Analyze travel trends by region, city, or country to identify popular destinations and emerging markets.
Customizable Subsets for Specific Needs
Our TripAdvisor Dataset is fully customizable, allowing you to filter data based on location, business type, review sentiment, or specific keywords. Whether you need broad coverage for industry analysis or focused data for customer insights, we tailor the dataset to your needs.
Popular Use Cases
- Customer Satisfaction & Brand Monitoring: Track customer feedback, analyze sentiment, and improve service offerings based on real user reviews.
- Market Research & Competitive Analysis: Compare business performance, monitor competitor reviews, and identify industry trends.
- Travel & Hospitality Insights: Analyze travel patterns, popular destinations, and seasonal trends to optimize marketing strategies.
- AI & Machine Learning Applications: Use structured review data to train AI models for sentiment analysis, recommendation engines, and predictive analytics.
- Pricing Strategy & Revenue Optimization: Monitor pricing trends and customer demand to optimize pricing strategies for hotels, restaurants, and travel services.
Whether you're analyzing customer sentiment, tracking travel trends, or optimizing business strategies, our TripAdvisor Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Samsung Customer Reviews Dataset contains 1,000 customer reviews of various Samsung products, including smartphones, tablets, TVs, and smartwatches. It includes user feedback, ratings, and timestamps, which are useful for sentiment analysis, customer satisfaction surveys, and product quality assessment.
2) Data Utilization
(1) The Samsung Customer Reviews Dataset has the following characteristics:
• It contains structured text and numerical information for each review, including product name, username, rating, review title, review body, and creation date, which allows detailed analysis of individual reviews.
(2) The Samsung Customer Reviews Dataset can be used for:
• Customer opinion analysis and sentiment classification: Review texts and ratings can be used to identify positive and negative customer sentiment and the major complaints and compliments about Samsung products, and to inform product improvements and marketing strategies.
• Comparison of satisfaction and trend analysis by product: By analyzing review data by product group and period, market trends such as popular products, changes in customer preferences, and repeatedly mentioned issues can be derived and used for competitor analysis or new product planning.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORRUPTION RANK reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: With the growth of e-commerce, more people are buying products over the internet. To increase customer satisfaction, merchants provide spaces for product and service reviews. Products with positive reviews attract customers, while products with negative reviews lose customers. Exploiting this, some individuals and corporations write fake reviews to promote their products and services or to defame their competitors. The difficulty in finding these reviews lies in the large amount of information available. One solution is to use data mining techniques and tools, such as classification. In this context, the present work evaluates classification techniques to identify fake reviews about products and services on the Internet. The research also presents a systematic literature review on fake reviews. The research used 8 classification algorithms. The algorithms were trained and tested with a hotels database. The CONCENSO algorithm presented the best result, with 88% in the precision indicator. After the first test, the algorithms classified reviews from another hotels database. To compare the results of this new classification, the Review Skeptic algorithm was used. The SVM and GLMNET algorithms showed the highest agreement with the Review Skeptic algorithm, classifying 83% of reviews with the same result. The research contributes by demonstrating the algorithms' ability to recognize consumers' real reviews of products and services on the Internet. Another contribution is that it pioneers the investigation of fake reviews in Brazil and in production engineering.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for GDP reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
2k Ranked Images
This dataset contains roughly two thousand images ranked from most preferred to least preferred based on human feedback on pairwise comparisons (>25k responses). The generated images, which are a sample from the open-image-preferences-v1 dataset from the team @data-is-better-together, are rated purely based on aesthetic preference, disregarding the prompt used for generation. We provide the categories of the original dataset for easy filtering. This is a new… See the full description on the dataset page: https://huggingface.co/datasets/Rapidata/2k-ranked-images-open-image-preferences-v1.
This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
New reviews:
Metadata: - We have added transaction metadata for each review shown on the review page.
If you publish articles based on this dataset, please cite the following paper:
Movehub city ranking as published on http://www.movehub.com/city-rankings
Cities ranked by
Movehub Rating: A combination of all scores for an overall rating for a city or country.
Purchase Power: This compares the average cost of living with the average local wage.
Health Care: Compiled from how citizens feel about their access to healthcare, and its quality.
Pollution: Low is good. A score of how polluted people find a city, including air, water and noise pollution.
Quality of Life: A balance of healthcare, pollution, purchase power, crime rate to give an overall quality of life score.
Crime Rating: Low is good. The lower the score the safer people feel in this city.
Unit: GBP
City
Cappuccino
Cinema
Wine
Gasoline
Avg Rent
Avg Disposable Income
Cities to countries as parsed from Wikipedia https://en.wikipedia.org/wiki/List_of_towns_and_cities_with_100,000_or_more_inhabitants/cityname:_A (A-Z)
http://www.movehub.com/city-rankings
https://en.wikipedia.org/wiki/List_of_towns_and_cities_with_100,000_or_more_inhabitants/cityname:_A
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Introduction
This dataset is the largest real-world consistency-ensured dataset for peer review, which features the widest range of conferences and the most complete review stages, including initial submissions, reviews, ratings and confidence, aspect ratings, rebuttals, discussions, score changes, meta-reviews, and final decisions.
Comparison with Existing Datasets
The comparison between our proposed dataset and existing peer review datasets is given below. Only the… See the full description on the dataset page: https://huggingface.co/datasets/Daoze/ReviewRebuttal.
The IMDb Movie Reviews dataset is a binary sentiment analysis dataset consisting of 50,000 reviews from the Internet Movie Database (IMDb) labeled as positive or negative. The dataset contains an even number of positive and negative reviews. Only highly polarizing reviews are considered. A negative review has a score ≤ 4 out of 10, and a positive review has a score ≥ 7 out of 10. No more than 30 reviews are included per movie. The dataset contains additional unlabeled data.
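The labeling rule and the per-movie cap described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `label_from_score` and `cap_per_movie` are hypothetical, scores 5 and 6 are treated as excluded because neither threshold covers them, and keeping the first 30 labeled reviews per movie is an assumed selection order that the description does not specify.

```python
from collections import defaultdict
from typing import Iterable, List, Optional, Tuple


def label_from_score(score: int) -> Optional[str]:
    """Map a review score out of 10 to a sentiment label, or None if excluded."""
    if score <= 4:
        return "negative"
    if score >= 7:
        return "positive"
    return None  # scores 5 and 6 match neither rule, so they carry no label


def cap_per_movie(reviews: Iterable[Tuple[str, int]], limit: int = 30) -> List[Tuple[str, str]]:
    """Keep at most `limit` labeled reviews per movie.

    Assumption: reviews are kept in input order; the source only states the cap.
    """
    counts = defaultdict(int)
    kept = []
    for movie_id, score in reviews:
        label = label_from_score(score)
        if label is not None and counts[movie_id] < limit:
            counts[movie_id] += 1
            kept.append((movie_id, label))
    return kept
```

For example, `cap_per_movie([("m1", 9), ("m1", 5), ("m2", 2)])` keeps the first and third reviews and drops the score-5 review.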
A list of some key resources for comparing London with other world cities.
European Union/Eurostat, Urban Audit
Arcadis, Sustainable cities index
AT Kearney, Global Cities Index
McKinsey, Urban world: Mapping the economic power of cities
Knight Frank, Wealth report
OECD, Better Life Index
UNODC, Statistics on drugs, crime and criminal justice at the international level
Economist, Hot Spots
Economist, Global Liveability Ranking and Report August 2014
Mercer, Quality of Living Reports
Forbes, World's most influential cities
Mastercard, Global Destination Cities Index
https://cubig.ai/store/terms-of-service
1) Data introduction
• The Apple-iphone-se-reviews dataset was scraped from the Flipkart website using Selenium and BeautifulSoup.
2) Data utilization
(1) Apple-iphone-se-reviews data has the following characteristics:
• The data are user ratings for the Apple iPhone SE from the Indian e-commerce website Flipkart. The aim is NLP text classification using user ratings, review titles, and review text.
(2) Apple-iphone-se-reviews data can be used for:
• Rating prediction: Machine learning models that predict ratings from review text can support automated review analysis and summarization.
• Product improvement: Insights gained from reviews can help identify common issues and areas for improvement in the iPhone SE and guide product development and quality improvements.
The data files can be viewed in Excel.
The datasets are machine learning data, in which queries and urls are represented by IDs. The datasets consist of feature vectors extracted from query-url pairs along with relevance judgment labels:
(1) The relevance judgments are obtained from a retired labeling set of a commercial web search engine (Microsoft Bing), which take 5 values from 0 (irrelevant) to 4 (perfectly relevant).
(2) The features are basically extracted by us, and are those widely used in the research community.
In the data files, each row corresponds to a query-url pair. The first column is the relevance label of the pair, the second column is the query id, and the following columns are features. The larger the relevance label, the more relevant the query-url pair. A query-url pair is represented by a 136-dimensional feature vector.
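As a rough illustration of this row layout, the Python sketch below parses one row into a relevance label, a query id, and a 136-dimensional feature vector. It assumes a LETOR/SVMlight-style text encoding (`<label> qid:<id> 1:<v1> 2:<v2> ...`) with 1-based feature indices, which the description above does not spell out; the function name `parse_row` is hypothetical.

```python
from typing import List, Tuple

NUM_FEATURES = 136  # dimensionality stated in the dataset description


def parse_row(line: str, num_features: int = NUM_FEATURES) -> Tuple[int, str, List[float]]:
    """Parse one query-url row into (relevance label, query id, features).

    Assumed layout (not confirmed by the description):
    "<label> qid:<id> 1:<v1> 2:<v2> ... <n>:<vn>"
    """
    tokens = line.split()
    label = int(tokens[0])  # first column: relevance label
    if not 0 <= label <= 4:
        raise ValueError(f"relevance label out of range 0..4: {label}")
    qid = tokens[1].split(":", 1)[-1]  # second column: query id, with or without a "qid:" prefix
    features = [0.0] * num_features    # features missing from the row default to 0.0
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index) - 1] = float(value)  # assumed 1-based feature index
    return label, qid, features
```

Grouping the parsed rows by query id then gives the per-query lists that ranking models are typically trained and evaluated on.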
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The WONDERBREAD dataset contains 2,928 human demonstrations of 598 web navigation workflows across 6 types of BPM tasks. These tasks measure the ability of a model to generate accurate documentation, assist in knowledge transfer, and improve the efficiency of workflows.
Please see our website for more details: https://wonderbread.stanford.edu/
To start, download debug_demos.zip (~1 GB). It contains a subset of 24 demonstrations which can give you a sense of how the dataset is structured.
To reproduce the paper, download gold_demos.zip (~33 GB). It contains 724 demonstrations corresponding to the 162 "Gold" tasks which were used for all the evaluations in the original paper.
To obtain the full dataset, download demos.zip (~133 GB). This contains all 2,928 demonstrations and can be used for training, fine-tuning, and evaluating models.
The dataset contains several files, defined below.
https://www.gnu.org/copyleft/gpl.html
The entity relatedness problem refers to the question of computing the relationship paths that better describe the connectivity between a given entity pair. This dataset supports the evaluation of approaches that address the entity relatedness problem. It covers two familiar domains, music and movies, and uses data available in IMDb and last.fm, which are popular reference datasets in these domains. The dataset contains 20 entity pairs from each of these domains and, for each entity pair, a ranked list with 50 relationship paths. It also contains entity ratings and property relevance scores for the entities and properties used in the paths. (This version supersedes the previous one.)
https://choosealicense.com/licenses/other/
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.
More than 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in the AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).
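Because the files are tab delimited with no quote or escape characters, a reader should split on tabs only and treat quotes and backslashes as ordinary text. A minimal Python sketch of this, using the standard `csv` module with quoting disabled, is shown below; the function name `read_reviews` is hypothetical, and whether the first line is a header row is not stated here, so rows are returned as plain lists of strings.

```python
import csv
import io
from typing import Iterator, List, TextIO


def read_reviews(handle: TextIO) -> Iterator[List[str]]:
    """Yield each line of a tab-delimited review file as a list of raw fields."""
    # QUOTE_NONE makes the reader treat quote characters as ordinary text,
    # matching the "no quote and escape characters" format described above.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row


# Usage with an in-memory sample; the field values are made up for illustration.
sample = io.StringIO('US\t5\tSaid "great" \\ works\n')
rows = list(read_reviews(sample))
```

For a file on disk, the `csv` documentation recommends opening it with `newline=""`; the file path and text encoding would need to be checked against the actual release.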
Each Dataset contains the following columns: