Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a comprehensive collection of features for building a content-based recommendation system in an ecommerce environment. Content filtering, which relies on users' interests and past activities, is a prevalent method for suggesting products tailored to individual preferences.
Each entry in the dataset represents a product along with various attributes that can be leveraged for recommendation purposes. Here's an overview of the features included:
1) Number of clicks on similar products: Indicates the popularity or engagement level of similar items.
2) Number of similar products purchased so far: Reflects the conversion rate of similar products.
3) Average rating given to similar products: Offers insight into the perceived quality of comparable items.
4) Gender: Allows for gender-specific recommendations.
5) Median purchasing price (in rupees): Provides pricing information for segmentation or pricing strategy analysis.
6) Rating of the product: The rating of the product itself, indicating its overall quality.
7) Brand of the product: Brand loyalty or preference can influence recommendations.
8) Customer review sentiment score (overall): Sentiment analysis of customer reviews, indicating overall satisfaction.
9) Price of the product: The actual price of the product.
10) Holiday: Seasonal or holiday-specific buying patterns.
11) Season: Seasonal preferences may influence product choices.
12) Geographical locations: Regional preferences or availability may impact recommendations.
13) Probability for the product to be recommended to the person: The likelihood of recommending the product to a specific user based on their profile and past behavior.
With this rich set of features, businesses can implement sophisticated recommendation algorithms to personalize the shopping experience for users, ultimately leading to increased customer satisfaction, engagement, and sales.
A book recommendation system is a type of recommendation system that recommends similar books to readers based on their interests. Book recommendation systems are used by online services that provide ebooks, such as Google Play Books, Open Library, and Goodreads.
During the last few decades, with the rise of YouTube, Amazon, Netflix and many other such web services, recommender systems have taken an ever larger place in our lives. From e-commerce (suggesting to buyers articles that could interest them) to online advertisement (suggesting to users the right content, matching their preferences), recommender systems are today unavoidable in our daily online journeys. In a very general way, recommender systems are algorithms aimed at suggesting relevant items to users (items being movies to watch, text to read, products to buy or anything else depending on the industry). Recommender systems are critical in some industries, as they can generate a huge amount of income when they are efficient or be a way to stand out significantly from competitors. As proof of the importance of recommender systems, we can mention that, a few years ago, Netflix organised a challenge (the “Netflix Prize”) where the goal was to produce a recommender system that performed better than its own algorithm, with a prize of 1 million dollars to win. Using this simple dataset and the related tasks and notebooks, we will progressively go through different paradigms of recommender algorithms. For each of them, we will present how they work, describe their theoretical basis and discuss their strengths and weaknesses.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
In this study, we built a personalized hybrid course recommendation system (PHCRS) that considers students’ interests, abilities and career development. To meet students’ individual needs, we adopted the five most widely used algorithms, including content-based filtering, popularity-based methods, item-based collaborative filtering, user-based collaborative filtering, and score-based methods, to build a PHCRS. First, we collected course syllabi and labeled each course (e.g., knowledge/skills taught, basic/advanced level). Next, we used course labels and students’ past course selections and grades to train five recommendation models. To evaluate the accuracy of the system, we performed experiments with students in the Department of Electrical and Computer Engineering, which provides 1794 courses for 925 students, using the receiver operating characteristic (ROC) curve and normalized discounted cumulative gain (NDCG) as metrics. The results showed that our proposed system can achieve accuracies of 80% for ROC and 90% for NDCG. We invited 46 participants to test our system and complete a questionnaire. Overall, 60 to 70% of participants were interested in the recommended courses, while the course recommendation lists produced by content-based filtering were in line with 67.40% of students’ actual course preferences. This study also found that students were more interested in courses at the top of the recommendation lists, and more students were autonomously motivated than held extrinsic informational motivation across the five recommendation methods. These findings highlighted that the proposed course recommendation system can help students choose the courses that interest them most.
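The abstract names NDCG as a metric without giving its formula. As a rough illustration only, the sketch below uses the common definition with a log2 position discount; whether the study used this exact variant is an assumption, and the function names `dcg` and `ndcg` are hypothetical.

```python
import math

def dcg(relevances, k):
    """Discounted cumulative gain of the first k items (log2 position discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering; 0.0 if nothing is relevant."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance of each recommended course, in the order it was shown (made-up values).
perfect_order = ndcg([3, 2, 1], k=3)
swapped_order = ndcg([0, 1], k=2)
```

A list that is already sorted by relevance scores 1.0, and pushing the relevant course down the list lowers the score, which matches the observation that positions near the top matter most.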
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Wyze Rule Recommendation Dataset
Dataset Summary
The Wyze Rule dataset is a new large-scale dataset designed specifically for smart home rule recommendation research. It contains over 1 million rules generated by 300,000 users from Wyze Labs, offering an extensive collection of real-world automation rules tailored to users' unique smart home setups. The goal of the Wyze Rule dataset is to advance research and development of personalized rule recommendation systems for… See the full description on the dataset page: https://huggingface.co/datasets/wyzelabs/RuleRecommendation.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset for news recommendation was collected from the anonymized behavior logs of a news website. 1 million users who had at least 5 news clicks during the 6 weeks from October 12 to November 22, 2019 were randomly sampled. To protect user privacy, each user is de-linked from the production system when securely hashed into an anonymized ID. Also collected were the news click behaviors of these users in this period, which are formatted into impression logs. The impression logs in the last week were used for testing, and the logs in the fifth week for training. For samples in the training set, the click behaviors in the first four weeks were used to construct the news click history for user modeling. Among the training data, the samples on the last day of the fifth week were used as a validation set.
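The week-based split described above can be sketched as a small date lookup. This is only an illustration: the function name `assign_split` and the split labels are hypothetical, and it assumes that week 1 starts on October 12, 2019 and that each week spans seven consecutive days.

```python
from datetime import date

START = date(2019, 10, 12)   # first day of the 6-week window
TOTAL_DAYS = 6 * 7           # October 12 to November 22, inclusive

def assign_split(day):
    """Map a calendar day to the role its logs play in the split."""
    offset = (day - START).days
    if not 0 <= offset < TOTAL_DAYS:
        raise ValueError(f"{day} is outside the collection window")
    week = offset // 7 + 1
    if week <= 4:
        return "history"      # click history used for user modeling
    if week == 5:
        # the last day of the fifth week is held out for validation
        return "validation" if offset % 7 == 6 else "train"
    return "test"             # sixth (last) week

splits = {d: assign_split(d) for d in (date(2019, 11, 8), date(2019, 11, 9),
                                        date(2019, 11, 15), date(2019, 11, 16))}
```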
behaviors.tsv
The behaviors.tsv file contains the impression logs and users' news click histories. It has 5 tab-separated columns.
news.tsv
The news.tsv file contains the detailed information of the news articles involved in the behaviors.tsv file. It has 8 tab-separated columns:
- News ID
- Category
- SubCategory
- Title
- Abstract
- URL
- Title Entities (entities contained in the title of this news)
- Abstract Entities (entities contained in the abstract of this news)
entity_embedding.vec & relation_embedding.vec
The entity_embedding.vec and relation_embedding.vec files contain the 100-dimensional embeddings of the entities and relations learned from the subgraph (from the WikiData knowledge graph) by the TransE method. In both files, the first column is the ID of the entity/relation, and the other columns are the embedding vector values. We hope this data can facilitate the research of knowledge-aware news recommendation. An example is shown as follows:
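As a separate illustration of the file layout just described (not the dataset's own example), a parser for such a file might look like the sketch below. The function name `load_embeddings`, the whitespace delimiter, and the sample rows are assumptions made for the example.

```python
import io

def load_embeddings(handle):
    """Parse lines of the form '<ID> <v1> <v2> ...' into a dict {ID: [floats]}."""
    embeddings = {}
    for line in handle:
        parts = line.split()  # splits on tabs or spaces; the delimiter is an assumption
        if not parts:
            continue          # skip blank lines
        embeddings[parts[0]] = [float(value) for value in parts[1:]]
    return embeddings

# Hypothetical two-row sample with 3-dimensional vectors (the real files use 100 dimensions).
sample = io.StringIO("Q1\t0.1\t-0.2\t0.3\nQ2\t0.0\t0.5\t-0.5\n")
vectors = load_embeddings(sample)
```

For the real files, the same function could be called on an opened file handle, for example `open("entity_embedding.vec", encoding="utf-8")`.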
According to our latest research, the global market size for Recommendation Engine for Media reached USD 4.2 billion in 2024, with a robust compound annual growth rate (CAGR) of 28.4% expected through the forecast period. By 2033, the market is projected to reach USD 34.6 billion, driven by the escalating adoption of AI-powered personalization across media platforms. The surge in digital content consumption, coupled with the necessity for tailored user experiences, is catalyzing unprecedented growth in this sector.
A primary growth factor for the Recommendation Engine for Media market is the exponential increase in digital media consumption worldwide. As consumers gravitate towards online streaming, social media, and digital news platforms, media companies are compelled to deploy advanced recommendation engines to enhance user engagement and retention. The proliferation of smartphones and the availability of high-speed internet have further fueled this trend, making personalized content delivery not just a value-add but a necessity for competitive differentiation. Media giants and emerging platforms alike are investing heavily in recommendation technologies to analyze user preferences, viewing history, and behavior patterns, ensuring that content discovery remains seamless and relevant.
Another significant driver is the rapid advancement and integration of artificial intelligence and machine learning algorithms within recommendation systems. These technologies enable media platforms to process vast datasets in real-time, generating highly accurate and dynamic content suggestions. The transition from traditional rule-based systems to AI-driven models, such as collaborative filtering and hybrid approaches, has revolutionized the way users interact with media content. This evolution has led to increased user satisfaction, longer session durations, and higher conversion rates for premium services, thereby directly impacting the revenue streams of media companies and OTT platforms.
The expanding ecosystem of Over-the-Top (OTT) platforms and the intensifying competition among broadcasters and publishers are also pivotal to market growth. As content libraries become more extensive, the challenge of content discoverability intensifies, making recommendation engines indispensable. Additionally, the rise of targeted online advertising, which relies on user data and behavioral insights, has created new avenues for monetization and audience segmentation. Media organizations are leveraging recommendation engines not only for content curation but also to optimize advertising strategies, driving both user engagement and advertising revenues.
The integration of Media and Entertainment AI is transforming the landscape of recommendation engines, offering unprecedented opportunities for personalization and user engagement. AI technologies are enabling media companies to analyze vast amounts of data, including user preferences, viewing habits, and social interactions, to deliver highly tailored content recommendations. This level of personalization is not only enhancing user satisfaction but also driving significant increases in content consumption and platform loyalty. As AI continues to evolve, its role in shaping media experiences is expected to grow, with potential applications ranging from real-time content adaptation to predictive analytics that anticipate user needs before they even arise.
From a regional perspective, North America continues to dominate the Recommendation Engine for Media market, attributed to the presence of leading technology providers, early adoption of AI, and a mature digital infrastructure. However, the Asia Pacific region is witnessing the fastest growth, fueled by increasing internet penetration, a burgeoning middle class, and the rapid expansion of local digital media platforms. Europe follows closely, with significant investments in media technology and data privacy regulations shaping the deployment of recommendation solutions. Latin America and the Middle East & Africa are also emerging as promising markets, although their growth trajectories are comparatively nascent due to infrastructural and regulatory challenges.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
What does this dataset contain?
This dataset comprises nearly 900 million organically selected, time-stamped listening events from 4 million anonymized Deezer users recorded in 2023. It covers 50,000 anonymized songs, among the platform’s most popular, along with their multimodal pre-trained embedding vectors (Audio and SVD) generated by our internal model. All files are provided in Parquet format, readable with the `pandas.read_parquet` function.
What could this dataset be used for?
This dataset can be applied to multimodal collaborative filtering and multimodal sequential recommendation tasks, including both next-item and next-session prediction.
Citation
If you use this dataset, please cite the following paper:
@inproceedings{tran-recsys2025,
title={"Beyond the past": Leveraging Audio and Human Memory for Sequential Music Recommendation},
author={Viet-Anh Tran and Bruno Sguerra and Gabriel Meseguer-Brocal and Lea Briand and Manuel Moussallam},
booktitle = {Proceedings of the 19th ACM Conference on Recommender Systems},
year = {2025}
}
https://www.datainsightsmarket.com/privacy-policy
The Machine Learning Recommendation Algorithm market is poised for significant expansion, projected to reach an estimated USD 22,500 million by 2025, with a robust Compound Annual Growth Rate (CAGR) of 18.5% anticipated through 2033. This growth is primarily propelled by the escalating demand for personalized user experiences across diverse sectors, including entertainment and retail. The ability of ML recommendation engines to analyze vast datasets and deliver tailored suggestions enhances customer engagement, drives sales, and optimizes content consumption, making them indispensable tools for businesses seeking to gain a competitive edge. Advancements in AI and natural language processing further fuel this trend, enabling more sophisticated and context-aware recommendations. The increasing adoption of cloud-based solutions and the proliferation of data-generating devices also contribute to the market's upward trajectory.
Despite the strong growth, the market faces certain restraints. The complexity of implementing and maintaining sophisticated recommendation systems, coupled with the need for specialized data science expertise, can pose challenges for smaller enterprises. Furthermore, concerns surrounding data privacy and algorithmic bias necessitate careful development and deployment of these technologies. However, the overwhelming benefits of improved customer satisfaction and increased revenue streams are expected to outweigh these challenges, driving continued innovation and adoption. Key players like Microsoft, Recombee, and Alibaba are heavily investing in R&D to refine their offerings, developing advanced algorithms that cater to evolving consumer expectations and emerging technological landscapes. The market is segmented into service and solution types, with the service segment likely to experience higher growth due to the increasing demand for managed recommendation services.
This report provides an in-depth analysis of the global Machine Learning Recommendation Algorithm market, forecasting its trajectory from 2019-2033 with a Base Year of 2025 and a Forecast Period spanning 2025-2033. The Study Period covers 2019-2033, with a keen focus on the Estimated Year of 2025 and the Historical Period of 2019-2024. We estimate the market size to be in the tens of millions by the Base Year, with projected growth into the hundreds of millions by the end of the Forecast Period.
https://www.archivemarketresearch.com/privacy-policy
The AI-based Recommendation Engine market is projected to reach USD 3,226 million by 2033, growing at a CAGR of XX% from 2025 to 2033. This growth is attributed to the increasing adoption of AI technologies by businesses to enhance customer engagement and drive sales. AI-based recommendation engines analyze user data to provide personalized product or content recommendations, leading to increased user satisfaction and conversion rates. Key market drivers include the rise of e-commerce, the growing use of social media, and the need for businesses to deliver personalized experiences to customers. The market is segmented into types (collaborative filtering, content-based filtering, hybrid recommendation), applications (e-commerce platforms, finance, social media, others), and regions. Major players in the market include Microsoft, Google, Andi Search, Metaphor AI, Brave, Phind, Perplexity AI, NeevaAI, Qubit, and Dynamic Yield. North America is the largest regional market, followed by Europe and Asia Pacific. Key trends in the market include the integration of AI-based recommendation engines with machine learning and natural language processing, the adoption of real-time recommendations, and the use of recommendation engines to drive cross-selling and upselling.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Rohit
Released under Apache 2.0
https://www.wiseguyreports.com/pages/privacy-policy
| ATTRIBUTE | DETAILS |
| --- | --- |
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 3.75 (USD Billion) |
| MARKET SIZE 2025 | 4.25 (USD Billion) |
| MARKET SIZE 2035 | 15.0 (USD Billion) |
| SEGMENTS COVERED | Application, Technology, Deployment Type, End User, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Personalization and customer experience, advanced machine learning algorithms, integration with existing systems, increasing data analytics capabilities, growing e-commerce adoption |
| MARKET FORECAST UNITS | USD Billion |
| KEY COMPANIES PROFILED | Algolia, Amazon, Criteo, SAP, Dynamic Yield, Oracle, Google, Microsoft, Salesforce, Adobe, Shopify, Alibaba, IBM, Magento, Bloomreach, Nvidia |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Personalization and targeted marketing, AI-driven analytics integration, Expansion in mobile commerce, Cross-platform compatibility enhancement, Real-time recommendation capabilities |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 13.4% (2025 - 2035) |
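The growth rate in the table can be cross-checked against the two market-size rows with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below is only a sanity check, assuming the 2025 and 2035 market sizes are the endpoints of the stated forecast period; the function name `cagr` is hypothetical.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values that are `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Market size of 4.25 (2025) and 15.0 (2035), both in USD billion, taken from the table.
rate = cagr(4.25, 15.0, 2035 - 2025)
print(f"{rate:.1%}")  # about 13.4%, in line with the CAGR row
```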
https://www.datainsightsmarket.com/privacy-policy
The size of the Recommendation Engine market was valued at USD XXX million in 2024 and is projected to reach USD XXX million by 2033, with an expected CAGR of XX% during the forecast period.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Evaluation results obtained with SARSA on ML-100K.
https://www.promarketreports.com/privacy-policy
The size of the Content Recommendation Engine Market was valued at USD 1.4464 Billion in 2023 and is projected to reach USD 8.37 Billion by 2032, with an expected CAGR of 28.50% during the forecast period. Recent developments include:
- March 2021: A key player in the enterprise business process intelligence and process management arena, Signavio, was acquired by SAP SE. The products from Signavio are incorporated into SAP's business process intelligence portfolio and work in conjunction with SAP's comprehensive process transformation portfolio.
- February 2021: UNBXD Inc. and Google Cloud worked together to provide retail establishments with AI-powered commerce search on Google Cloud. Unbxd intended to use Google Cloud's cutting-edge search, recommendation, and AI capabilities as part of the partnership to enhance product discovery for retail consumers. Also, the business intended to offer its Google Cloud-hosted commerce search service to retail clients.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Journal recommendations prepared from the results of JANE and whichjournal.com, based on 4 abstracts from the disciplines of dentistry, psychology and aerosol chemistry.
The factsheets with data for each journal are intended to help in deciding on the best journal.
The data is provided as a spreadsheet (xls) and factsheets (pdf).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Evaluation dataset.
This dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
New reviews:
Metadata: We have added transaction metadata for each review shown on the review page.
If you publish articles based on this dataset, please cite the following paper:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The standardized Hudup dataset receives its information from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has particular functions, which are described in the data description section. The Hudup dataset is metadata that models any raw data at an abstract level. The default raw data used here as the source of the Hudup dataset is Movielens 1M. The Hudup dataset can therefore be considered secondary data, whereas Movielens is primary data. The raw rating data Movielens (GroupLens, 1998) 1M has 1,000,209 ratings from 6,040 users on 3,900 movies (items) and is available at https://files.grouplens.org/datasets/movielens/ml-1m.zip.
Hello guys! This dataset was collected for building hotel recommender systems based on geo-tagging, prices and other features. The dataset is collected from various resources. It may only be used for academic and research purposes and may not be sold or distributed for commercial purposes.
a. Explicit endorsement of STAIR [10],[12].
b. Explicit endorsement of Piper et al. [53].
CV, threat to construct validity; EV, threat to external validity; IV, threat to internal validity; O, outcome; PROG, research program recommendations; T, treatment; Ŧ, recommendation imported from an endorsed guideline but not otherwise stated in the endorsing guideline; U, units (animals); Δ, recommendation imported from an endorsed guideline and also explicitly stated in the endorsing guideline; Total, all parts of the experiment; X, recommendation explicitly stated in the guideline.
NINDS-NIH, US National Institutes of Health National Institute of Neurological Disorders and Stroke.