This dataset was created by Sulabh Shrestha
Most organizations today rely on email campaigns to communicate with users. Email is a popular way to pitch products to users and build trust with them. Email campaigns contain different types of CTA (Call To Action), and their ultimate goal is to maximize the Click Through Rate (CTR), defined as CTR = number of users who clicked on at least one CTA / number of emails delivered. This dataset contains details such as body length, subject length, mean paragraph length, day of week, and a weekend flag.
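The CTR definition above can be sketched in a few lines of Python. This is only an illustration: the `email_ctr` name, the (user_id, cta_id) click-log layout, and the sample values are assumptions, not part of the dataset.

```python
def email_ctr(click_events, emails_delivered):
    """CTR = users who clicked at least one CTA / emails delivered."""
    if emails_delivered <= 0:
        raise ValueError("emails_delivered must be positive")
    # A user who clicks several CTAs still counts only once.
    unique_clickers = {user_id for user_id, _cta in click_events}
    return len(unique_clickers) / emails_delivered


# Hypothetical log: user "u1" clicked two CTAs, user "u2" clicked one.
clicks = [("u1", "buy_now"), ("u1", "learn_more"), ("u2", "buy_now")]
ctr = email_ctr(clicks, emails_delivered=10)
print(ctr)  # 0.2
```

Counting unique users rather than raw clicks is what distinguishes this definition from a plain clicks-per-delivery ratio.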
EvgeniaKyriazi/ctr-prediction-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
https://www.kaggle.com/louischen7/2020-digix-advertisement-ctr-prediction
Advertisement click-through rate (CTR) prediction is a key problem in computational advertising. Increasing the accuracy of advertisement CTR prediction is critical to improving the effectiveness of precision marketing. A Kaggle competition was run on the following datasets to find the best advertisement CTR prediction models. The datasets contain advertising behavior data collected over seven consecutive days and are split into a training dataset and a testing dataset.
Combining a deep neural network with fuzzy theory, this paper proposes an advertising click-through rate (CTR) prediction approach based on a fuzzy deep neural network (FDNN). In this approach, fuzzy Gaussian-Bernoulli restricted Boltzmann machine (FGBRBM) is first applied to input raw data from advertising datasets. Next, fuzzy restricted Boltzmann machine (FRBM) is used to construct the fuzzy deep belief network (FDBN) with the unsupervised method layer by layer. Finally, fuzzy logistic regression (FLR) is utilized for modeling the CTR. The experimental results show that the proposed FDNN model outperforms several baseline models in terms of both data representation capability and robustness in advertising click log datasets with noise.
The dataset used in this paper comes from a real-world online sponsored advertising application and contains user click history logs from Baidu’s search engine.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Click-through rate (CTR) is calculated as the number of clicks an ad receives divided by the number of times the ad is shown (impressions), expressed as a percentage. The CTR prediction task involves modeling the likelihood of a click based on ad characteristics, user profile data, and contextual features.
Predicting the CTR is crucial for optimizing online advertising campaigns. By accurately estimating the likelihood of a user clicking on an ad, businesses can make informed decisions about ad placement and design, ultimately maximizing their return on investment (ROI).
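As a small worked example of the clicks-over-impressions formula, the sketch below computes an observed CTR percentage from a list of per-impression click labels. The `ctr_percent` helper and the sample labels are hypothetical and are not taken from the dataset.

```python
def ctr_percent(clicks, impressions):
    """Observed CTR as a percentage: 100 * clicks / impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions


# Hypothetical impression log: 1 = clicked, 0 = shown but not clicked.
labels = [1, 0, 0, 0, 0, 1, 0, 0]
observed = ctr_percent(sum(labels), len(labels))
print(observed)  # 25.0
```

A CTR prediction model would output one click probability per impression; the observed CTR is simply the mean of the labels those probabilities are trained against.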
Overall Logloss performance for CTR prediction on different datasets.
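For readers unfamiliar with the metric, Logloss (binary cross-entropy) can be sketched as follows. This is a plain-Python illustration with made-up labels and predictions; it is not the evaluation code behind the reported results, and the clipping constant `eps` is an assumption.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted click probabilities."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip to avoid log(0) when a model predicts exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical click labels and predicted click probabilities.
y_true = [1, 0, 0, 1]
y_prob = [0.8, 0.1, 0.3, 0.6]
print(log_loss(y_true, y_prob))
```

Lower values are better; a model that always predicts 0.5 scores ln 2, roughly 0.693.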
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created by Anand Panda
Released under Attribution 4.0 International (CC BY 4.0)
Datasets for CTR prediction
This dataset was created by Gaurav Dutta
This dataset was created by sambanjie
This dataset was created by Darrell Cornelius Rivaldo
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overall RMSE performance for CTR prediction on different datasets.
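RMSE here measures how far predicted click probabilities fall from the 0/1 click labels. The sketch below shows the computation in plain Python; the sample values are made up and this is not the evaluation code behind the reported results.

```python
import math


def rmse(labels, preds):
    """Root mean squared error between click labels and predicted probabilities."""
    if len(labels) != len(preds) or not labels:
        raise ValueError("labels and preds must be non-empty and of equal length")
    squared_errors = [(y - p) ** 2 for y, p in zip(labels, preds)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical labels with a constant prediction of 0.5.
print(rmse([1, 0], [0.5, 0.5]))  # 0.5
```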
This dataset was created by Gaurav Dutta
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
📊 Criteo 1TB Click Logs Dataset
This dataset contains feature values and click feedback for millions of display ads. Its primary purpose is to benchmark algorithms for click-through rate (CTR) prediction. It is similar to, but larger than, the dataset released for the Display Advertising Challenge hosted by Kaggle (Kaggle Criteo Display Advertising Challenge).
📁 Full Description
This dataset contains 24 files, each corresponding to one day of data.
🏗️… See the full description on the dataset page: https://huggingface.co/datasets/criteo/CriteoClickLogs.
Ad recommendation models are usually built based on historical ad impressions, clicks, and other user behavior data. If only data from the ads domain is used, user behavior data will be sparse, and the user behavior types that can be identified will be limited. However, if a user's behavior data in other domains from the same app is explored, the user's interests and behavior characteristics can be better identified. Introducing user behavior data from other apps can also help enrich the data on user behavior characteristics and ad performance.
You are expected to enhance ads click-through rate (CTR) prediction accuracy by leveraging ad logs, user profiles, and cross-domain data. With ads as the target domain and news feeds as the source domain, you should build user interest models through impressions, clicks, and other user behavior data obtained from the news feeds domain, thus improving the CTR prediction performance of the ads domain.
The provided data includes data from the target domain (such as user behavior logs, user profiles, and ad material information) and that from the source domain (such as user behavior data and basic information about news items).
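| Field | Field Description | Can be empty | Field Type | Value Example |
| --- | --- | --- | --- | --- |
| label | Click label (listed as "User ID" in the source) | No | int | 0,1 |
| user_id | User ID | No | String | 1,2… |
| age | Age | Yes | String | 1,2,3… |
| gender | Gender | Yes | String | 1,2… |
| residence | Permanent residence (province) | Yes | String | 1,2… |
| city | Permanent residence (city ID) | Yes | String | 1,2… |
| city_rank | Permanent residence (city level) | Yes | String | 1,2… |
| series_dev | Device series | Yes | String | 1,2… |
| series_group | Device series group | Yes | String | 1,2… |
| emui_dev | EMUI version number | Yes | String | 1,2… |
| device_name | Phone model used by the user | Yes | String | 1,2… |
| device_size | Size of the phone used by the user | Yes | String | 1,2… |
| net_type | Network status when the behavior occurred | Yes | String | 1,2… |
| task_id | Unique ID of the ad task | Yes | String | 1,2… |
| adv_id | ID of the material for the ad task | Yes | String | 1,2… |
| creat_type_cd | Creative type ID of the material | Yes | String | 1,2… |
| adv_prim_id | Advertiser ID for the ad task | Yes | String | 1,2… |
| inter_type_cd | Interaction type of the material for the ad task | Yes | String | 1,2… |
| slot_id | Ad slot ID | Yes | String | 1,2… |
| site_id | Media ID | Yes | String | 1,2… |
| spread_app_id | ID of the app promoted by the ad task | Yes | String | 1,2… |
| hispace_app_tags | Tags of the app for the ad task | Yes | String | 1,2… |
| app_second_class | Second-level category of the app for the ad task | Yes | String | 1,2… |
| app_score | App score | Yes | Int | 4 |
| ad_click_list_001 | List of ad task IDs clicked by the user | Yes | [string,] | [1^2…] |
| ad_click_list_002 | List of advertiser IDs for ads clicked by the user | Yes | [string,] | [1^2…] |
| ad_click_list_003 | List of promoted apps for ads clicked by the user | Yes | [string,] | [1^2…] |
| ad_close_list_001 | List of ad tasks closed by the user | Yes | [string,] | [1^2…] |
| ad_close_list_002 | List of advertisers for ads closed by the user | Yes | [string,] | [1^2…] |
| ad_close_list_003 | List of promoted apps for ads closed by the user | Yes | [string,] | [1^2…] |
| pt_d | Timestamp | No | String | 202205221430 |
| log_id | Sample ID | No | Int | 12345678 |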
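| Field | Field Description | Can be empty | Field Type | Value Example |
| --- | --- | --- | --- | --- |
| u_userId | User ID | No | String | 0001 |
| u_phonePrice | Price of a user's device | Yes | String | 13 |
| u_browserLifeCycle | User engagement on Browser | Yes | String | 10 |
| u_browserMode | Browser service type | Yes | String | 11 |
| u_feedLifeCycle | User engagement on news feeds | Yes | String | 12 |
| u_refreshTimes | Average number of valid news feeds updates per day | Yes | String | 16 |
| u_newsCatInterests | Liked news feeds categories based on the click behavior of a user | Yes | [String,] | [1^2…] |
| u_newsCatDislike | Disliked news feeds categories based on the negative feedback of a user | Yes | [string,] | [1^2…] |
| u_newsCatInterestsST | Short-term category interests of a user | Yes | [string,] | [1^2…] |
| u_click_ca2_news | Click sequence of news categories for a user | Yes | [string,] | [1^2…] |
| i_docId | Article doc ID | Yes | String | 0001 |
| i_s_sourceId | Source ID of the article | Yes | String | 0001 |
| i_regionEntity | Region entity ID of the article | Yes | String | 0001 |
| i_cat | Article category ID | Yes | String | 0001 |
| i_entities | Article entity IDs | Yes | [string,] | [1^2…] |
| i_dislikeTimes | Number of negative feedback events on the article | Yes | String | 60 |
| i_upTimes | Number of likes on the article | Yes | String | 22 |
| I_dtype | Article display format | Yes | String | 20 |
| e_ch | Channel | Yes | String | 1,2… |
| e_m | Device model of the event source | Yes | String | 1,2… |
| e_po | Position in the feed | Yes | String | 9 |
| e_pl | Visit location | Yes | String | 1,2… |
| e_rn | Refresh number | Yes | String | 1 |
| e_section | News feeds scenario type | Yes | String | 13 |
| e_et | Timestamp | No | String | 202205221430 |
| label | Whether the item was clicked (-1: no, 1: yes) | No | String | 1 |
| cilLabel | Whether the item was liked (-1: no, 1: yes) | No | String | 1 |
| pro | Article reading progress | No | String | 1,2… |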
Source: DIGIX
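Two value formats in the field tables are worth a quick sketch: list-valued fields shown as [1^2…] and timestamps shown as 202205221430. The helpers below assume that list items are separated by a caret and that timestamps follow a year-month-day-hour-minute layout; both are inferences from the value examples, and the function names are hypothetical.

```python
from datetime import datetime


def parse_id_list(raw):
    """Split a caret-delimited value such as '1^2^3' into a list of ID strings."""
    if not raw:
        # Most list fields can be empty according to the tables.
        return []
    return raw.split("^")


def parse_timestamp(raw):
    """Parse a 12-digit timestamp, assuming the layout YYYYMMDDHHMM."""
    return datetime.strptime(raw, "%Y%m%d%H%M")


print(parse_id_list("1^2^3"))           # ['1', '2', '3']
print(parse_timestamp("202205221430"))  # 2022-05-22 14:30:00
```

IDs are kept as strings because the tables declare these fields as String rather than numeric types.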
The datasets used in the paper are Alipay, Tmall, and Alimama, all of which are used for click-through rate (CTR) prediction. They contain user and item features, as well as user behavior sequences.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The results of the AUC and accuracy in CTR prediction.
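For reference, AUC and accuracy can be computed as sketched below. This is a plain-Python illustration on made-up scores, not the evaluation code behind the reported results; the pairwise AUC is quadratic in the number of samples, and the 0.5 accuracy threshold is an assumption.

```python
def auc(labels, scores):
    """Probability that a random clicked sample is scored above a random non-clicked one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of samples whose thresholded score matches the 0/1 label."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


# Hypothetical click labels and model scores.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.6]
print(auc(y_true, y_score))       # 0.75
print(accuracy(y_true, y_score))  # 0.5
```

AUC depends only on how the scores rank the samples, while accuracy also depends on the chosen threshold, which is why the two can disagree as they do here.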
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Criteo contains 7 days of click-through data and is widely used for CTR prediction benchmarking. The Criteo dataset has 26 anonymous categorical fields and 13 continuous fields.
Display advertising is a billion-dollar effort and one of the central uses of machine learning on the Internet. However, its data and methods are usually kept under lock and key. In this research competition, CriteoLabs is sharing a week’s worth of data for you to develop models predicting ad click-through rate (CTR). Given a user and the page he is visiting, what is the probability that he will click on a given ad?
The goal of this challenge is to benchmark the most accurate ML algorithms for CTR estimation.
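To make the 13 continuous plus 26 categorical layout concrete, the sketch below splits one record into its parts. It assumes a tab-separated line with the click label first, then the 13 continuous fields, then the 26 categorical fields, and empty strings for missing values; that file layout is an assumption not stated in the description above, and `parse_criteo_line` is a hypothetical helper.

```python
NUM_CONTINUOUS = 13
NUM_CATEGORICAL = 26


def parse_criteo_line(line):
    """Split one tab-separated record into (label, continuous, categorical)."""
    # Strip only the newline so trailing empty fields are preserved.
    parts = line.rstrip("\n").split("\t")
    expected = 1 + NUM_CONTINUOUS + NUM_CATEGORICAL
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}")
    label = int(parts[0])
    # Missing values are assumed to appear as empty strings.
    continuous = [int(v) if v else None for v in parts[1:1 + NUM_CONTINUOUS]]
    categorical = [v if v else None for v in parts[1 + NUM_CONTINUOUS:]]
    return label, continuous, categorical


# Synthetic record for illustration only, with the last continuous field missing.
sample = "\t".join(["1"] + ["5"] * 12 + [""] + ["a1b2c3d4"] * 26) + "\n"
label, continuous, categorical = parse_criteo_line(sample)
print(label, len(continuous), len(categorical))  # 1 13 26
```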
This dataset was created by Sulabh Shrestha