Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset provides detailed information on website traffic, including page views, session duration, bounce rate, traffic source, time spent on page, previous visits, and conversion rate.
This dataset can be used for various analyses such as:
This dataset was generated for educational purposes and is not from a real website. It serves as a tool for learning data analysis and machine learning techniques.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
About
This dataset provides insights into user behavior and online advertising, specifically focusing on predicting whether a user will click on an online advertisement. It contains user demographic information, browsing habits, and details related to the display of the advertisement. This dataset is ideal for building binary classification models to predict user interactions with online ads.
Features
Goal
The objective of this dataset is to predict whether a user will click on an online ad based on their demographics, browsing behavior, the context of the ad's display, and the time of day. You will need to clean and explore the data, then train machine learning models and evaluate their predictions, which is a challenging task for this kind of data. The results can be used to improve ad targeting strategies, optimize ad placement, and better understand user interaction with online advertisements.
This dataset captures how many voter registration applications each agency has distributed, how many applications agency staff sent to the Board of Elections, how many staff each agency trained to distribute voter registration applications, whether or not the agency hosts a link to voting.nyc on its website and if so, how many clicks that link received during the reporting period.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset was generated by an e-commerce website that sells a variety of products on its online platform. The website records the behaviour of its customers and stores it as a log. However, most of the time users do not buy a product instantly; there is a time gap during which the customer might surf the internet and perhaps visit competitor websites. To improve product sales, the website owner has hired an adtech company, which built a system that shows ads for the owner's products on its partner websites. If a user comes to the owner's website and searches for a product, and then visits these partner websites or apps, the previously viewed items or similar items are shown as an ad. If the user clicks this ad, they are redirected to the owner's website and might buy the product.
The task is to predict the probability that a user clicks an ad shown to them on the partner websites during the next 7 days, on the basis of historical view log data, ad impression data and user data.
You are provided with the view log of users (2018/10/15 - 2018/12/11) and the product descriptions collected from the owner's website. We also provide the training and test data containing details of ad impressions at the partner websites (Train + Test). The train data contains the impression logs during 2018/11/15 - 2018/12/13 along with the label that specifies whether the ad was clicked or not. Your model will be evaluated on the test data, which has impression logs during 2018/12/12 - 2018/12/18 without the labels. You are provided with the following files:
item_data.csv
The evaluation metric could be the area under the ROC curve (AUC) between the predicted probability and the observed target.
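As a rough illustration of this metric, the sketch below computes the area under the ROC curve from its rank interpretation: the probability that a randomly chosen clicked impression receives a higher predicted score than a randomly chosen non-clicked one. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not part of the competition files.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = ad clicked, 0 = not clicked.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of impressions, so it is only meant to show what the metric measures. For data of realistic size, a library routine such as scikit-learn's `roc_auc_score` is the more practical choice.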
The data on the use of the datasets on the OGD portal BL (data.bl.ch) are collected and published by the specialist and coordination office OGD BL. Each record contains the day on which the usage was measured, along with these fields:
- dataset_title: The title of the dataset.
- dataset_id: The technical ID of the dataset.
- visitors: Specifies the number of daily visitors to the dataset. Visitors are recorded by counting the unique IP addresses that recorded access on the day of the survey. The IP address represents the network address of the device from which the portal was accessed.
- interactions: Includes all interactions with any dataset on data.bl.ch. A visitor can trigger multiple interactions. Interactions include clicks on the website (searching datasets, filters, etc.) as well as API calls (downloading a dataset as a JSON file, etc.).
Remarks
- Only calls to publicly available datasets are shown.
- IP addresses and interactions of users with a login of the Canton of Basel-Landschaft, in particular of employees of the specialist and coordination office OGD, are removed from the dataset before publication and therefore not shown.
- Calls from actors that are clearly identifiable as bots by the user agent header are also not shown.
- Combinations of dataset and date for which no use occurred (Visitors == 0 & Interactions == 0) are not shown.
- Due to synchronization problems, data may be missing for individual days.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for "Controlling the Photon Number Coherence of Solid-state Quantum Light Sources for Quantum Cryptography"
This dataset contains data for
journal website: https://www.nature.com/articles/s41534-024-00811-2
DOI: https://doi.org/10.1038/s41534-024-00811-2
The data is in either .txt or .csv format. The zip file "PNCDataSet" includes the following folders and data:
Additional data for the supplementary materials is available upon a reasonable request.
How to extract data:
Windows:
Locate the .zip file on your computer. In this case, the zip file is named "PNCDataSet".
Right-click on the .zip file and select "Extract All" from the context menu. This will open the extraction wizard.
macOS:
Locate the .zip file on your computer. In this case, the zip file is named "PNCDataSet".
Double-click on the .zip file. macOS will automatically extract the contents of the .zip file to the same location.
Linux:
Open a terminal window.
Navigate to the directory where the .zip file is located using the cd command.
Run the following command to unzip the file:
unzip PNCDataSet.zip
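As a platform-independent alternative to the steps above, the archive can also be extracted with Python's standard `zipfile` module. This is only a sketch: the helper name `extract_archive` and the destination folder in the usage comment are assumptions, and the archive name is taken from the `unzip` command above.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_archive(zip_path: str, dest_dir: str) -> List[str]:
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)   # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Usage (assumes the archive sits in the current directory):
# extract_archive("PNCDataSet.zip", "PNCDataSet")
```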
Y.K., F.K., R.S., V.R. and G.W. acknowledge financial support through the Austrian Science Fund FWF projects W1259 (DK-ALM Atoms, Light, and Molecules), FG 5, TAI-556N (DarkEneT), F 7114 (BeyondC) and I4380 (AEQuDot). DAV and TH acknowledge financial support by the German Federal Ministry of Education and Research (BMBF) via projects 13N14876 (‘QuSecure’) and 16KISQ087K (tubLAN Q.0). TKB and DER acknowledge financial support from the German Research Foundation DFG through project 428026575 (AEQuDot). A.R. and SFCdS acknowledge the FWF projects FG 5, P 30459, I 4320, the Linz Institute of Technology (LIT) and the European Union’s Horizon 2020 research and innovation program under Grant Agreement Nos. 899814 (Qurope), 871130 (ASCENT+) and the QuantERA II Program (project QD-E-QKD). L.M.H., P.W. and J.C.L. acknowledge financial support from the European Union’s Horizon 2020 and Horizon Europe research and innovation program under grant agreement No 899368 (EPIQUS), the Marie Skłodowska-Curie grant agreement No 956071 (AppQInfo), and the QuantERA II Program under Grant Agreement No 101017733 (PhoMemtor); FWF through F7113 (BeyondC), and FG5 (Research Group 5); from the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development and the Christian Doppler Research Association. For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
https://creativecommons.org/publicdomain/zero/1.0/
Analyze the marketing spending.
1. Overall ROMI
2. ROMI by campaign
3. Performance of the campaign depending on the date: on which date did we spend the most money on advertising, when did we get the biggest revenue, and when were conversion rates high and low? What were the average order values?
4. When are buyers more active? What is the average revenue on weekdays and weekends?
5. Which types of campaigns work best: social, banner, influencer, or search?
6. Which geo locations are better for targeting: tier 1 or tier 2 cities?
| Column | Description |
|:--------------|:---------------------------------------------------------|
| Date | date of spending of the marketing budget |
| Campaign name | description of campaign |
| Category | type of marketing source |
| Campaign id | unique identifier |
| Impressions | number of times the ad has been shown |
| Mark. budget | money spent on this campaign on this day |
| Clicks | how many people clicked on a banner (= visited website) |
| Leads | how many people signed up and left their credentials |
| Orders | how many people paid for the product |
| Revenue | how much money we earned |
Clicks, Leads, orders, and revenue are calculated for a specific marketing campaign on a specific date. E.g. For the “facebook_tier1” marketing campaign on the 1st of February, we spent INR 7,307.37, got 148,263 impressions that converted to 1,210 clicks that in turn converted to 13 leads and 1 order. We earned INR 4,981.
This data reflects some facts about what happened - how much we spent, how much we earned, how customers behaved (who clicked on the ad banner, who signed up, who paid). Now we need to calculate marketing metrics that would help us evaluate if we did a good job or not and also identify some parameters of the campaign that would be important for analysis. What are these metrics:
These metrics are actionable and allow us not only to analyze but to make decisions and act to improve the business result.
Let’s dive deeper.
- ROMI (return on marketing investment): how effective a marketing campaign is; one metric that shows the effectiveness of every rupee spent. It is calculated as (Revenue - Marketing cost) / Marketing cost.
- Click-through rate (CTR): percentage of people who clicked on the banner (Clicks / Impressions).
- Conversion 1: conversion rate from visitors to leads for this campaign (Leads / Clicks).
- Conversion 2: conversion rate from leads to sales (Orders / Leads).
- Average order value (AOV): average order value for this campaign (Revenue / Number of orders).
- Cost per click (CPC): how much it costs us to attract 1 click on average (Marketing spending / Clicks).
- Cost per lead (CPL): how much it costs us to attract 1 lead on average (Marketing spending / Leads).
- Customer acquisition cost (CAC): how much it costs us to attract 1 order on average (Marketing spending / Orders).
- Gross profit: profit or loss after deducting marketing cost (Revenue - Marketing spending).
ROMI is the most important metric and it is used as the ultimate way to evaluate if the campaign is good or bad.
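The formulas above can be sketched in a few lines of Python. The class `CampaignDay`, the helper `safe_div`, and the metric key names are assumptions chosen for illustration; the numbers in the example are the "facebook_tier1" figures for the 1st of February quoted earlier.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CampaignDay:
    impressions: int
    spend: float      # marketing budget spent on this campaign on this day
    clicks: int
    leads: int
    orders: int
    revenue: float


def safe_div(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def marketing_metrics(row: CampaignDay) -> Dict[str, Optional[float]]:
    """Compute the metrics listed above for one campaign on one date."""
    return {
        "romi": safe_div(row.revenue - row.spend, row.spend),
        "ctr": safe_div(row.clicks, row.impressions),
        "conversion_1": safe_div(row.leads, row.clicks),
        "conversion_2": safe_div(row.orders, row.leads),
        "aov": safe_div(row.revenue, row.orders),
        "cpc": safe_div(row.spend, row.clicks),
        "cpl": safe_div(row.spend, row.leads),
        "cac": safe_div(row.spend, row.orders),
        "gross_profit": row.revenue - row.spend,
    }


# The "facebook_tier1" example from the 1st of February (amounts in INR).
example = CampaignDay(impressions=148_263, spend=7307.37, clicks=1210,
                      leads=13, orders=1, revenue=4981.0)
metrics = marketing_metrics(example)
```

For this row the ROMI comes out negative (roughly -0.32), meaning the campaign earned less than it cost on that day. Ratios are returned as fractions rather than percentages, and `None` marks days where a denominator is zero.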
You can use this article to know more about marketing metrics. https://www.owox.com/blog/articles/digital-marketing-metrics-and-kpis/
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Overview
This dataset contains fictional or sample retail data gathered from various sources within a retail business environment. It comprises multiple CSV files that capture customer demographics, purchase transactions, marketing campaign details, customer reviews, support tickets, and other interaction logs.
The primary goal of this dataset is to facilitate exploratory data analysis, data mining, machine learning projects, or general data science practice with a realistic but controlled dataset.
Note: All data here is either fictitious or anonymized. Any resemblance to real people, products, or companies is purely coincidental.
File Descriptions
customers.csv
Contains demographic and basic registration information about each customer.
- customer_id (int or string): Unique identifier for each customer.
- full_name (string): Full name of the customer.
- age (int): Age of the customer.
- gender (string): Gender of the customer (e.g., “Male”, “Female”, “Other”).
- email (string): Email address.
- phone (string): Phone number.
- street_address (string): Street address.
- city (string): City of residence.
- state (string): State of residence.
- zip_code (string): Zip code/postal code.
- registration_date (date): Date the customer registered.
- preferred_channel (string): Preferred communication channel (e.g., “Email”, “SMS”, “Phone”).
transactions.csv
Includes purchase information for products bought by customers.
- transaction_id (int): Unique ID for each transaction.
- customer_id (int or string): Links to customer_id in customers.csv.
- product_name (string): Name of the purchased product.
- product_category (string): Category or classification of the product.
- quantity (int): Number of units purchased in this transaction.
- price (float): Price per unit of the product.
- transaction_date (date): Date when the transaction was completed.
- store_location (string): Physical store location or “Online” if purchased via e-commerce.
- payment_method (string): Payment method (e.g., “Credit Card”, “PayPal”, “Cash”).
- discount_applied (float or int): Discount amount or percentage applied (if any).
interactions.csv
Captures various interactions the customer has with the company’s digital channels.
- interaction_id (int): Unique ID for each interaction.
- customer_id (int or string): Links to customer_id in customers.csv.
- channel (string): Channel of interaction (e.g., “Website”, “Mobile App”, “Social Media”).
- interaction_type (string): Type of interaction (e.g., “View Product”, “Add to Cart”, “Click Ad”).
- interaction_date (date): Date of the interaction.
- duration (int or float): Duration of interaction in seconds or minutes.
- page_or_product (string): Page visited or product name related to the interaction.
- session_id (string): Session identifier to track multiple actions within one session.
campaigns.csv
Details about marketing or advertising campaigns.
- campaign_id (int): Unique campaign identifier.
- campaign_name (string): Name or title of the campaign.
- campaign_type (string): Type of campaign (e.g., “Email”, “Social Media”, “TV Ad”).
- start_date (date): Launch date of the campaign.
- end_date (date): Completion date of the campaign.
- target_segment (string): Target audience or segment (e.g., “Loyal Customers”, “New Registrations”).
- budget (float): Allocated budget for the campaign.
- impressions (int): Number of impressions generated (for digital campaigns).
- clicks (int): Number of clicks (if applicable).
- conversions (int): Number of desired conversions (e.g., sign-ups, purchases).
- conversion_rate (float): Percentage of impressions that resulted in conversions.
- roi (float): Return on investment for the campaign.
customer_reviews_complete.csv
Collection of product reviews submitted by customers post-purchase.
- review_id (int): Unique identifier for each review.
- customer_id (int or string): Links to customer_id in customers.csv.
- product_name (string): Name of the product reviewed.
- product_category (string): Category of the product reviewed.
- full_name (string): Name of the customer (could be anonymized).
- transaction_date (date): Date of the transaction related to this review.
- review_date (date): Date the review was posted.
- rating (int): Rating given by the customer (e.g., 1 to 5 stars).
- review_title (string): Short title or summary of the review.
- review_text (string): Full text of the customer’s review.
support_tickets.csv
Tracks customer support interactions and resolution details.
- ticket_id (int): Unique ID for each support ticket.
- customer_id (int or string): Links to customer_id in customers.csv.
- issue_category (string): General category of the issue (e.g., “Billing”, “Product Defect”, “Returns”).
- priority (string): Priority level (“Low”, “Medium”, “High”).
- submission_date (date): Date the ticket was submitted.
- resolution_date (date): Date the ticket was resolved (if applicable).
- resolution_status (string): Status of the ticket (“Open”, “Closed”, “Pending”, et...
https://creativecommons.org/publicdomain/zero/1.0/
Description: The Marketing Campaign Performance Dataset provides valuable insights into the effectiveness of various marketing campaigns. This dataset captures the performance metrics, target audience, duration, channels used, and other essential factors that contribute to the success of marketing initiatives. With 200000 unique rows of data spanning two years, this dataset offers a comprehensive view of campaign performance across diverse companies and customer segments.
Columns:
- Company: The company responsible for the campaign, representing a mix of fictional brands.
- Campaign_Type: The type of campaign employed, including email, social media, influencer, display, or search.
- Target_Audience: The specific audience segment targeted by the campaign, such as women aged 25-34, men aged 18-24, or all age groups.
- Duration: The duration of the campaign, expressed in days.
- Channels_Used: The channels utilized to promote the campaign, which may include email, social media platforms, YouTube, websites, or Google Ads.
- Conversion_Rate: The percentage of leads or impressions that converted into desired actions, indicating campaign effectiveness.
- Acquisition_Cost: The cost incurred by the company to acquire customers, presented in monetary format.
- ROI: Return on Investment, representing the profitability and success of the campaign.
- Location: The geographical location where the campaign was conducted, encompassing major cities like New York, Los Angeles, Chicago, Houston, or Miami.
- Language: The language used in the campaign communication, including English, Spanish, French, German, or Mandarin.
- Clicks: The number of clicks generated by the campaign, indicating user engagement.
- Impressions: The total number of times the campaign was displayed or viewed by the target audience.
- Engagement_Score: A score ranging from 1 to 10 that measures the level of engagement generated by the campaign.
- Customer_Segment: The specific customer segment or audience category that the campaign was tailored for, such as tech enthusiasts, fashionistas, health and wellness enthusiasts, foodies, or outdoor adventurers.
- Date: The date on which the campaign occurred, providing a chronological perspective to analyze trends and patterns.
Scope: By leveraging this dataset, marketers and data analysts can uncover valuable insights regarding campaign performance, audience preferences, channel effectiveness, and ROI. This dataset serves as a valuable resource for market research, campaign optimization, and data-driven decision-making, enabling businesses to refine their marketing strategies and drive targeted growth.
**Note:** This is a fictional dataset.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides simulated data for user interactions on an e-commerce platform. It includes sequences of events such as page views, clicks, product views, and purchases. Each record captures user activity within sessions, making it suitable for analyzing clickstream paths and transaction sequences.
Features:
- UserID: Unique identifier for each user.
- SessionID: Unique identifier for each session.
- Timestamp: Date and time of the interaction.
- EventType: Type of event (e.g., page view, click, product view, add to cart, purchase).
- ProductID: Unique identifier for products involved in interactions.
- Amount: Amount of the transaction (for purchases).
- Outcome: Target event (e.g., purchase).
This dataset can be used to discover patterns and sequences leading to specific outcomes such as product purchases or churn.
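One way to start looking for such sequences is to rebuild the ordered event path of every session and check which paths end in the target outcome. The sketch below assumes events are available as (SessionID, Timestamp, EventType) tuples with ISO-8601 timestamps; the function names, the event-type spellings such as "page_view", and the sample rows are hypothetical, since the dataset's exact encodings are not given here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One event: (SessionID, Timestamp, EventType). ISO-8601 strings sort chronologically.
Event = Tuple[str, str, str]


def session_paths(events: List[Event]) -> Dict[str, List[str]]:
    """Return the time-ordered list of event types for every session."""
    by_session: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for session_id, timestamp, event_type in events:
        by_session[session_id].append((timestamp, event_type))
    return {sid: [etype for _, etype in sorted(rows)]
            for sid, rows in by_session.items()}


def outcome_rate(paths: Dict[str, List[str]], outcome: str = "purchase") -> float:
    """Fraction of sessions whose path contains the outcome event."""
    if not paths:
        return 0.0
    return sum(outcome in path for path in paths.values()) / len(paths)


# Made-up sample rows, deliberately out of order within session s1.
events = [
    ("s1", "2024-01-01T10:00:00", "page_view"),
    ("s1", "2024-01-01T10:02:00", "add_to_cart"),
    ("s1", "2024-01-01T10:01:00", "product_view"),
    ("s1", "2024-01-01T10:05:00", "purchase"),
    ("s2", "2024-01-01T11:00:00", "page_view"),
    ("s2", "2024-01-01T11:01:00", "click"),
]
paths = session_paths(events)
rate = outcome_rate(paths)
```

Counting how often each distinct path occurs, split by whether it contains the outcome, is a natural next step toward sequence mining.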
https://creativecommons.org/publicdomain/zero/1.0/
By [source]
Welcome to the Meneame Popularity Trends dataset! Here, we explore the various ways that users interact with news articles on the social news website throughout April 2021. Our dataset reveals how different article characteristics and user voting behavior can directly influence an article's popularity on the website.
This dataset includes information covering different aspects of a story - from its title to source and total karma score. It also captures a variety of historical user interactions such as comments, anonymous votes, positive & negative feedback as well as how often an article is clicked by others. All these important details provide us with valuable insights into understanding which stories are more likely to be successful on Meneame's platform.
So let us dive deep into this data and discover what makes articles go viral!
This dataset provides insights into the popularity & user feedback trends on Meneame news articles. With this data, you’ll be able to understand how certain characteristics and user voting behaviors can impact an article’s overall popularity. Let's take a look at how to make the best use of this dataset:
1. Analyze Popularity Trends: Examine the trends in clicks, comments, votes (positive/negative/anonymous) over time to determine which stories were most popular and when they were posted. You can also consider the limitations of the sample by looking at which website is represented in your results - Meneame or its subdomains?
2. Study Story Characteristics & User Voting Behaviors: Use the columns contained in this dataset (title, web source, user who posted article) to look into what types of stories receive more positive feedback from users and whether there is a pattern among sources that publish such content. Additionally, study if different types of interactions (comments vs upvotes vs downvotes) have any effect on article popularity.
3. Explore Subreddit Interactions: You may want to explore how different subreddits interact with posts on Meneame by reviewing unique karma scores per post and its respective subreddit (Sub column). It will also be interesting to compare story characteristics between high-traffic and low-traffic subreddits in order to see what type of content is most engaging for each type of community or audience group.
- Developing a predictive model to identify article characteristics that increase the popularity of articles on Meneame.
- Comparing the average user feedback rate of different types of news stories, in order to understand which ones generally receive the most engagement from readers.
- Studying the correlation between an article’s popularity and its author’s karma score, in order to understand how popular users can increase their visibility on Meneame
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: MenScraping_12_04_21_10_20_.csv
| Column name | Description |
|:------------------|:------------------------------------------------------------------------------|
| Titular | The title of the article. (String) |
| Data Creació | The date the article was created. (Date) |
| Web | The web source of the article. (String) |
| Usuari | The user who posted the article. (String) |
| Meneos | The number of recommended shares (meneos) the article has received. (Integer) |
| Clics | The number of clicks the article has received. (Integer) |
| Comentaris | The number of comments the article has received. (Integer) |
| Vots Positius | The number of positive votes the article has receive...
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides a comprehensive view of the online advertising performance for "Company X" over a three-month period in 2020. Here's an overview of its components and potential analyses you can perform:
Dataset Components:
- Day: Date of the advertising campaign.
- Campaign: Specific group targeting variable set by Company X.
- User Engagement: Level of user interaction with the ads.
- Banner: Ad size served by "Advert Firm A".
- Placement: Publisher space where ads are served (websites/apps).
- Displays: Number of ads shown by "Advert Firm A".
- Cost: Price paid to serve the ads to the publisher.
- Clicks: Number of times users clicked on the ads.
- Revenue: Amount
- campaign_item_id: unique id of each advertising campaign
- no_of_days: number of days the campaign has been running
- time: timestamp at which the data was captured
- ext_service_id: id of each advertising platform used
- ext_service_name: name of each advertising platform used
- creative_id: id of the creative images used for ads
- creative_height: height of the creative image for the ad in pixels
- creative_width: width of the creative image for the ad in pixels
- search_tags: search tags used for displaying ads
- template_id: template used in the creative image
- landing_page: landing page url on which users clicked or browsed through
- advertiser_id: id of the advertiser
- advertiser_name: name of the place of the advertiser (city, country, state)
- network_id: id of each agency
- advertiser_currency: currency of the country in which the advertiser operates
- channel_id: id of each channel used for placed ads
- channel_name: name of the channel (display, search, social, mobile video)
- max_bid_cpm: maximum value of bid for optimizing cpm
- campaign_budget_usd: overall budget of the campaign or the amount of money that the campaign can spend
- impressions: the number of times an advertisement is displayed on a website or social media platform.
- clicks: the number of times an advertisement is clicked on by a user, leading them to the advertiser's website or landing page.
- currency_code: the currency code of the advertiser
- exchange_rate: a relative price of one currency expressed in terms of another currency.
- media_cost_usd: the amount of money that the campaign has spent on that particular day
- position_in_content: position where the ad was placed on the website page
- unique_reach: the number of unique users who see your post or page.
- total_reach: the number of people who saw any content from your page or about your page.
- search_tags: a word or set of words a person enters when searching on Google or one of our Search Network sites.
- cmi_currency_code: campaign currency code
- time_zone: timezone in which the campaign is running
- weekday_cat: weekday / weekend category
- keywords: a word or set of words that Google Ads advertisers can add to a given ad group so that your ads are targeting the right audience.
https://creativecommons.org/publicdomain/zero/1.0/
A company wants to know the CTR (click-through rate) in order to decide whether spending money on digital advertising is worthwhile. A higher CTR indicates more interest in that specific campaign, whereas a lower CTR can show that the ad may not be as relevant. High CTRs are important because they show that more people are clicking through to the website. High CTRs also help to get a better ad position for less money on online platforms like Google, Bing, etc.
The dataset is divided into train (463291, 15) and test (128858, 14) sets. The features are self-explanatory and the target is "is_click": 0 (No), 1 (Yes).
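Given a binary target like this, the overall CTR is simply the mean of the click flags. The following is a minimal sketch: only the column name `is_click` comes from the dataset description, while the function name and the sample values are made up for illustration.

```python
from typing import Iterable


def click_through_rate(is_click: Iterable[int]) -> float:
    """Share of impressions that were clicked (1 = clicked, 0 = not clicked)."""
    values = list(is_click)
    if not values:
        raise ValueError("CTR is undefined without impressions")
    return sum(values) / len(values)


# Made-up sample of ten impressions, two of which were clicked.
sample = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
ctr = click_through_rate(sample)
```

Computing the same ratio per campaign or per feature value is a quick baseline to compare against before training a classifier on the train set.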
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This repository contains the code and documentation for our ACSAC 2023 paper "From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!". With this artifact, we hope to foster future research on this subject.
We provide the screenshots and file hashes of the PDFs in our dataset, allowing inspection of the images and download (from external sources, e.g. VirusTotal) of the same files. Moreover, we also share the URLs contained in the PDFs and the code to reproduce most of the findings of our paper (we are not allowed to share VirusTotal data due to their Terms of Service).
We recommend inspecting our code from the Kaggle platform, as this does not involve any setup nor download of the data. Nonetheless, all our code can be executed in a regular laptop. We used Ubuntu 18.0 and Python 3.6.9. The dependencies for this code are minimal: Pandas 1.3.3 or higher, Numpy 1.21.2 and Matplotlib 3.4.3.
Part of our experiments involve developing and training a deep learning model (based on DeepCluster). We created an additional Github repository containing the scripts that can help reproduce the clustering procedure. This code was developed using Ubuntu 19.0 and run on a TITAN RTX GPU. To support future research, we have shared the input and output data used in the clustering process. We have also provided the pairwise distances of the embeddings used in the second clustering step (input to DBSCAN), uploaded on Kaggle due to size restrictions of files on Github. The results of this specific experiment cannot be repeated due to manual analysis checks, but we have shared the input, output, and code to make it as reproducible as possible.
Please feel free to leave a comment or reach out in case of any question or issue :)
I've been diving into the vibrant world of data for a solid two years, and guess what? I'm finally cracking the code on what it takes to soar in this industry! Early in my data adventures, I was like a kid on Limewire when I found Kaggle, downloading everything that caught my eye. But then, I stumbled upon Spotify's data and... let's just say, it was a bit of a reality check.
I found myself wrestling with duplicate records, scratching my head over inconsistent schemas, and feeling lost in the sauce without any guides. That experience was a game-changer for me. I made a promise to my future self: “When you've got the skills, create a dataset that's not just good, but legendary.” That time has come!
Introducing my unique Spotify dataset – a crystal clear reflection of dedication and clarity. What makes this set stand out? You're not just getting data; you're getting a story. You can literally trace my steps, unraveling the magic behind each table through my script on Github. It's like having a backstage pass to a data concert! (Yes, Swifties will love this dataset too 😉)
I'm all about transparency, and I believe it's the key to trust. With this dataset, I'm laying it all out there – no smoke and mirrors, just pure, unadulterated, CLEAN data. I want you to feel the same excitement I do when data just clicks into place. I encourage you all to check out the Github repo I linked above to see how this dataset came to life!
If you have any questions, suggestions or simply want to network, reach out to me on LinkedIn
This dataset is created using data sourced from Spotify and adheres to their Terms of Use. The dataset is intended for non-commercial, academic purposes and does not infringe upon Spotify's intellectual property rights. For full details on Spotify's terms, please visit Spotify's Terms and Conditions of Use.
You can find documentation for Spotify's Web APIs here
As of 12/20/2023, this is V1 of my data and I'll most likely release a few more versions after working through kinks from former releases.
Other Datasets: - Zillow
Using Observed Probability of the events as ratings
Hypothesis and assumption: all purchases follow a defined process:
1. The user first clicks on the product for details
2. The user adds the product to the cart
3. The user orders the product
As the data we are getting covers clicks and further activities, we assume that users would definitely click once they visit the website. E.g., for session_clicks for any particular product/aid: ratings = p(aid|click)
Similarly, for session_carts for any particular product/aid: ratings = p(aid|carts|clicks) (this is because of the instances observed where products are already in the cart)
And for session_orders for any particular product/aid: ratings = p(aid|orders|carts|clicks)
In this notebook, for any given item: rating = (cumulative distribution at that point) * (number of times the item triggered the event) / (total number of that particular event in the session)
While this appears to ignore the probability of clicks converting to 'carts'/'orders', that could be included in future to improve performance.
The timestamp is only used to get the number of clicks/carts/orders and is not broken down by time of day, month, or year.
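The count-based part of this rating can be sketched as follows. The notebook does not spell out how the "cumulative distribution at that point" is obtained, so the sketch takes it as a plain multiplier, `cumulative_factor`, defaulting to 1.0; that parameter, the function name, and the sample session are all assumptions for illustration rather than the notebook's actual code.

```python
from collections import Counter
from typing import Dict, Iterable


def event_ratings(aids: Iterable[int], cumulative_factor: float = 1.0) -> Dict[int, float]:
    """rating(aid) = cumulative_factor * count(aid) / total events of this type in the session."""
    counts = Counter(aids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {aid: cumulative_factor * n / total for aid, n in counts.items()}


# Made-up session: product ids (aids) clicked within one session.
session_clicks = [101, 102, 101, 103]
click_ratings = event_ratings(session_clicks)
```

The same function could be applied to the cart and order events of a session, passing whatever stage-specific factor the analysis settles on.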