78 datasets found
  1. Website Traffic

    • kaggle.com
    zip
    Updated Aug 5, 2024
    Cite
    AnthonyTherrien (2024). Website Traffic [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/website-traffic/discussion
    Explore at:
    Available download formats: zip (65228 bytes)
    Dataset updated
    Aug 5, 2024
    Authors
    AnthonyTherrien
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Overview

    This dataset provides detailed information on website traffic, including page views, session duration, bounce rate, traffic source, time spent on page, previous visits, and conversion rate.

    Dataset Description

    • Page Views: The number of pages viewed during a session.
    • Session Duration: The total duration of the session in minutes.
    • Bounce Rate: The percentage of visitors who navigate away from the site after viewing only one page.
    • Traffic Source: The origin of the traffic (e.g., Organic, Social, Paid).
    • Time on Page: The amount of time spent on the specific page.
    • Previous Visits: The number of previous visits by the same visitor.
    • Conversion Rate: The percentage of visitors who completed a desired action (e.g., making a purchase).

    Data Summary

    • Total Records: 2000
    • Total Features: 7

    Key Features

    1. Page Views: This feature indicates the engagement level of the visitors by showing how many pages they visit during their session.
    2. Session Duration: This feature measures the length of time a visitor stays on the website, which can indicate the quality of the content.
    3. Bounce Rate: A critical metric for understanding user behavior. A high bounce rate may indicate that visitors are not finding what they are looking for.
    4. Traffic Source: Understanding where your traffic comes from can help in optimizing marketing strategies.
    5. Time on Page: This helps in analyzing which pages are retaining visitors' attention the most.
    6. Previous Visits: This can be used to analyze the loyalty of visitors and the effectiveness of retention strategies.
    7. Conversion Rate: The ultimate metric for measuring the effectiveness of the website in achieving its goals.

    Usage

    This dataset can be used for various analyses such as:

    • Identifying key drivers of engagement and conversion.
    • Analyzing the effectiveness of different traffic sources.
    • Understanding user behavior patterns and optimizing the website accordingly.
    • Improving marketing strategies based on traffic source performance.
    • Enhancing user experience by analyzing time spent on different pages.
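
    As an illustration of comparing traffic sources, the following minimal sketch averages the conversion rate per source. The field names (`traffic_source`, `conversion_rate`) and the sample values are assumptions made for this example; the actual column headers and values in the CSV may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows mirroring two of the seven described columns;
# the values are invented for illustration, not taken from the dataset.
rows = [
    {"traffic_source": "Organic", "conversion_rate": 1.0},
    {"traffic_source": "Organic", "conversion_rate": 0.5},
    {"traffic_source": "Social", "conversion_rate": 0.25},
    {"traffic_source": "Paid", "conversion_rate": 0.75},
]


def mean_conversion_by_source(records):
    """Group records by traffic source and average their conversion rate."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["traffic_source"]].append(record["conversion_rate"])
    return {source: mean(values) for source, values in grouped.items()}


summary = mean_conversion_by_source(rows)
for source, rate in sorted(summary.items()):
    print(f"{source}: {rate:.2f}")
```

    The same grouping pattern would apply to the other numeric columns, such as bounce rate or session duration.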

    Acknowledgments

    This dataset was generated for educational purposes and is not from a real website. It serves as a tool for learning data analysis and machine learning techniques.

  2. Google Analytics Sample

    • kaggle.com
    zip
    Updated Sep 19, 2019
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Sep 19, 2019
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Google (http://google.com/)
    Authors
    Google BigQuery
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.

    Content

    The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website. It includes the following kinds of information:

    • Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc.
    • Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc.
    • Transactional data: information about the transactions that occur on the Google Merchandise Store website.

    Fork this kernel to get started.

    Acknowledgements

    Data from: https://bigquery.cloud.google.com/table/bigquery-public-data:google_analytics_sample.ga_sessions_20170801

    Banner Photo by Edho Pratama from Unsplash.

    Inspiration

    What is the total number of transactions generated per device browser in July 2017?

    The real bounce rate is defined as the percentage of visits with a single pageview. What was the real bounce rate per traffic source?

    What was the average number of product pageviews for users who made a purchase in July 2017?

    What was the average number of product pageviews for users who did not make a purchase in July 2017?

    What was the average total transactions per user that made a purchase in July 2017?

    What is the average amount of money spent per session in July 2017?

    What is the sequence of pages viewed?
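
    The bounce-rate definition above (the percentage of visits with a single pageview) can be sketched as follows. This is a minimal in-memory illustration, not the BigQuery query the questions call for; the `(source, pageviews)` tuples are invented sample data, and the real dataset stores these values in its own schema.

```python
from collections import Counter

# Hypothetical sessions: (traffic source, pageviews in the session).
sessions = [
    ("organic", 1),
    ("organic", 4),
    ("paid", 1),
    ("paid", 1),
    ("display", 3),
]


def bounce_rate_by_source(session_rows):
    """Return {source: percentage of sessions with exactly one pageview}."""
    totals = Counter()
    bounces = Counter()
    for source, pageviews in session_rows:
        totals[source] += 1
        if pageviews == 1:
            bounces[source] += 1
    return {source: 100.0 * bounces[source] / totals[source] for source in totals}


rates = bounce_rate_by_source(sessions)
for source, rate in sorted(rates.items()):
    print(f"{source}: {rate:.1f}%")
```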

  3. Walmart.com Daily Traffic Statistics 2025

    • redstagfulfillment.com
    html
    Updated May 19, 2025
    Cite
    Red Stag Fulfillment (2025). Walmart.com Daily Traffic Statistics 2025 [Dataset]. https://redstagfulfillment.com/how-many-daily-visits-does-walmart-receive/
    Explore at:
    Available download formats: html
    Dataset updated
    May 19, 2025
    Dataset authored and provided by
    Red Stag Fulfillment
    Time period covered
    2020 - 2025
    Area covered
    United States
    Variables measured
    Daily website visits, Session duration metrics, Traffic source breakdown, Geographic traffic patterns, Seasonal traffic variations, Mobile vs desktop traffic distribution
    Description

    Comprehensive dataset analyzing Walmart.com's daily website traffic, including 16.7 million daily visits, device distribution, geographic patterns, and competitive benchmarking data.

  4. Recipe Site Traffic: Analysis & Prediction

    • kaggle.com
    Updated Sep 21, 2025
    Cite
    Michael Matta (2025). Recipe Site Traffic: Analysis & Prediction [Dataset]. https://www.kaggle.com/datasets/michaelmatta0/recipe-site-traffic-analysis-and-prediction
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 21, 2025
    Dataset provided by
    Kaggle
    Authors
    Michael Matta
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset originates from DataCamp. Many users have reposted copies of the CSV on Kaggle, but most of those uploads omit the original instructions, business context, and problem framing. In this upload, I’ve included that missing context in the About Dataset so the reader of my notebook or any other notebook can fully understand how the data was intended to be used and the intended problem framing.

    Note: I have also uploaded a visualization of the workflow I personally took to tackle this problem, but it is not part of the dataset itself. Additionally, I created a PowerPoint presentation based on my work in the notebook, which you can download from here:
    PPTX Presentation

    Recipe Site Traffic

    From: Head of Data Science
    Received: Today
    Subject: New project from the product team

    Hey!

    I have a new project for you from the product team. Should be an interesting challenge. You can see the background and request in the email below.

    I would like you to perform the analysis and write a short report for me. I want to be able to review your code as well as read your thought process for each step. I also want you to prepare and deliver the presentation for the product team - you are ready for the challenge!

    They want us to predict which recipes will be popular 80% of the time and minimize the chance of showing unpopular recipes. I don't think that is realistic in the time we have, but do your best and present whatever you find.

    You can find more details about what I expect you to do here. And information on the data here.

    I will be on vacation for the next couple of weeks, but I know you can do this without my support. If you need to make any decisions, include them in your work and I will review them when I am back.

    Good Luck!

    From: Product Manager - Recipe Discovery
    To: Head of Data Science
    Received: Yesterday
    Subject: Can you help us predict popular recipes?

    Hi,

    We haven't met before but I am responsible for choosing which recipes to display on the homepage each day. I have heard about what the data science team is capable of and I was wondering if you can help me choose which recipes we should display on the home page?

    At the moment, I choose my favorite recipe from a selection and display that on the home page. We have noticed that traffic to the rest of the website goes up by as much as 40% if I pick a popular recipe. But I don't know how to decide if a recipe will be popular. More traffic means more subscriptions so this is really important to the company.

    Can your team:
    • Predict which recipes will lead to high traffic?
    • Correctly predict high traffic recipes 80% of the time?

    We need to make a decision on this soon, so I need you to present your results to me by the end of the month. Whatever your results, what do you recommend we do next?

    Look forward to seeing your presentation.
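
    One possible reading of "correctly predict high traffic recipes 80% of the time" is precision: of the recipes predicted to be high traffic, the share that actually were. The brief does not name a metric, so this interpretation, the function name, and the sample labels below are all assumptions made for illustration.

```python
def precision_high_traffic(actual, predicted):
    """Share of recipes predicted as high traffic that really were high traffic."""
    true_positives = sum(1 for a, p in zip(actual, predicted) if p and a)
    false_positives = sum(1 for a, p in zip(actual, predicted) if p and not a)
    if true_positives + false_positives == 0:
        raise ValueError("no recipe was predicted as high traffic")
    return true_positives / (true_positives + false_positives)


# Hypothetical labels: True means the recipe produced high traffic.
actual = [True, True, False, True, False, True]
predicted = [True, True, True, True, False, False]

score = precision_high_traffic(actual, predicted)
print(f"precision: {score:.2f} (target in the brief: 0.80)")
```

    Precision fits the stated wish to minimize the chance of showing unpopular recipes; a different reading of the request (for example, recall) would lead to a different check.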

    About Tasty Bytes

    Tasty Bytes was founded in 2020 in the midst of the Covid Pandemic. The world wanted inspiration so we decided to provide it. We started life as a search engine for recipes, helping people to find ways to use up the limited supplies they had at home.

    Now, over two years on, we are a fully fledged business. For a monthly subscription we will put together a full meal plan to ensure you and your family are getting a healthy, balanced diet whatever your budget. Subscribe to our premium plan and we will also deliver the ingredients to your door.

    Example Recipe

    This is an example of how a recipe may appear on the website, we haven't included all of the steps but you should get an idea of what visitors to the site see.

    Tomato Soup

    Servings: 4
    Time to make: 2 hours
    Category: Lunch/Snack
    Cost per serving: $

    Nutritional Information (per serving)
    • Calories 123
    • Carbohydrate 13g
    • Sugar 1g
    • Protein 4g

    Ingredients
    • Tomatoes
    • Onion
    • Carrot
    • Vegetable Stock

    Method
    1. Cut the tomatoes into quarters….

    Data Information

    The product manager has tried to make this easier for us and provided data for each recipe, as well as whether there was high traffic when the recipe was featured on the home page.

    As you will see, they haven't given us all of the information they have about each recipe.

    You can find the data here.

    I will let you decide how to process it, just make sure you include all your decisions in your report.

    Don't forget to double check the data really does match what they say - it might not.

    Column Name    Details
    recipe         Numeric, unique identifier of recipe
    calories       Numeric, number of calories
    carbohydrate   Numeric, amount of carbohydrates in grams
    sugar          Numeric, amount of sugar in grams
    protein        Numeric, amount of prote...
  5. Website Analytics

    • catalog.data.gov
    • data.nola.gov
    • +4 more
    Updated Jun 28, 2025
    Cite
    data.nola.gov (2025). Website Analytics [Dataset]. https://catalog.data.gov/dataset/website-analytics
    Explore at:
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    data.nola.gov
    Description

    This data about nola.gov provides a window into how people are interacting with the City of New Orleans online. The data comes from a unified Google Analytics account for New Orleans. We do not track individuals and we anonymize the IP addresses of all visitors.

  6. Amazon Daily Traffic Statistics 2025

    • redstagfulfillment.com
    html
    Updated May 19, 2025
    Cite
    Red Stag Fulfillment (2025). Amazon Daily Traffic Statistics 2025 [Dataset]. https://redstagfulfillment.com/how-many-daily-visits-does-amazon-receive/
    Explore at:
    Available download formats: html
    Dataset updated
    May 19, 2025
    Dataset authored and provided by
    Red Stag Fulfillment
    Time period covered
    2019 - 2025
    Area covered
    Global
    Variables measured
    Daily website visits, Monthly traffic volume, Geographic distribution, Seasonal traffic patterns, Traffic sources breakdown, Mobile vs desktop traffic split
    Description

    Comprehensive dataset analyzing Amazon's daily website visits, traffic patterns, seasonal trends, and comparative analysis with other ecommerce platforms based on May 2025 data.

  7. Daily website visitors (time series regression)

    • kaggle.com
    zip
    Updated Aug 20, 2020
    Cite
    Bob Nau (2020). Daily website visitors (time series regression) [Dataset]. https://www.kaggle.com/bobnau/daily-website-visitors
    Explore at:
    Available download formats: zip (35736 bytes)
    Dataset updated
    Aug 20, 2020
    Authors
    Bob Nau
    Description

    Context

    This file contains 5 years of daily time series data for several measures of traffic on a statistical forecasting teaching notes website whose alias is statforecasting.com. The variables have complex seasonality that is keyed to the day of the week and to the academic calendar. The patterns you see here are similar in principle to what you would see in other daily data with day-of-week and time-of-year effects. Some good exercises are to develop a 1-day-ahead forecasting model, a 7-day-ahead forecasting model, and an entire-next-week forecasting model (i.e., next 7 days) for unique visitors.
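
    A simple baseline for these forecasting exercises is the seasonal naive method, which predicts each future day with the value observed on the same weekday one week earlier. The sketch below is a generic baseline, not a method taken from the dataset author; the function name and the sample counts are assumptions.

```python
def seasonal_naive_forecast(history, horizon=7, season_length=7):
    """Forecast `horizon` steps by repeating the last full season of `history`."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


# Hypothetical daily unique-visitor counts for two consecutive weeks.
history = [10, 20, 30, 40, 50, 60, 70, 11, 21, 31, 41, 51, 61, 71]

one_day_ahead = seasonal_naive_forecast(history, horizon=1)
next_week = seasonal_naive_forecast(history, horizon=7)
print(one_day_ahead, next_week)
```

    Because it captures only the day-of-week pattern and ignores the academic-calendar effect, this baseline is mainly useful as a reference point for judging more elaborate models.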

    Content

    The variables are daily counts of page loads, unique visitors, first-time visitors, and returning visitors to an academic teaching notes website. There are 2167 rows of data spanning the date range from September 14, 2014, to August 19, 2020. A visit is defined as a stream of hits on one or more pages on the site on a given day by the same user, as identified by IP address. Multiple individuals with a shared IP address (e.g., in a computer lab) are considered as a single user, so real users may be undercounted to some extent. A visit is classified as "unique" if a hit from the same IP address has not come within the last 6 hours. Returning visitors are identified by cookies if those are accepted. All others are classified as first-time visitors, so the count of unique visitors is the sum of the counts of returning and first-time visitors by definition. The data was collected through a traffic monitoring service known as StatCounter.
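
    The 6-hour rule for "unique" visits can be illustrated with a short sketch. It is one interpretation of the description above, not StatCounter's actual implementation: it assumes the window is measured from the most recent hit by the same IP address and that a gap of exactly 6 hours counts as a new visit. The IP addresses and timestamps are invented.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)  # assumption: a gap of exactly 6 hours starts a new visit


def count_unique_visits(hits):
    """Count hits whose IP address had no earlier hit within the previous 6 hours."""
    last_seen = {}
    unique = 0
    for ip, timestamp in sorted(hits, key=lambda hit: hit[1]):
        previous = last_seen.get(ip)
        if previous is None or timestamp - previous >= WINDOW:
            unique += 1
        last_seen[ip] = timestamp  # every hit refreshes the window for that IP
    return unique


# Hypothetical hits: (IP address, time of hit).
hits = [
    ("192.0.2.1", datetime(2020, 1, 1, 8, 0)),
    ("192.0.2.1", datetime(2020, 1, 1, 9, 0)),   # within 6 hours: same visit
    ("192.0.2.2", datetime(2020, 1, 1, 10, 0)),
    ("192.0.2.1", datetime(2020, 1, 1, 16, 0)),  # 7 hours after last hit: new visit
]

unique_visits = count_unique_visits(hits)
print(unique_visits)
```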

    Inspiration

    This file and a number of other sample datasets can also be found on the website of RegressIt, a free Excel add-in for linear and logistic regression which I originally developed for use in the course whose website generated the traffic data given here. If you use Excel to some extent as well as Python or R, you might want to try it out on this dataset.

  8. Google Analytics Sample

    • console.cloud.google.com
    Updated Jul 15, 2017
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Obfuscated%20Google%20Analytics%20360%20data&hl=en_GB (2017). Google Analytics Sample [Dataset]. https://console.cloud.google.com/marketplace/product/obfuscated-ga360-data/obfuscated-ga360-data?hl=en_GB
    Explore at:
    Dataset updated
    Jul 15, 2017
    Dataset provided by
    Google (http://google.com/)
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise, in BigQuery. It’s a great way to analyze business data and learn the benefits of using BigQuery to analyze Analytics 360 data.

    The data is typical of what an ecommerce website would see and includes the following information:

    • Traffic source data: information about where website visitors originate, including data about organic traffic, paid search traffic, and display traffic.
    • Content data: information about the behavior of users on the site, such as URLs of pages that visitors look at, how they interact with content, etc.
    • Transactional data: information about the transactions on the Google Merchandise Store website.

    Limitations: All users have view access to the dataset. This means you can query the dataset and generate reports but you cannot complete administrative tasks. Data for some fields is obfuscated, such as fullVisitorId, or removed, such as clientId, adWordsClickInfo and geoNetwork. “Not available in demo dataset” will be returned for STRING values and “null” will be returned for INTEGER values when querying the fields containing no data.

    This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
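
    Because removed STRING fields come back as the literal text “Not available in demo dataset”, it can help to convert that placeholder to a proper missing value before analysis. The sketch below works on plain dictionaries and is an assumption about how query results might be post-processed; it does not call BigQuery, and the sample row is invented.

```python
PLACEHOLDER = "Not available in demo dataset"


def clean_row(row):
    """Replace the demo-dataset placeholder string with None; keep other values."""
    return {key: (None if value == PLACEHOLDER else value) for key, value in row.items()}


# Hypothetical query result row; real rows have many more fields.
raw_row = {
    "fullVisitorId": "1234567890",
    "clientId": "Not available in demo dataset",
    "pageviews": 3,
}

cleaned = clean_row(raw_row)
print(cleaned)
```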

  9. Website Metrics

    • gimi9.com
    • datasets.ai
    • +2 more
    Updated Apr 1, 2025
    Cite
    (2025). Website Metrics [Dataset]. https://gimi9.com/dataset/data-gov_website-metrics/
    Explore at:
    Dataset updated
    Apr 1, 2025
    Description

    Per the Federal Digital Government Strategy, the Department of Homeland Security Metrics Plan, and the Open FEMA Initiative, FEMA is providing the following web performance metrics with regards to FEMA.gov.

    Information in this dataset includes total visits, avg visit duration, pageviews, unique visitors, avg pages/visit, avg time/page, bounce rate, visits by source, visits by Social Media Platform, and metrics on new vs returning visitors.

    External Affairs strives to make all communications accessible. If you have any challenges accessing this information, please contact FEMAWebTeam@fema.dhs.gov.

  10. website_visit_webalizer

    • kaggle.com
    zip
    Updated Mar 24, 2024
    Cite
    Erin ÇOBAN (2024). website_visit_webalizer [Dataset]. https://www.kaggle.com/datasets/erinoban/website-visit-webalizer
    Explore at:
    Available download formats: zip (1082 bytes)
    Dataset updated
    Mar 24, 2024
    Authors
    Erin ÇOBAN
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset was obtained from real website visit data. It contains monthly visit information for the tr-metaverse.com website, which is hosted on Linux: 30 days of visit data, from the beginning of the month to the end of the month. It consists of the fields Day, Hit, Hit%, Files, Files%, Pages, Pages%, Visit, Visit%, Sites, Sites%, Kbytes, and Kbytes%. Values with a % sign next to them are percentages.

    • Day: Day index number (which day of the month)
    • Hit: How much access there is overall
    • Hit%: Overall access as a percentage
    • Files: How many visits were made as files
    • Files%: Percentage in files
    • Pages, Pages%
    • Visit: Number of unique visitors
    • Visit%: Unique visitor rate
    • Sites, Sites%
    • Kbytes: How much data has been downloaded
    • Kbytes%: Percentage in data

  11. Swash Web Browsing Clickstream Data - 1.5M Worldwide Users - GDPR Compliant

    • datarade.ai
    .csv, .xls
    Updated Jun 27, 2023
    + more versions
    Cite
    Swash (2023). Swash Web Browsing Clickstream Data - 1.5M Worldwide Users - GDPR Compliant [Dataset]. https://datarade.ai/data-products/swash-blockchain-bitcoin-and-web3-enthusiasts-swash
    Explore at:
    Available download formats: .csv, .xls
    Dataset updated
    Jun 27, 2023
    Dataset authored and provided by
    Swash
    Area covered
    Liechtenstein, Russian Federation, Saint Vincent and the Grenadines, Belarus, Jamaica, Uzbekistan, India, Monaco, Jordan, Latvia
    Description

    Unlock the Power of Behavioural Data with GDPR-Compliant Clickstream Insights.

    Swash clickstream data offers a comprehensive and GDPR-compliant dataset sourced from users worldwide, encompassing both desktop and mobile browsing behaviour. Here's an in-depth look at what sets us apart and how our data can benefit your organisation.

    User-Centric Approach: Unlike traditional data collection methods, we take a user-centric approach by rewarding users for the data they willingly provide. This unique methodology ensures transparent data collection practices, encourages user participation, and establishes trust between data providers and consumers.

    Wide Coverage and Varied Categories: Our clickstream data covers diverse categories, including search, shopping, and URL visits. Whether you are interested in understanding user preferences in e-commerce, analysing search behaviour across different industries, or tracking website visits, our data provides a rich and multi-dimensional view of user activities.

    GDPR Compliance and Privacy: We prioritise data privacy and strictly adhere to GDPR guidelines. Our data collection methods are fully compliant, ensuring the protection of user identities and personal information. You can confidently leverage our clickstream data without compromising privacy or facing regulatory challenges.

    Market Intelligence and Consumer Behaviour: Gain deep insights into market intelligence and consumer behaviour using our clickstream data. Understand trends, preferences, and user behaviour patterns by analysing the comprehensive user-level, time-stamped raw or processed data feed. Uncover valuable information about user journeys, search funnels, and paths to purchase to enhance your marketing strategies and drive business growth.

    High-Frequency Updates and Consistency: We provide high-frequency updates and consistent user participation, offering both historical data and ongoing daily delivery. This ensures you have access to up-to-date insights and a continuous data feed for comprehensive analysis. Our reliable and consistent data empowers you to make accurate and timely decisions.

    Custom Reporting and Analysis: We understand that every organisation has unique requirements. That's why we offer customisable reporting options, allowing you to tailor the analysis and reporting of clickstream data to your specific needs. Whether you need detailed metrics, visualisations, or in-depth analytics, we provide the flexibility to meet your reporting requirements.

    Data Quality and Credibility: We take data quality seriously. Our data sourcing practices are designed to ensure responsible and reliable data collection. We implement rigorous data cleaning, validation, and verification processes, guaranteeing the accuracy and reliability of our clickstream data. You can confidently rely on our data to drive your decision-making processes.

  12. Reports of non-emergency problems submitted by users of Get It Done

    • data.sandiego.gov
    Updated Feb 24, 2022
    Cite
    (2022). Reports of non-emergency problems submitted by users of Get It Done [Dataset]. https://data.sandiego.gov/datasets/get-it-done-311/
    Explore at:
    Available download formats: csv
    Dataset updated
    Feb 24, 2022
    Description

    The Get It Done program allows residents and visitors to report certain types of non-emergency problems to the City using the Get It Done mobile app, web app, or by telephone. This dataset contains all Get It Done reports the City has received since the program launched in May 2016.

    New! We have reorganized the data into a single file of currently open reports and closed reports by year. Users who would prefer to get reports by problem type should refer to the datasets for:

    • 72-hour parking violations
    • Graffiti
    • Illegal Dumping
    • Potholes

    The scope of this data is limited to information from the reports citizen users submit through Get It Done. The data includes fields for the date and time a report was submitted, what the problem was, the location of the problem, and the date when the user was notified that the City addressed the problem. This data does not include details about any work performed to fix a problem or the date and time work was completed. Reports that are referred outside of the Get It Done system have a status of “Referred”.

    Please note that this data includes every user-submitted report and should not be considered an official record of City maintenance work. For example, users might submit problems that have already been reported, that are the responsibility of another government agency or private business, that cannot be found or verified, or that are already scheduled to be fixed in a long-term maintenance plan. The details about how the City addressed each report are outside of the scope of this dataset.

    If you have any questions about this data, please contact pandatech@sandiego.gov. If you have questions about your Get It Done report, please refer to your confirmation email.

  13. GreenThumb Site Visits

    • catalog.data.gov
    • data.cityofnewyork.us
    • +1 more
    Updated Nov 22, 2025
    + more versions
    Cite
    data.cityofnewyork.us (2025). GreenThumb Site Visits [Dataset]. https://catalog.data.gov/dataset/greenthumb-site-visits
    Explore at:
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    data.cityofnewyork.us
    Description

    Data Dictionary: https://docs.google.com/spreadsheets/d/1ItvGzNG8O_Yj97Tf6am4T-QyhnxP-BeIRjm7ZaUeAxs/edit#gid=1499621902

    GreenThumb provides programming and material support to over 550 community gardens in New York City. NYC Parks GreenThumb staff visit all active community gardens under the jurisdiction of NYC Parks once each calendar year, subject to staff capacity. These site visits typically occur during the summer months and representatives of licensed garden groups are invited to attend. During these site visits, NYC Parks GreenThumb staff observe and record quantitative and qualitative information related to the physical status of the garden, as well as its ongoing operation, maintenance, and programming. This information is used by NYC Parks GreenThumb to inform maintenance needs at the garden and to help NYC Parks GreenThumb understand the needs of garden groups so that we can plan accordingly. In addition, this information is necessary for NYC Parks GreenThumb to confirm that publicly accessible community gardens under its jurisdiction are being operated in a safe manner and in accordance with the NYC Parks GreenThumb License Agreement and applicable NYS and NYC laws and regulations. NYC Parks GreenThumb may conduct additional site visits as deemed necessary.

  14. A web tracking data set of online browsing behavior of 2,148 users

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, txt +1
    Updated Oct 9, 2025
    + more versions
    Cite
    Juhi Kulshrestha; Marcos Oliveira; Orkut Karacalik; Denis Bonnay; Claudia Wagner (2025). A web tracking data set of online browsing behavior of 2,148 users [Dataset]. http://doi.org/10.5281/zenodo.4757574
    Explore at:
    Available download formats: zip, txt, application/gzip
    Dataset updated
    Oct 9, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Juhi Kulshrestha; Marcos Oliveira; Orkut Karacalik; Denis Bonnay; Claudia Wagner
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This anonymized data set consists of one month's (October 2018) web tracking data of 2,148 German users. For each user, the data contains the anonymized URL of each webpage the user visited, the domain of the webpage, and the category of the domain (one of 41 distinct categories). In total, these 2,148 users made 9,151,243 URL visits, spanning 49,918 unique domains. For each user in our data set, we have self-reported information (collected via a survey) about their gender and age.

    We acknowledge the support of Respondi AG, which provided the web tracking and survey data free of charge for research purposes, with special thanks to François Erner and Luc Kalaora at Respondi for their insights and help with data extraction.

    The data set is analyzed in the following paper:

    • Kulshrestha, J., Oliveira, M., Karacalik, O., Bonnay, D., Wagner, C. "Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data." Proceedings of the International AAAI Conference on Web and Social Media. 2021. https://arxiv.org/abs/2012.15112.

    The code used to analyze the data is also available at https://github.com/gesiscss/web_tracking.

    If you use data or code from this repository, please cite the paper above and the Zenodo link.

    Users are advised that some domains in this data set may link to potentially questionable or inappropriate content. The domains have not been individually reviewed, as content verification was not the primary objective of this data set. Therefore, user discretion is strongly recommended when accessing or scraping any content from these domains.

  15. Traffic Exchange Analysis Dataset 2024

    • sparktraffic.com
    Updated Jun 10, 2024
    Cite
    SparkTraffic (2024). Traffic Exchange Analysis Dataset 2024 [Dataset]. https://www.sparktraffic.com/blog/reason-not-to-use-traffic-exchanges
    Explore at:
    Dataset updated
    Jun 10, 2024
    Dataset authored and provided by
    SparkTraffic
    Description

    Research data on traffic exchange limitations including low-quality traffic characteristics, search engine penalty risks, and comparison with effective alternatives like SEO and content marketing strategies.

  16. Revenue Generated by Measure ULA

    • catalog.data.gov
    • data.lacity.org
    Updated Nov 8, 2025
    Cite
    data.lacity.org (2025). Revenue Generated by Measure ULA [Dataset]. https://catalog.data.gov/dataset/revenue-generated-by-measure-ula
    Explore at:
    Dataset updated
    Nov 8, 2025
    Dataset provided by
    data.lacity.org
    Description

    Disclaimer: PLEASE READ THIS AGREEMENT CAREFULLY BEFORE USING THIS DATA SET. BY USING THIS DATA SET, YOU ARE CONSENTING TO BE OBLIGATED AND BECOME A PARTY TO THIS AGREEMENT. IF YOU DO NOT AGREE TO THE TERMS AND CONDITIONS BELOW YOU SHOULD NOT ACCESS OR USE THIS DATA SET. This data set is presented as a public service that provides Internet accessibility to information provided by the City of Los Angeles and to other City, State, and Federal information. Due to the dynamic nature of the information contained within this data set and the data set’s reliance on information from outside sources, the City of Los Angeles does not guarantee the accuracy or reliability of the information transmitted from this data set. This data set and all materials contained on it are distributed and transmitted on an “as is” and “as available” basis without any warranties of any kind, whether expressed or implied, including without limitation, warranties of title or implied warranties of merchantability or fitness for a particular purpose. The City of Los Angeles is not responsible for any special, indirect, incidental, punitive, or consequential damages that may arise from the use of, or the inability to use the data set and/or materials contained on the data set, or that result from mistakes, omissions, interruptions, deletion of files, errors, defects, delays in operation, or transmission, or any failure of performance, whether the material is provided by the City of Los Angeles or a third-party. The City of Los Angeles reserves the right to modify, update, or alter these Terms and Conditions of use at any time. Your continued use of this Site constitutes your agreement to comply with such modifications. 
The information provided on this data set, and its links to other related web sites, are provided as a courtesy to our web site visitors only, and are in no manner an endorsement, recommendation, or approval of any person, any product, or any service contained on any other web site.

    Description: Monthly revenue generated by conveyances of real property over $5 million, from when applicable transfer tax collection began on April 1, 2023 to present. Consistent with the ULA ordinance, the property sale value thresholds and their corresponding tax rates will be adjusted annually based on the Bureau of Labor Statistics Chained Consumer Price Index.

  17. Traffic Crashes Resulting in Injury (from DataSF, pulled monthly)

    • hub.arcgis.com
    Updated Nov 5, 2025
    + more versions
    Cite
    City and County of San Francisco (2025). Traffic Crashes Resulting in Injury (from DataSF, pulled monthly) [Dataset]. https://hub.arcgis.com/maps/a24788281a484e08bd662828b4e0718e
    Explore at:
    Dataset updated
    Nov 5, 2025
    Dataset authored and provided by
    City and County of San Francisco
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    Redirect Notice: The website https://transbase.sfgov.org/ is no longer in operation. Visitors to Transbase will be redirected to this page where they can view, visualize, and download Traffic Crash data.

    A. SUMMARY

    This table contains all crashes resulting in an injury in the City of San Francisco. Fatality year-to-date crash data is obtained from the Office of the Chief Medical Examiner (OME) death records, and only includes those cases that meet the San Francisco Vision Zero Fatality Protocol maintained by the San Francisco Department of Public Health (SFDPH), San Francisco Police Department (SFPD), and San Francisco Municipal Transportation Agency (SFMTA). Injury crash data is obtained from SFPD’s Interim Collision System for 2018 through the current year-to-date, Crossroads Software Traffic Collision Database (CR) for years 2013-2017 and the Statewide Integrated Transportation Record System (SWITRS) maintained by the California Highway Patrol for all years prior to 2013. Only crashes with valid geographic information are mapped. All geocodable crash data is represented on the simplified San Francisco street centerline model maintained by the Department of Public Works (SFDPW). Collision injury data is queried and aggregated on a quarterly basis. Crashes occurring at complex intersections with multiple roadways are mapped onto a single point and injury and fatality crashes occurring on highways are excluded.

    The crash, party, and victim tables have a relational structure. The traffic crashes table contains information on each crash, one record per crash. The party table contains information from all parties involved in the crashes, one record per party. Parties are individuals involved in a traffic crash including drivers, pedestrians, bicyclists, and parked vehicles. The victim table contains information about each party injured in the collision, including any passengers. Injury severity is included in the victim table. For example, a crash occurs (1 record in the crash table) that involves a driver party and a pedestrian party (2 records in the party table). Only the pedestrian is injured and thus is the only victim (1 record in the victim table). To learn more about the traffic injury datasets, see the TIMS documentation.

    B. HOW THE DATASET IS CREATED

    Traffic crash injury data is collected from the California Highway Patrol 555 Crash Report as submitted by the police officer within 30 days after the crash occurred. All fields that match the SWITRS data schema are programmatically extracted, de-identified, geocoded, and loaded into TransBASE. See Section D below for details regarding TransBASE.

    C. UPDATE PROCESS

    After review by SFPD and SFDPH staff, the data is made publicly available approximately a month after the end of the previous quarter (May for Q1, August for Q2, November for Q3, and February for Q4).

    D. HOW TO USE THIS DATASET

    This data is being provided as public information as defined under San Francisco and California public records laws. SFDPH, SFMTA, and SFPD cannot limit or restrict the use of this data or its interpretation by other parties in any way. Where the data is communicated, distributed, reproduced, mapped, or used in any other way, the user should acknowledge TransBASE.sfgov.org as the source of the data, provide a reference to the original data source where also applicable, include the date the data was pulled, and note any caveats specified in the associated metadata documentation provided. However, users should not attribute their analysis or interpretation of this data to the City of San Francisco. While the data has been collected and/or produced for the use of the City of San Francisco, it cannot guarantee its accuracy or completeness. Accordingly, the City of San Francisco, including SFDPH, SFMTA, and SFPD make no representation as to the accuracy of the information or its suitability for any purpose and disclaim any liability for omissions or errors that may be contained therein. As all data is associated with methodological assumptions and limitations, the City recommends that users review methodological documentation associated with the data prior to its analysis, interpretation, or communication.

    This dataset can also be queried on the TransBASE Dashboard. TransBASE is a geospatially enabled database maintained by SFDPH that currently includes over 200 spatially referenced variables from multiple agencies and across a range of geographic scales, including infrastructure, transportation, zoning, sociodemographic, and collision data, all linked to an intersection or street segment. TransBASE facilitates a data-driven approach to understanding and addressing transportation-related health issues, informed by a large and growing evidence base regarding the importance of transportation system design and land use decisions for health. TransBASE’s purpose is to inform public and private efforts to improve transportation system safety, sustainability, community health and equity in San Francisco.

    E. RELATED DATASETS

    • Traffic Crashes Resulting in Injury: Parties Involved
    • Traffic Crashes Resulting in Injury: Victims Involved
    • TransBASE Dashboard
    • iSWITRS
    • TIMS

    Data pushed to ArcGIS Online on November 5, 2025 at 4:19 PM by SFGIS.
    Data from: https://data.sfgov.org/d/ubvf-ztfx

    Description of dataset columns:

     unique_id
     unique table row identifier
    
    
     cnn_intrsctn_fkey
     nearest intersection centerline node key
    
    
     cnn_sgmt_fkey
     nearest street centerline segment key (empty if crash occurred at intersection)
    
    
     case_id_pkey
     unique crash report number
    
    
     tb_latitude
     latitude of crash (WGS 84)
    
    
     tb_longitude
     longitude of crash (WGS 84)
    
    
     geocode_source
     geocode source
    
    
     geocode_location
     geocode location
    
    
     collision_datetime
     the date and time when the crash occurred
    
    
     collision_date
     the date when the crash occurred
    
    
     collision_time
     the time when the crash occurred (24 hour time)
    
    
     accident_year
     the year when the crash occurred
    
    
     month
     month crash occurred
    
    
     day_of_week
     day of the week crash occurred
    
    
     time_cat
     generic time categories
    
    
     juris
     jurisdiction
    
    
     officer_id
     officer ID
    
    
     reporting_district
     SFPD reporting district
    
    
     beat_number
     SFPD beat number
    
    
     primary_rd
     the road the crash occurred on
    
    
     secondary_rd
     a secondary reference road that DISTANCE and DIRECT are measured from
    
    
     distance
     offset distance from secondary road
    
    
     direction
     direction of offset distance
    
    
     weather_1
     the weather condition at the time of the crash
    
    
     weather_2
     the weather condition at the time of the crash, if a second description is necessary
    
    
     collision_severity
     the injury level severity of the crash (highest level of injury in crash)
    
    
     type_of_collision
     type of crash
    
    
     mviw
     motor vehicle involved with
    
    
     ped_action
     pedestrian action involved
    
    
     road_surface
     road surface
    
    
     road_cond_1
     road condition
    
    
     road_cond_2
     road condition, if a second description is necessary
    
    
     lighting
     lighting at time of crash
    
    
     control_device
     control device status
    
    
     intersection
     indicates whether the crash occurred in an intersection
    
    
     vz_pcf_code
     California vehicle code primary collision factor violated
    
    
     vz_pcf_group
     groupings of similar vehicle codes violated
    
    
     vz_pcf_description
     description of vehicle code violated
    
    
     vz_pcf_link
     link to California vehicle code section
    
    
     number_killed
     counts victims in the crash with degree of injury of fatal
    
    
     number_injured
     counts victims in the crash with degree of injury of severe, visible, or complaint of pain
    
    
     street_view
     link to Google Streetview
    
    
     dph_col_grp
     generic crash groupings based on parties involved
    
    
     dph_col_grp_description
     description of crash groupings
    
    
     party_at_fault
     party number indicated as being at fault
    
    
     party1_type
     party 1 vehicle type
    
    
     party1_dir_of_travel
     party 1 direction of travel
    
    
     party1_move_pre_acc
     party 1 movement preceding crash
    
    
     party2_type
     party 2 vehicle type (empty if no party 2)
    
    
     party2_dir_of_travel
     party 2 direction of travel (empty if no party 2)
    
    
     party2_move_pre_acc
     party 2 movement preceding crash (empty if no party 2)
    
    
     point
     geometry type of crash location
    
    
     data_as_of
     date data added to the source system
    
    
     data_updated_at
     date data last updated in the source system
    
    
     data_loaded_at
     date data last loaded here (in the open data portal)
    
    
     analysis_neighborhood
    
    
    
     supervisor_district
    
    
    
     police_district
    
    
    
     Current Police Districts
     This column was automatically created in order to record in what polygon from the dataset 'Current Police Districts' (qgnn-b9vv) the point in column 'point' is located. This enables the creation of region maps (choropleths) in the visualization canvas and data lens.
    
    
     Current Supervisor Districts
     This column was automatically created in order to record in what polygon from the dataset 'Current Supervisor Districts' (26cr-cadq) the point in column 'point' is located. This
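
    The relational structure of the crash, party, and victim tables described above can be illustrated with a small sketch. It reproduces the example from the description (1 crash, 2 parties, 1 victim) using plain Python lists. Only case_id_pkey is taken from the column list; the other field names (party_number, party_type, injury) and all values are hypothetical placeholders, and the real party and victim tables may be keyed differently.

```python
# Toy rows mirroring the example in the description: 1 crash, 2 parties, 1 victim.
# Only case_id_pkey comes from the column list; every other field name and value
# here is a hypothetical placeholder.
crashes = [{"case_id_pkey": "C1"}]
parties = [
    {"case_id_pkey": "C1", "party_number": 1, "party_type": "Driver"},
    {"case_id_pkey": "C1", "party_number": 2, "party_type": "Pedestrian"},
]
victims = [
    {"case_id_pkey": "C1", "party_number": 2, "injury": "Complaint of Pain"},
]


def rows_for(case_id, table):
    """Return the rows of a child table that belong to one crash."""
    return [row for row in table if row["case_id_pkey"] == case_id]


def summarize(case_id):
    """Count the party and victim records linked to a single crash record."""
    return {
        "case_id_pkey": case_id,
        "parties": len(rows_for(case_id, parties)),
        "victims": len(rows_for(case_id, victims)),
    }


summary = summarize("C1")
print(summary)
```

    The point of the sketch is only the one-to-many shape: one crash row fans out to several party rows, and a subset of those parties appear as victim rows.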
    
  18. MetaPhinder—Identifying Bacteriophage Sequences in Metagenomic Data Sets

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Vanessa Isabell Jurtz; Julia Villarroel; Ole Lund; Mette Voldby Larsen; Morten Nielsen (2023). MetaPhinder—Identifying Bacteriophage Sequences in Metagenomic Data Sets [Dataset]. http://doi.org/10.1371/journal.pone.0163111
    Explore at:
    tiffAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Vanessa Isabell Jurtz; Julia Villarroel; Ole Lund; Mette Voldby Larsen; Morten Nielsen
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes, making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e., contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to out-perform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder.

  19. Virginia Springs/Groundwater Layers - 2023

    • data.virginia.gov
    • hub.arcgis.com
    • +3more
    Updated Jul 29, 2025
    Cite
    Virginia Department of Environmental Quality (2025). Virginia Springs/Groundwater Layers - 2023 [Dataset]. https://data.virginia.gov/dataset/virginia-springs-groundwater-layers-2023
    Explore at:
    html, arcgis geoservices rest apiAvailable download formats
    Dataset updated
    Jul 29, 2025
    Dataset authored and provided by
    Virginia Department of Environmental Quality https://deq.virginia.gov/
    Area covered
    Hot Springs
    Description
    The VDEQ Spring SITES database contains data describing the geographic locations and site attributes of natural springs throughout the commonwealth. This data coverage continues to evolve and contains only spring locations known to exist with a reasonable degree of certainty on the date of publication. The dataset does not replace site specific inventorying or receptor surveys but can be used as a starting point. VDEQ's initial geospatial dataset of approximately 325 springs was formed in 2008 by digitizing historical spring information sheets created by State Water Control Board geologists in the 1970s through early 1990s. Additional data has been consolidated from the EPA STORET database, the U.S. Geological Survey's Ground Water Site Inventory (GWSI) and Geographic Names Inventory System (GNIS), the Virginia Department of Health SDWIS database, the Virginia DEQ Virginia Water Use Data Set (VWUDS), the Commonwealth of Virginia Division of Water Resources and Power Bulletin No. 1: "Springs of Virginia" by Collins et al., 1930 as well as several VDWR&P Surface Water Supply bulletins from the 1940's - 1950's. A 1992 Virginia Department of Game and Inland Fisheries / Virginia Tech sponsored study by Helfrich et al. titled "Evaluation of the Natural Springs of Virginia: Fisheries Management Implications", a 2004 Rockbridge County groundwater resources report written by Frits van der Leeden, and several smaller datasets from consultants and citizens were evaluated and added to the database when confidence in locational accuracy was high or could be verified with aerial or LIDAR imagery. 
Significant contributions have been made throughout the years by VDEQ Groundwater Characterization staff site visits as well as other geologists working in the region including: Matt Heller at Virginia Division of Geology and Mineral Resources (VDMME), Wil Orndorff at the Virginia Department of Conservation and Recreation Karst Program (VDCR), and David Nelms and Dan Doctor of the U.S. Geological Survey (USGS). Substantial effort has been made to improve locational accuracy and remove duplication present between data sources. Hundreds of spring locations that were originally obtained using topographic maps or unknown methods were updated to sub-meter locational accuracy using post-processed differential GPS (PPGPS) and through the use of several generations of aerial imagery (2002-2017) obtained from Virginia's Geographic Information Network (VGIN) and 1-meter LIDAR, where available. Scores of new spring locations were also obtained by systematic quadrangle-by-quadrangle analysis in areas of the Shenandoah Valley where 1-meter LIDAR datasets were obtained from the U.S. Geological Survey. Future improvements to the dataset will result when statewide 1-meter LIDAR datasets become available and through continued field work by DEQ staff and other contributors working in the region. Please do not hesitate to contact the author to correct mistakes or to contribute to the database.

    The VDEQ Spring FIELD MEASUREMENTS database contains data describing field derived physio-chemical properties of spring discharges measured throughout the Commonwealth of Virginia. Field visits compiled in this dataset were performed from 1928 to 2019 by geologists with the State Water Control Board, the Virginia Division of Water and Power, the Virginia Department of Environmental Quality, and the U.S. Geological Survey with contributions from other sources as noted. Values of -9999 indicate that measurements were not performed for the referenced parameter. Please do not hesitate to contact the author to add data to the database or correct errors.
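
    Because -9999 is a placeholder rather than a measurement, it needs to be masked before any statistics are computed. The sketch below shows one way to do that in plain Python; the readings are made-up values, and the helper names are illustrative and not part of the database.

```python
NO_DATA = -9999  # sentinel named in the description: measurement not performed


def clean(values):
    """Replace the -9999 sentinel with None so it cannot skew statistics."""
    return [None if v == NO_DATA else v for v in values]


def mean_ignoring_missing(values):
    """Average only the readings that were actually measured."""
    present = [v for v in clean(values) if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical field readings for one parameter; the middle one was not measured.
readings = [12.5, -9999, 13.5]
print(clean(readings))                  # [12.5, None, 13.5]
print(mean_ignoring_missing(readings))  # 13.0
```

    Averaging the raw list instead would give a large negative number, which is why the sentinel should be removed first.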


    The VDEQ_Spring_WQ database is a geodatabase containing groundwater sample information collected from springs throughout Virginia. Sample specific information include: location and site information, measured field parameters, and lab verified quantifications of major ionic concentrations, trace element concentrations, nutrient concentrations, and radiological data. The VDEQ_Spring_WQ database is a subset of the VDEQ GWCHEM database which is a flat-file geodatabase containing groundwater sample information from groundwater wells and springs throughout Virginia. Sample information has been correlated via DEQ Well # and projected using coordinates in VDEQ_Spring_SITES database. The GWCHEM database is comprised of historic groundwater sample data originally archived in the United States Geological Survey (USGS) National Water Information System (NWIS) and the Environmental Protection Agency (EPA) Storage and Retrieval (STORET) data warehouse. Archived STORET data originated as groundwater sample data collected and uploaded by Virginia State Water Control Board Personnel. While groundwater sample data in the STORET data warehouse are static, new groundwater sample data are periodically uploaded to NWIS and spring laboratory WQ data reflect NWIS downloaded on 9/30/2019. Recent groundwater sample data collected by Virginia Department of Environmental Quality (DEQ) personnel as part of the Ambient Groundwater Sampling Program are entered into the database as lab results are made available by the Division of Consolidated Laboratory Services (DCLS). When possible, charge balances were calculated for samples with reported values for major ions including (at a minimum) calcium, magnesium, potassium, sodium, bicarbonate, chloride, and sulfate. Reported values for Nitrate as N, carbonate, and fluoride were included in the charge balance calculation when available. 
Field determined values for bicarbonate and carbonate were used in the charge balance calculation when available. For much of the legacy DEQ groundwater sample data, bicarbonate values were derived from lab reported values of alkalinity (as mg/CaCO3) under the assumption that there was no contribution by carbonate to the reported alkalinity value. Charge balance values are reported in the "Charge Balance" column of the GWCHEM geodatabase. The closer the charge balance value is to unity (1), the lower the assumed charge balance error.

    In order to preserve the numerical capabilities of the database, non-numeric lab qualifiers were given the following numeric identifiers:

    • - (minus sign) = less than the concentration specified to the right of the sign
    • -11110 = estimated
    • -22220 = presence verified but not quantified
    • -33330 = radchem non-detect, below sslc
    • -4440 = analyzed for but not detected
    • -55550 = greater than the concentration to the right of the zero
    • -66660 = sample held beyond normal holding time
    • -77770 = quality control failure. Data not valid.
    • -88880 = sample held beyond normal holding time. Sample analyzed for but not detected. Value stored is limit of detection for process in use.
    • -11120 = Value reported is less than the criteria of detection.
    • -9999 = no data (parameter not quantified)
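
    The description does not give the formula behind the "Charge Balance" column. As a rough illustration only, the sketch below assumes the value is the ratio of total cation charge to total anion charge in milliequivalents per litre, which equals 1 when the charges match. The equivalent weights are standard chemistry values (molar mass divided by ionic charge); the function name and the sample concentrations are hypothetical.

```python
# Equivalent weights in mg per milliequivalent (molar mass / ionic charge).
CATION_EQ_WEIGHT = {"calcium": 20.04, "magnesium": 12.15, "sodium": 22.99, "potassium": 39.10}
ANION_EQ_WEIGHT = {"bicarbonate": 61.02, "chloride": 35.45, "sulfate": 48.03}


def charge_balance_ratio(sample_mg_per_l):
    """Return cation meq/L divided by anion meq/L for one sample.

    ASSUMPTION: the database's actual formula is not stated in the source;
    this ratio is just one definition that yields 1 for a balanced sample.
    """
    cations = sum(sample_mg_per_l[ion] / w for ion, w in CATION_EQ_WEIGHT.items())
    anions = sum(sample_mg_per_l[ion] / w for ion, w in ANION_EQ_WEIGHT.items())
    return cations / anions


# Hypothetical sample in mg/L (not taken from the database).
sample = {
    "calcium": 40.08, "magnesium": 12.15, "sodium": 22.99, "potassium": 3.91,
    "bicarbonate": 183.06, "chloride": 35.45, "sulfate": 9.606,
}
ratio = charge_balance_ratio(sample)
print(round(ratio, 3))
```

    Optional ions mentioned in the description (nitrate as N, carbonate, fluoride) are left out here to keep the sketch short.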

    A more in-depth description and hydrogeologic analysis of the database can be found here
    An in-depth data fact sheet can be found here
  20. Water Supply Node

    • data-waikatolass.opendata.arcgis.com
    • hub.arcgis.com
    Updated Jun 4, 2021
    + more versions
    Cite
    Co-Lab Waikato Data Portal (2021). Water Supply Node [Dataset]. https://data-waikatolass.opendata.arcgis.com/datasets/water-supply-node/about
    Explore at:
    Dataset updated
    Jun 4, 2021
    Dataset authored and provided by
    Co-Lab Waikato Data Portal
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    Asset inventory data for a variety of structures and infrastructure relating to water systems or drainage in urban areas. The features in this dataset are measured by length and represent linear features such as pipe networks or open drains. The information is extracted from the asset inventory database on a daily basis. Items identified have been geolocated over a long period of time and through various methods, including information provided by 3rd parties. In general, asset locations are obtained from as-built diagrams and as such may not be validated in all circumstances. The asset inventory is frequently updated and modifications can be made to the asset data structure (asset hierarchy) without prior notification. Due to a wide range of source information, all asset locations should be verified through the Asset Information Officers and/or site visits. This is an incomplete dataset; other information is held and maintained independently.

    The primary purpose of this inventory is for asset valuations. The inventory is utilised in forward works and capital work planning. Information on Water Supply assets for service requests is displayed on 3 Waters map. The Water Supply network is an integral part of the land use and consents process; however, site visits should be done to validate the status, position and condition of assets.

    Waikato OneView does not make any representation or give any warranty as to the accuracy or exhaustiveness of the data released for public download. Locations and dimensions of assets depicted in the data may not be accurate due to circumstances not notified to Council. While you are free to crop, export and re-purpose the data, we ask that you attribute Waikato OneView and clearly state that your work is a derivative and not the authoritative data source.

Cite
AnthonyTherrien (2024). Website Traffic [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/website-traffic/discussion

Website Traffic

Website Traffic and User Engagement Metrics

Explore at:
zip(65228 bytes)Available download formats
Dataset updated
Aug 5, 2024
Authors
AnthonyTherrien
License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

Dataset Overview

This dataset provides detailed information on website traffic, including page views, session duration, bounce rate, traffic source, time spent on page, previous visits, and conversion rate.

Dataset Description

  • Page Views: The number of pages viewed during a session.
  • Session Duration: The total duration of the session in minutes.
  • Bounce Rate: The percentage of visitors who navigate away from the site after viewing only one page.
  • Traffic Source: The origin of the traffic (e.g., Organic, Social, Paid).
  • Time on Page: The amount of time spent on the specific page.
  • Previous Visits: The number of previous visits by the same visitor.
  • Conversion Rate: The percentage of visitors who completed a desired action (e.g., making a purchase).

Data Summary

  • Total Records: 2000
  • Total Features: 7

Key Features

  1. Page Views: This feature indicates the engagement level of the visitors by showing how many pages they visit during their session.
  2. Session Duration: This feature measures the length of time a visitor stays on the website, which can indicate the quality of the content.
  3. Bounce Rate: A critical metric for understanding user behavior. A high bounce rate may indicate that visitors are not finding what they are looking for.
  4. Traffic Source: Understanding where your traffic comes from can help in optimizing marketing strategies.
  5. Time on Page: This helps in analyzing which pages are retaining visitors' attention the most.
  6. Previous Visits: This can be used to analyze the loyalty of visitors and the effectiveness of retention strategies.
  7. Conversion Rate: The ultimate metric for measuring the effectiveness of the website in achieving its goals.

Usage

This dataset can be used for various analyses such as:

  • Identifying key drivers of engagement and conversion.
  • Analyzing the effectiveness of different traffic sources.
  • Understanding user behavior patterns and optimizing the website accordingly.
  • Improving marketing strategies based on traffic source performance.
  • Enhancing user experience by analyzing time spent on different pages.
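
As a minimal sketch of the traffic-source analysis listed above, the snippet below averages the conversion rate per traffic source using only the Python standard library. The keys "Traffic Source" and "Conversion Rate" follow the feature names in the description, but the actual column headers in the downloaded file are an assumption, and the three rows are invented for illustration.

```python
from collections import defaultdict

# Invented sample rows; in practice these would come from csv.DictReader
# over the downloaded file, whose exact headers are assumed here.
rows = [
    {"Traffic Source": "Organic", "Conversion Rate": 0.5},
    {"Traffic Source": "Organic", "Conversion Rate": 1.0},
    {"Traffic Source": "Paid", "Conversion Rate": 0.25},
]


def mean_conversion_by_source(records):
    """Group records by traffic source and average their conversion rates."""
    totals = defaultdict(lambda: [0.0, 0])  # source -> [running sum, count]
    for record in records:
        acc = totals[record["Traffic Source"]]
        acc[0] += float(record["Conversion Rate"])
        acc[1] += 1
    return {source: total / count for source, (total, count) in totals.items()}


result = mean_conversion_by_source(rows)
print(result)  # {'Organic': 0.75, 'Paid': 0.25}
```

The same grouping pattern applies to the other metrics, for example average session duration per source.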

Acknowledgments

This dataset was generated for educational purposes and is not from a real website. It serves as a tool for learning data analysis and machine learning techniques.
