37 datasets found

Website Traffic
kaggle.com
zip
Updated Aug 5, 2024
Cite
AnthonyTherrien (2024). Website Traffic [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/website-traffic/discussion
Explore at:
Available download formats: zip (65228 bytes)
Dataset updated
Aug 5, 2024
Authors
AnthonyTherrien
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Overview

This dataset provides detailed information on website traffic, including page views, session duration, bounce rate, traffic source, time spent on page, previous visits, and conversion rate.

Dataset Description

Page Views: The number of pages viewed during a session.

Session Duration: The total duration of the session in minutes.

Bounce Rate: The percentage of visitors who navigate away from the site after viewing only one page.

Traffic Source: The origin of the traffic (e.g., Organic, Social, Paid).

Time on Page: The amount of time spent on the specific page.

Previous Visits: The number of previous visits by the same visitor.

Conversion Rate: The percentage of visitors who completed a desired action (e.g., making a purchase).

Data Summary

Total Records: 2000

Total Features: 7

Key Features

Page Views: This feature indicates the engagement level of the visitors by showing how many pages they visit during their session.

Session Duration: This feature measures the length of time a visitor stays on the website, which can indicate the quality of the content.

Bounce Rate: A critical metric for understanding user behavior. A high bounce rate may indicate that visitors are not finding what they are looking for.

Traffic Source: Understanding where your traffic comes from can help in optimizing marketing strategies.

Time on Page: This helps in analyzing which pages are retaining visitors' attention the most.

Previous Visits: This can be used to analyze the loyalty of visitors and the effectiveness of retention strategies.

Conversion Rate: The ultimate metric for measuring the effectiveness of the website in achieving its goals.

Usage

This dataset can be used for various analyses such as:

Identifying key drivers of engagement and conversion.

Analyzing the effectiveness of different traffic sources.

Understanding user behavior patterns and optimizing the website accordingly.

Improving marketing strategies based on traffic source performance.

Enhancing user experience by analyzing time spent on different pages.
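
As a minimal sketch of the traffic-source analysis listed above, the following groups records by traffic source and averages one metric per group. The column names (`traffic_source`, `conversion_rate`, `bounce_rate`) and the sample values are assumptions made for illustration; the real CSV headers and value scales may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; the column names and values are invented for illustration.
rows = [
    {"traffic_source": "Organic", "bounce_rate": 0.20, "conversion_rate": 0.90},
    {"traffic_source": "Organic", "bounce_rate": 0.40, "conversion_rate": 1.00},
    {"traffic_source": "Social", "bounce_rate": 0.50, "conversion_rate": 0.70},
    {"traffic_source": "Paid", "bounce_rate": 0.30, "conversion_rate": 0.80},
]


def mean_by_source(records, metric):
    """Return the mean of `metric` for each traffic source."""
    groups = defaultdict(list)
    for record in records:
        groups[record["traffic_source"]].append(record[metric])
    return {source: mean(values) for source, values in groups.items()}


summary = mean_by_source(rows, "conversion_rate")

# Print sources from highest to lowest mean conversion rate.
for source, value in sorted(summary.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source}: {value:.2f}")
```

The same helper can be pointed at `bounce_rate` or any other numeric column to compare sources on a different metric.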

Acknowledgments

This dataset was generated for educational purposes and is not from a real website. It serves as a tool for learning data analysis and machine learning techniques.
Google Analytics Sample
kaggle.com
zip
Updated Sep 19, 2019
Cite
The citation is currently not available for this dataset.
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Sep 19, 2019
Dataset provided by
BigQuery: https://cloud.google.com/bigquery
Google: http://google.com/
Authors
Google BigQuery
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.

Content

The sample dataset contains Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store. It includes the following kinds of information:

Traffic source data: information about where website visitors originate. This includes data about organic traffic, paid search traffic, display traffic, etc.

Content data: information about the behavior of users on the site. This includes the URLs of pages that visitors look at, how they interact with content, etc.

Transactional data: information about the transactions that occur on the Google Merchandise Store website.

Fork this kernel to get started.

Acknowledgements

Data from: https://bigquery.cloud.google.com/table/bigquery-public-data:google_analytics_sample.ga_sessions_20170801

Banner Photo by Edho Pratama from Unsplash.

Inspiration

What is the total number of transactions generated per device browser in July 2017?

The real bounce rate is defined as the percentage of visits with a single pageview. What was the real bounce rate per traffic source?

What was the average number of product pageviews for users who made a purchase in July 2017?

What was the average number of product pageviews for users who did not make a purchase in July 2017?

What was the average total transactions per user that made a purchase in July 2017?

What is the average amount of money spent per session in July 2017?

What is the sequence of pages viewed?
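
The "real bounce rate" defined above (the percentage of visits with a single pageview) can be sketched in a few lines. This is an illustration on invented in-memory records, not a query against the BigQuery table; the `(traffic_source, pageviews)` tuple layout is an assumption.

```python
from collections import Counter

# Hypothetical session records: (traffic_source, pageviews_in_visit).
sessions = [
    ("organic", 1),
    ("organic", 3),
    ("organic", 1),
    ("cpc", 2),
    ("cpc", 1),
    ("referral", 5),
]


def real_bounce_rate(records):
    """Percentage of visits with exactly one pageview, per traffic source."""
    visits = Counter()
    bounces = Counter()
    for source, pageviews in records:
        visits[source] += 1
        if pageviews == 1:
            bounces[source] += 1
    return {source: 100.0 * bounces[source] / visits[source] for source in visits}


rates = real_bounce_rate(sessions)
for source, rate in sorted(rates.items()):
    print(f"{source}: {rate:.1f}%")
```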
Daily website visitors (time series regression)
kaggle.com
zip
Updated Aug 20, 2020
Cite
Bob Nau (2020). Daily website visitors (time series regression) [Dataset]. https://www.kaggle.com/bobnau/daily-website-visitors
Explore at:
Available download formats: zip (35736 bytes)
Dataset updated
Aug 20, 2020
Authors
Bob Nau
Description
Context

This file contains 5 years of daily time series data for several measures of traffic on a statistical forecasting teaching notes website whose alias is statforecasting.com. The variables have complex seasonality that is keyed to the day of the week and to the academic calendar. The patterns you see here are similar in principle to what you would see in other daily data with day-of-week and time-of-year effects. Some good exercises are to develop a 1-day-ahead forecasting model, a 7-day-ahead forecasting model, and an entire-next-week forecasting model (i.e., next 7 days) for unique visitors.
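
One possible starting point for these exercises is a seasonal-naive baseline, which forecasts each future day with the value observed on the same weekday one week earlier. This is a sketch of a common baseline, not the author's model, and the visitor counts below are invented.

```python
def seasonal_naive(history, horizon, season=7):
    """Forecast each future day with the value seen `season` days earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    forecast = []
    for step in range(1, horizon + 1):
        # Position, within the last observed season, of the day that shares
        # the target day's place in the weekly cycle.
        index = len(history) - season + (step - 1) % season
        forecast.append(history[index])
    return forecast


# Two invented weeks of daily unique-visitor counts.
visitors = [100, 120, 130, 125, 110, 60, 50,
            105, 125, 135, 128, 112, 62, 55]

one_day = seasonal_naive(visitors, 1)    # 1-day-ahead forecast
next_week = seasonal_naive(visitors, 7)  # entire next week
seventh_day = next_week[-1]              # 7-day-ahead forecast
print(one_day, next_week, seventh_day)
```

Any regression or time series model built on this data can then be judged by how much it improves on this baseline.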

Content

The variables are daily counts of page loads, unique visitors, first-time visitors, and returning visitors to an academic teaching notes website. There are 2167 rows of data spanning the date range from September 14, 2014, to August 19, 2020. A visit is defined as a stream of hits on one or more pages on the site on a given day by the same user, as identified by IP address. Multiple individuals with a shared IP address (e.g., in a computer lab) are considered as a single user, so real users may be undercounted to some extent. A visit is classified as "unique" if a hit from the same IP address has not come within the last 6 hours. Returning visitors are identified by cookies if those are accepted. All others are classified as first-time visitors, so the count of unique visitors is the sum of the counts of returning and first-time visitors by definition. The data was collected through a traffic monitoring service known as StatCounter.
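
The 6-hour rule for "unique" visits can be illustrated as follows. This sketch assumes the window is measured from the previous hit by the same IP address and that a gap of exactly 6 hours starts a new visit; StatCounter's actual implementation may differ, and the hits below are invented.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)


def count_unique_visits(hits):
    """Count hits not preceded by a hit from the same IP within WINDOW."""
    last_seen = {}
    unique = 0
    for ip, timestamp in sorted(hits, key=lambda hit: hit[1]):
        previous = last_seen.get(ip)
        if previous is None or timestamp - previous >= WINDOW:
            unique += 1
        last_seen[ip] = timestamp
    return unique


# Invented hits: (ip_address, timestamp).
hits = [
    ("10.0.0.1", datetime(2020, 8, 19, 8, 0)),
    ("10.0.0.1", datetime(2020, 8, 19, 9, 30)),  # 1.5 h later: same visit
    ("10.0.0.1", datetime(2020, 8, 19, 16, 0)),  # 6.5 h later: new visit
    ("10.0.0.2", datetime(2020, 8, 19, 10, 0)),  # first hit from this IP
]

unique_visits = count_unique_visits(hits)
print(unique_visits)
```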

Inspiration

This file and a number of other sample datasets can also be found on the website of RegressIt, a free Excel add-in for linear and logistic regression which I originally developed for use in the course whose website generated the traffic data given here. If you use Excel to some extent as well as Python or R, you might want to try it out on this dataset.
Walmart.com Daily Traffic Statistics 2025
redstagfulfillment.com
html
Updated May 19, 2025
Cite
Red Stag Fulfillment (2025). Walmart.com Daily Traffic Statistics 2025 [Dataset]. https://redstagfulfillment.com/how-many-daily-visits-does-walmart-receive/
Explore at:
Available download formats: html
Dataset updated
May 19, 2025
Dataset authored and provided by
Red Stag Fulfillment
Time period covered
2020 - 2025
Area covered
United States
Variables measured
Daily website visits, Session duration metrics, Traffic source breakdown, Geographic traffic patterns, Seasonal traffic variations, Mobile vs desktop traffic distribution
Description
Comprehensive dataset analyzing Walmart.com's daily website traffic, including 16.7 million daily visits, device distribution, geographic patterns, and competitive benchmarking data.

Recipe Site Traffic: Analysis & Prediction

kaggle.com

Updated Sep 21, 2025

Cite

Michael Matta (2025). Recipe Site Traffic: Analysis & Prediction [Dataset]. https://www.kaggle.com/datasets/michaelmatta0/recipe-site-traffic-analysis-and-prediction

Explore at:

Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Sep 21, 2025

Dataset provided by

Kaggle

Authors

Michael Matta

License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically

Description

This dataset originates from DataCamp. Many users have reposted copies of the CSV on Kaggle, but most of those uploads omit the original instructions, business context, and problem framing. In this upload, I’ve included that missing context in the About Dataset so the reader of my notebook or any other notebook can fully understand how the data was intended to be used and the intended problem framing.

Note: I have also uploaded a visualization of the workflow I personally took to tackle this problem, but it is not part of the dataset itself. Additionally, I created a PowerPoint presentation based on my work in the notebook, which you can download from here:
PPTX Presentation

Recipe Site Traffic

From: Head of Data Science
Received: Today
Subject: New project from the product team

Hey!

I have a new project for you from the product team. Should be an interesting challenge. You can see the background and request in the email below.

I would like you to perform the analysis and write a short report for me. I want to be able to review your code as well as read your thought process for each step. I also want you to prepare and deliver the presentation for the product team - you are ready for the challenge!

They want us to predict which recipes will be popular 80% of the time and minimize the chance of showing unpopular recipes. I don't think that is realistic in the time we have, but do your best and present whatever you find.

You can find more details about what I expect you to do here. And information on the data here.

I will be on vacation for the next couple of weeks, but I know you can do this without my support. If you need to make any decisions, include them in your work and I will review them when I am back.

Good Luck!

From: Product Manager - Recipe Discovery
To: Head of Data Science
Received: Yesterday
Subject: Can you help us predict popular recipes?

Hi,

We haven't met before but I am responsible for choosing which recipes to display on the homepage each day. I have heard about what the data science team is capable of and I was wondering if you can help me choose which recipes we should display on the home page?

At the moment, I choose my favorite recipe from a selection and display that on the home page. We have noticed that traffic to the rest of the website goes up by as much as 40% if I pick a popular recipe. But I don't know how to decide if a recipe will be popular. More traffic means more subscriptions so this is really important to the company.

Can your team:
- Predict which recipes will lead to high traffic?
- Correctly predict high traffic recipes 80% of the time?

We need to make a decision on this soon, so I need you to present your results to me by the end of the month. Whatever your results, what do you recommend we do next?

Look forward to seeing your presentation.
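
One way to make the 80% target measurable is to read it as precision: of the recipes predicted to be high traffic, the share that actually were. That reading is an assumption (the request could also be read as recall or accuracy), and the labels below are invented.

```python
def precision(y_true, y_pred):
    """Share of predicted-positive items that are truly positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if p and t)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    if true_pos + false_pos == 0:
        return 0.0  # no positive predictions were made
    return true_pos / (true_pos + false_pos)


# Invented labels: True means "high traffic".
actual = [True, True, False, True, False, True]
predicted = [True, True, True, True, False, False]

score = precision(actual, predicted)
meets_target = score >= 0.80
print(f"precision={score:.2f}, meets 80% target: {meets_target}")
```

Precision penalizes showing unpopular recipes, which matches the stated wish to minimize that risk; whichever metric is chosen should be stated explicitly in the report.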

About Tasty Bytes

Tasty Bytes was founded in 2020 in the midst of the Covid Pandemic. The world wanted inspiration so we decided to provide it. We started life as a search engine for recipes, helping people to find ways to use up the limited supplies they had at home.

Now, over two years on, we are a fully fledged business. For a monthly subscription we will put together a full meal plan to ensure you and your family are getting a healthy, balanced diet whatever your budget. Subscribe to our premium plan and we will also deliver the ingredients to your door.

Example Recipe

This is an example of how a recipe may appear on the website. We haven't included all of the steps, but you should get an idea of what visitors to the site see.

Tomato Soup

Servings: 4
Time to make: 2 hours
Category: Lunch/Snack
Cost per serving: $

Nutritional Information (per serving)
- Calories 123
- Carbohydrate 13g
- Sugar 1g
- Protein 4g

Ingredients:
- Tomatoes
- Onion
- Carrot
- Vegetable Stock

Method:
1. Cut the tomatoes into quarters….

Data Information

The product manager has tried to make this easier for us and provided data for each recipe, as well as whether there was high traffic when the recipe was featured on the home page.

As you will see, they haven't given us all of the information they have about each recipe.

You can find the data here.

I will let you decide how to process it, just make sure you include all your decisions in your report.

Don't forget to double check the data really does match what they say - it might not.

Column Name	Details
recipe	Numeric, unique identifier of recipe
calories	Numeric, number of calories
carbohydrate	Numeric, amount of carbohydrates in grams
sugar	Numeric, amount of sugar in grams
protein	Numeric, amount of prote...
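
The advice to double check that the data matches its description can be sketched as a small validation pass. Only the columns visible in the table above are checked, because the rest of the table is truncated here; the sample CSV content is invented and deliberately contains problems.

```python
import csv
import io

# Columns described above as numeric (the remaining columns are not shown).
NUMERIC_COLUMNS = ["calories", "carbohydrate", "sugar", "protein"]

# Invented sample data with a missing value, a duplicate id and a bad value.
sample_csv = """recipe,calories,carbohydrate,sugar,protein
1,123,13,1,4
2,,20.5,3,7
2,250,abc,2,9
"""


def validate(rows):
    """Return (line_number, column, problem) tuples for rows that break the spec."""
    problems = []
    seen_ids = set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        recipe_id = row["recipe"]
        if recipe_id in seen_ids:
            problems.append((line_no, "recipe", "duplicate id"))
        seen_ids.add(recipe_id)
        for column in NUMERIC_COLUMNS:
            value = row[column]
            if value == "":
                problems.append((line_no, column, "missing"))
                continue
            try:
                float(value)
            except ValueError:
                problems.append((line_no, column, "not numeric"))
    return problems


problems = validate(csv.DictReader(io.StringIO(sample_csv)))
for problem in problems:
    print(problem)
```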

Google Analytics Sample
console.cloud.google.com
Updated Jul 15, 2017
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:Obfuscated%20Google%20Analytics%20360%20data&hl=en_GB (2017). Google Analytics Sample [Dataset]. https://console.cloud.google.com/marketplace/product/obfuscated-ga360-data/obfuscated-ga360-data?hl=en_GB
Dataset provided by
Google: http://google.com/
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise, in BigQuery. It’s a great way to analyze business data and learn the benefits of using BigQuery to analyze Analytics 360 data.

The data is typical of what an ecommerce website would see and includes the following information:

Traffic source data: information about where website visitors originate, including data about organic traffic, paid search traffic, and display traffic.

Content data: information about the behavior of users on the site, such as URLs of pages that visitors look at, how they interact with content, etc.

Transactional data: information about the transactions on the Google Merchandise Store website.

Limitations: All users have view access to the dataset. This means you can query the dataset and generate reports but you cannot complete administrative tasks. Data for some fields is obfuscated such as fullVisitorId, or removed such as clientId, adWordsClickInfo and geoNetwork. “Not available in demo dataset” will be returned for STRING values and “null” will be returned for INTEGER values when querying the fields containing no data.

This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
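
Because removed STRING fields come back as the literal text "Not available in demo dataset", it can help to normalize that placeholder to a proper missing value before analysis. The sketch below works on an already-fetched row represented as a nested dictionary; the record shape and the `totals` sub-fields are assumptions for illustration.

```python
PLACEHOLDER = "Not available in demo dataset"


def clean_record(record):
    """Replace the demo-dataset placeholder with None, recursing into nested dicts."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, dict):
            cleaned[key] = clean_record(value)
        elif value == PLACEHOLDER:
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


# Invented example row; only the placeholder string comes from the description.
row = {
    "fullVisitorId": "0001",
    "clientId": PLACEHOLDER,
    "totals": {"visits": 1, "transactions": None},
}

cleaned = clean_record(row)
print(cleaned)
```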
Amazon Daily Traffic Statistics 2025
redstagfulfillment.com
html
Updated May 19, 2025
Cite
Red Stag Fulfillment (2025). Amazon Daily Traffic Statistics 2025 [Dataset]. https://redstagfulfillment.com/how-many-daily-visits-does-amazon-receive/
Explore at:
Available download formats: html
Dataset updated
May 19, 2025
Dataset authored and provided by
Red Stag Fulfillment
Time period covered
2019 - 2025
Area covered
Global
Variables measured
Daily website visits, Monthly traffic volume, Geographic distribution, Seasonal traffic patterns, Traffic sources breakdown, Mobile vs desktop traffic split
Description
Comprehensive dataset analyzing Amazon's daily website visits, traffic patterns, seasonal trends, and comparative analysis with other ecommerce platforms based on May 2025 data.
Traffic Exchange Analysis Dataset 2024
sparktraffic.com
Updated Jun 10, 2024
Cite
SparkTraffic (2024). Traffic Exchange Analysis Dataset 2024 [Dataset]. https://www.sparktraffic.com/blog/reason-not-to-use-traffic-exchanges
Dataset authored and provided by
SparkTraffic
Description
Research data on traffic exchange limitations including low-quality traffic characteristics, search engine penalty risks, and comparison with effective alternatives like SEO and content marketing strategies.
Ardgillan Demesne Traffic Data 2018-2023 FCC - Dataset - data.smartdublin.ie...
data.smartdublin.ie
Updated Nov 9, 2021
Cite
(2021). Ardgillan Demesne Traffic Data 2018-2023 FCC - Dataset - data.smartdublin.ie [Dataset]. https://data.smartdublin.ie/dataset/ardgillan-demesne-traffic-data-2018-2023-fcc2
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Ardgillan Demesne
Description
Data on traffic volume entering Ardgillan Demesne, 2018 to 2023. See the new data set for 2024 onward.

Ardgillan park is unique among Dublin’s regional parks for the magnificent views it enjoys of the coastline. A panorama, taking in Rockabill Lighthouse, Colt Church, Shenick and Lambay Islands may be seen, including Sliabh Foy, the highest of the Cooley Mountains, and of course the Mourne Mountains can be seen sweeping down to the sea.

The park area is the property of Fingal County Council and was opened to the public as a regional park in June 1985. Preliminary works were carried out prior to the opening in order to transform what had been an arable farm into a public park. Five miles of footpaths were provided throughout the demesne, some by opening old avenues, while others were newly constructed. They now provide a system of varied and interesting woodland walks and vantage points from which to enjoy breath-taking views of the sea, the coastline and surrounding countryside. A signposted cycle route through the park since June 2009 means that cyclists can share the miles of walking paths with pedestrians.

Attractions within the Demesne

Play Ground

Rose Gardens

Fair Trail

Pollinator Areas (Approx. 40 Acres on whole Demesne)

Cafe

Cycle Track

Walking Routes

See further details on web site www.ardgillancastle.ie/
Personal Ecommerce Website Ad cost & viewer count
kaggle.com
zip
Updated Apr 18, 2025
Cite
Micheal_Knight (2025). Personal Ecommerce Website Ad cost & viewer count [Dataset]. https://www.kaggle.com/datasets/michealknight/personal-ecommerce-website-ad-cost-and-viewer-count
Explore at:
Available download formats: zip (29323 bytes)
Dataset updated
Apr 18, 2025
Authors
Micheal_Knight
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
📊 Dataset Description: Daily Website Traffic and Engagement Metrics

This dataset contains daily web traffic and user engagement information for a live website, recorded over an extended period. It provides a comprehensive view of how user activity on the platform varies in response to marketing initiatives and temporal factors such as weekends and holidays.

The dataset is particularly suited for time series forecasting, seasonality analysis, and marketing effectiveness studies. It is valuable for both academic and practical applications in fields such as digital analytics, marketing strategy, and predictive modeling.

🧾 Use Case Scenarios:

Forecasting future page views using past behavior and external influencing factors

Evaluating the impact of advertising spend on web traffic and ROI

Detecting seasonality and weekly/cyclical patterns in user engagement

Developing time-aware models for resource planning (e.g., server load, content drops)

Training and benchmarking time series models such as ARIMA, SARIMA, RNN, LSTM, and GRU
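
As a first look at the advertising-spend use case above, one could check whether spend correlates more strongly with page views on the same day or on the following day. The sketch computes a Pearson correlation at a chosen lag; the series are invented (page views were constructed to follow the previous day's spend), and the dataset's real column names are not assumed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(spend, views, lag):
    """Correlate spend on day t with views on day t + lag."""
    if lag == 0:
        return pearson(spend, views)
    return pearson(spend[:-lag], views[lag:])


# Invented daily series: views on day t equal 10 * spend on day t-1 + 20.
ad_spend = [10, 20, 15, 30, 25, 40, 35]
page_views = [50, 120, 220, 170, 320, 270, 420]

same_day = lagged_correlation(ad_spend, page_views, 0)
next_day = lagged_correlation(ad_spend, page_views, 1)
print(f"lag 0: {same_day:.2f}, lag 1: {next_day:.2f}")
```

A correlation at some lag is only a descriptive hint; the forecasting models named in the list above would be the next step for anything predictive.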
RÉ Logs Dataset
zenodo.org
Updated Oct 2, 2025
Cite
Mac Aodhgáin Pádraig; Mac Aodhgáin Pádraig (2025). RÉ Logs Dataset [Dataset]. http://doi.org/10.5281/zenodo.17249231
Unique identifier
https://doi.org/10.5281/zenodo.17249231
Dataset updated
Oct 2, 2025
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Mac Aodhgáin Pádraig; Mac Aodhgáin Pádraig
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Collation of data from Radio Éireann log books, at RTÉ, Donnybrook, Dublin 4.

Dataset originally created 2016 UPDATE: Packaged on 02/10/2025

I. About this Data Set

This data set is the result of a close reading of Radio Teilifís Éireann log books relating to Seán Ó Riada, conducted by Patrick Egan (Pádraig Mac Aodhgáin).

Research was conducted between 2014 and 2018. It contains a combination of metadata from searches of the Boole Library catalogue and Seán Ó Riada Collection finding aid (or "descriptive list"), relating to music-related projects that involved Seán Ó Riada. The PhD project was published in 2020, entitled “Exploring ethnography and digital visualisation: a study of musical practice through the contextualisation of music related projects from the Seán Ó Riada Collection”, and a full listing of radio broadcasts is added to the dataset named "The Ó Riada Projects" at https://doi.org/10.5281/zenodo.15348617

You are invited to use and re-use this data with appropriate attribution.

The "RÉ Logs Dataset" dataset consists of 90 rows.

II. What’s included? This data set includes:

A search of log books of radio broadcasts to find all instances of shows that involved Seán Ó Riada.

III. How Was It Created? These data were created by daily visits to Radio Teilifís Éireann in Dublin, Ireland.

IV. Data Set Field Descriptions

Column headings have not been added to the dataset.

Column A - blank
Column B - type of broadcast
Column C - blank
Column D - date of broadcast
Column E - blank
Column F - blank
Column G - blank
Column H - blank
Column I - description of broadcast
Column J - blank
Column K - blank
Column L - length of broadcast
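
Since the file has no column headings, a reader may want to attach names by position. The sketch below assumes the last entry in the list refers to the twelfth column (L), that the file is comma-separated, and that the field names are free to choose; the sample row is invented.

```python
import csv
import io

# Spreadsheet letters of the non-blank columns and hypothetical field names.
FIELDS = {
    "B": "broadcast_type",
    "D": "broadcast_date",
    "I": "description",
    "L": "length",
}


def label_row(row):
    """Pick the non-blank columns out of a raw row and name them."""
    return {name: row[ord(letter) - ord("A")] for letter, name in FIELDS.items()}


# Invented sample row with twelve columns (A to L).
sample = ",Talk,,1961-03-05,,,,,Example description,,,30 min\n"

records = [label_row(row) for row in csv.reader(io.StringIO(sample))]
print(records[0])
```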

V. Rights statement The text in this data set was created by the researcher and can be used in many different ways under creative commons with attribution. All contributions to this PhD project are released into the public domain as they are created. Anyone is free to use and re-use this data set in any way they want, provided reference is given to the creator of this dataset.

VI. Creator and Contributor Information

Creator: Patrick Egan (Pádraig Mac Aodhgáin)

VII. Contact Information Please direct all questions and comments to Patrick Egan via his website at www.patrickegan.org. You can also get in touch with the Library via UCC website.
Michigan Public Policy Survey Restricted Use Datasets
datasearch.gesis.org
Updated Aug 27, 2016
Cite
Center for Local, State, and Urban Policy (2016). Michigan Public Policy Survey Restricted Use Datasets [Dataset]. http://doi.org/10.3886/E55175V2
Unique identifier
https://doi.org/10.3886/E55175V2
Dataset updated
Aug 27, 2016
Dataset provided by
da|ra (Registration agency for social science and economic data)
Authors
Center for Local, State, and Urban Policy
Area covered
Michigan
Description
The Michigan Public Policy Survey (MPPS) is a program of state-wide surveys of local government leaders in Michigan. The MPPS is designed to fill an important information gap in the policymaking process. While there are ongoing surveys of the business community and of the citizens of Michigan, before the MPPS there were no ongoing surveys of local government officials that were representative of all general purpose local governments in the state. Therefore, while we knew the policy priorities and views of the state's businesses and citizens, we knew very little about the views of the local officials who are so important to the economies and community life throughout Michigan. The MPPS was launched in 2009 by the Center for Local, State, and Urban Policy (CLOSUP) at the University of Michigan and is conducted in partnership with the Michigan Association of Counties, Michigan Municipal League, and Michigan Townships Association. The associations provide CLOSUP with contact information for the survey's respondents, and consult on survey topics. CLOSUP makes all decisions on survey design, data analysis, and reporting, and receives no funding support from the associations. The surveys investigate local officials' opinions and perspectives on a variety of important public policy issues and solicit factual information about their localities relevant to policymaking. Over time, the program has covered issues such as fiscal, budgetary and operational policy, fiscal health, public sector compensation, workforce development, local-state governmental relations, intergovernmental collaboration, economic development strategies and initiatives such as placemaking and economic gardening, the role of local government in environmental sustainability, energy topics such as hydraulic fracturing ("fracking") and wind power, trust in government, views on state policymaker performance, opinions on the impacts of the Federal Stimulus Program (ARRA), and more. 
The program will investigate many other issues relevant to local and state policy in the future. A searchable database of every question the MPPS has asked is available on CLOSUP's website. Results of MPPS surveys are currently available as reports, and via online data tables. The MPPS datasets are being released in two forms: public-use datasets and restricted-use datasets. Unlike the public-use datasets, the restricted-use datasets represent full MPPS survey waves, and include all of the survey questions from a wave. Restricted-use datasets also allow for multiple waves to be linked together for longitudinal analysis. The MPPS staff do still modify these restricted-use datasets to remove jurisdiction and respondent identifiers and to recode other variables in order to protect confidentiality. However, it is theoretically possible that a researcher might be able, in some rare cases, to use enough variables from a full dataset to identify a unique jurisdiction, so access to these datasets is restricted and approved on a case-by-case basis. CLOSUP encourages researchers interested in the MPPS to review the codebooks included in this data collection to see the full list of variables including those not found in the public-use datasets, and to explore the MPPS data using the public-use datasets. On 2016-08-20, the openICPSR web site was moved to new software. In the migration process, some projects were not published in the new system because the decisions made in the old site did not map easily to the new setup. This project is temporarily available as restricted data while ICPSR verifies that all files were migrated correctly.
Revenue Generated by Measure ULA
catalog.data.gov
data.lacity.org
Updated Nov 8, 2025
Cite
data.lacity.org (2025). Revenue Generated by Measure ULA [Dataset]. https://catalog.data.gov/dataset/revenue-generated-by-measure-ula
Dataset provided by
data.lacity.org
Description
Disclaimer: PLEASE READ THIS AGREEMENT CAREFULLY BEFORE USING THIS DATA SET. BY USING THIS DATA SET, YOU ARE CONSENTING TO BE OBLIGATED AND BECOME A PARTY TO THIS AGREEMENT. IF YOU DO NOT AGREE TO THE TERMS AND CONDITIONS BELOW YOU SHOULD NOT ACCESS OR USE THIS DATA SET. This data set is presented as a public service that provides Internet accessibility to information provided by the City of Los Angeles and to other City, State, and Federal information. Due to the dynamic nature of the information contained within this data set and the data set’s reliance on information from outside sources, the City of Los Angeles does not guarantee the accuracy or reliability of the information transmitted from this data set. This data set and all materials contained on it are distributed and transmitted on an “as is” and “as available” basis without any warranties of any kind, whether expressed or implied, including without limitation, warranties of title or implied warranties of merchantability or fitness for a particular purpose. The City of Los Angeles is not responsible for any special, indirect, incidental, punitive, or consequential damages that may arise from the use of, or the inability to use the data set and/or materials contained on the data set, or that result from mistakes, omissions, interruptions, deletion of files, errors, defects, delays in operation, or transmission, or any failure of performance, whether the material is provided by the City of Los Angeles or a third-party. The City of Los Angeles reserves the right to modify, update, or alter these Terms and Conditions of use at any time. Your continued use of this Site constitutes your agreement to comply with such modifications. The information provided on this data set, and its links to other related web sites, are provided as a courtesy to our web site visitors only, and are in no manner an endorsement, recommendation, or approval of any person, any product, or any service contained on any other web site. 
Description: Monthly revenue generated by conveyances of real property over $5 million, from when applicable transfer tax collection began on April 1, 2023 to present. Consistent with the ULA ordinance, the property sale value thresholds and their corresponding tax rates will be adjusted annually based on the Bureau of Labor Statistics Chained Consumer Price Index.
Top-1000 HHS Open Data Resources
catalog.data.gov
data.virginia.gov
Updated Jul 30, 2025
Cite
Office of Chief Data Officer (2025). Top-1000 HHS Open Data Resources [Dataset]. https://catalog.data.gov/dataset/top-1000-hhs-open-data-resources
Dataset provided by
Office of Chief Data Officer
Description
HHS responsibly shares “open by default” data with the public to democratize access to information, demystify the Department, and increase transparency through data sharing. HHS Open Data is non-sensitive data, meaning thousands of health and human services datasets are publicly available to fuel new business models, enable emerging technologies like AI, accelerate scientific discoveries, and inspire American innovation. This top-1000 HHS Open Data websites and resources page, dynamically generated from the Digital Analytics Program (DAP) provided by the U.S. General Services Administration (GSA), is driven by near-real-time user demand. GSA’s DAP helps federal agencies and the public see how visitors find, access, and use government websites, data, and services online. The below list filters DAP for only resources from HHS and includes all HHS Divisions. You may filter by individual HHS Divisions and columns.
The PanAf-FGBG Dataset - Datasets - data.bris
data.bris.ac.uk
Updated Jul 22, 2025
Cite
(2025). The PanAf-FGBG Dataset - Datasets - data.bris [Dataset]. https://data.bris.ac.uk/data/dataset/3g8dm9c6z4tfm2ht6c7l0t43ul
Dataset updated
Jul 22, 2025
Description
DESCRIPTION. The PanAf-FGBG dataset comprises behaviour-annotated video footage of wild chimpanzees from more than 350 camera locations across tropical Africa, collected by the Pan African Programme: The Cultured Chimpanzee. It includes paired foreground (with chimpanzees) and background (without chimpanzees) videos, allowing controlled analysis of background influence on behaviour recognition models. The dataset is split into overlapping and disjoint camera location views to support evaluation under both in-distribution and out-of-distribution conditions. Each entry is accompanied by metadata and multi-label annotations for 14 distinct behaviours, enabling robust model training and testing. This resource aims to enhance AI models for wildlife behaviour understanding and supports broader conservation efforts for endangered great ape species.

CITATION. When using this data please cite this dataset deposit and the associated paper where the dataset and baselines are explained in detail: "The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition" published in the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) available here: https://openaccess.thecvf.com/content/CVPR2025/papers/Brookes_The_PanAf-FGBG_Dataset_Understanding_the_Impact_of_Backgrounds_in_Wildlife_CVPR_2025_paper.pdf. For BIBTEX citation details please see the project website: https://obrookes.github.io/panaf-fgbg.github.io/

ACKNOWLEDGEMENTS. We thank the Pan African Programme: 'The Cultured Chimpanzee' team and its collaborators for allowing the use of their data for this paper. We thank Amelie Pettrich, Antonio Buzharevski, Eva Martinez Garcia, Ivana Kirchmair, Sebastian Schütte, Linda Gerlach and Fabina Haas. We also thank management and support staff across all sites; specifically Yasmin Moebius, Geoffrey Muhanguzi, Martha Robbins, Henk Eshuis, Sergio Marrocoli and John Hart. Thanks to the team at https://www.chimpandsee.org particularly Briana Harder, Anja Landsmann, Laura K. Lynn, Zuzana Macháčková, Heidi Pfund, Kristeena Sigler and Jane Widness. The work that allowed for the collection of the dataset was funded by the Max Planck Society, Max Planck Society Innovation Fund, and Heinz L. Krekeler. In this respect we would like to thank: Ministre des Eaux et Forêts, Ministère de l'Enseignement supérieur et de la Recherche scientifique in Côte d'Ivoire; Institut Congolais pour la Conservation de la Nature, Ministère de la Recherche Scientifique in Democratic Republic of Congo; Forestry Development Authority in Liberia; Direction Des Eaux Et Forêts, Chasses Et Conservation Des Sols in Senegal; Makerere University Biological Field Station, Uganda National Council for Science and Technology, Uganda Wildlife Authority, National Forestry Authority in Uganda; National Institute for Forestry Development and Protected Area Management, Ministry of Agriculture and Forests, Ministry of Fisheries and Environment in Equatorial Guinea. This work was supported by the UKRI CDT in Interactive AI (grant EP/S022937/1). This work was in part supported by the US National Science Foundation Awards No. 2118240 "HDR Institute: Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning" and Award No. 2330423 and Natural Sciences and Engineering Research Council of Canada under Award No. 585136 for the "AI and Biodiversity Change (ABC) Global Center".

WEBSITE. Further materials are available at the project website at: https://obrookes.github.io/panaf-fgbg.github.io/

Complete download (zip, 20.1 GiB)
PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music...
data-staging.niaid.nih.gov
data.niaid.nih.gov
Updated Mar 17, 2025
Cite
Long, Phillip; Novack, Zachary; McAuley, Julian; Berg-Kirkpatrick, Taylor (2025). PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_13763755
Dataset updated
Mar 17, 2025
Dataset provided by
University of California, San Diego
UCSD
Authors
Long, Phillip; Novack, Zachary; McAuley, Julian; Berg-Kirkpatrick, Taylor
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We introduce PDMX: a Public Domain MusicXML dataset for symbolic music processing. Refer to our paper for more information, and our GitHub repository for any code-related details. Please cite both our paper and our collaborators' paper if you use this dataset (see our GitHub for more information).

Upon further use of the PDMX dataset, we discovered a discrepancy between the public-facing copyright metadata on the MuseScore website and the internal copyright data of the MuseScore files themselves, which affected 31,221 (12.29% of) songs. We have decided to proceed with the former given its public visibility on MuseScore (i.e. this is what the MuseScore website presents its users with). We have noted files with conflicting internal licenses in the license_conflict column of PDMX. We recommend using the no_license_conflict subset of PDMX (which still includes 222,856 songs) moving forward.

Additionally, for each song in PDMX, we not only provide the MusicRender and metadata JSON files, but we also try to include the associated compressed MusicXML (MXL), sheet music (PDF), and MIDI (MID) files when available. Due to the corruption of 42 of the original MuseScore files, these songs lack those associated files (since they could not be converted to those formats) and only include the MusicRender and metadata JSON files. The all_valid subset of PDMX describes the songs where all associated files are valid.
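The subset selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `license_conflict` column name comes from the description, but the `path` and `all_valid` columns, the `True`/`False` text encoding, and the sample rows are hypothetical and not taken from PDMX itself.

```python
import csv
import io

# Hypothetical metadata rows for illustration; not taken from PDMX.
# Only the license_conflict column name appears in the description above.
SAMPLE = """path,license_conflict,all_valid
a.json,False,True
b.json,True,True
c.json,False,False
"""

def parse_bool(text):
    """Interpret 'True'/'False' text as a boolean (assumed encoding)."""
    return text.strip().lower() == "true"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Songs without a conflicting internal license.
no_conflict = [r["path"] for r in rows if not parse_bool(r["license_conflict"])]

# Songs that are conflict-free and also have all associated files valid.
both = [
    r["path"]
    for r in rows
    if not parse_bool(r["license_conflict"]) and parse_bool(r["all_valid"])
]

print(no_conflict)
print(both)
```

Check the actual column names and value encoding in the dataset's metadata file before adapting this filter.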
User Interaction , Ad Click Behavior on a Website
kaggle.com
zip
Updated Aug 10, 2024
Cite
Ahmed Youssef Elhewag (2024). User Interaction , Ad Click Behavior on a Website [Dataset]. https://www.kaggle.com/datasets/ahmedyoussefelhewag/user-interaction-ad-click-behavior-on-a-website
Available download formats: zip (40335 bytes)
Dataset updated
Aug 10, 2024
Authors
Ahmed Youssef Elhewag
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset represents a collection of records for users' visits to a website, where certain variables related to these users are studied to determine whether they clicked on a particular ad or not. Here’s a detailed description of the data:

Daily Time Spent on Site: The number of minutes the user spends on the website daily.

Age: The age of the user in years.

Area Income: The average annual income of the area where the user resides, measured in U.S. dollars.

Daily Internet Usage: The number of minutes the user spends on the internet daily.

Ad Topic Line: The headline or main topic of the ad that was shown to the user.

City: The city where the user resides.

Male: An indicator of the user's gender, where 1 represents male and 0 represents female.

Country: The country where the user resides.

Timestamp: The date and time when this record was logged.

Clicked on Ad: An indicator of whether the user clicked on the ad, where 1 means the user clicked on the ad, and 0 means they did not.

In summary, this data is used to analyze users' behavior on the website based on a set of demographic and usage factors, with a focus on whether they clicked on a particular ad or not.
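As a rough sketch of the kind of analysis this description suggests, the snippet below computes the share of users who clicked on the ad within each group of a chosen column. The column names follow the field list above; the sample rows are made up for illustration and are not values from the dataset, and `click_rate_by` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """Daily Time Spent on Site,Age,Area Income,Daily Internet Usage,Male,Clicked on Ad
68.95,35,61833.90,256.09,0,0
80.23,31,68441.85,193.77,1,0
42.10,48,39000.00,120.50,1,1
55.00,52,41000.00,130.00,0,1
"""

def click_rate_by(rows, key):
    """Return {group value: share of rows in that group with Clicked on Ad == 1}."""
    clicks = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        clicks[row[key]] += int(row["Clicked on Ad"])
    return {group: clicks[group] / totals[group] for group in totals}

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = click_rate_by(rows, "Male")
print(rates)
```

The same helper can group by any other categorical column, such as Country or City, once the real file is loaded in place of the sample string.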
Riga Data Science Club
kaggle.com
zip
Updated Mar 29, 2021
Cite
Dmitry Yemelyanov (2021). Riga Data Science Club [Dataset]. https://www.kaggle.com/datasets/dmitryyemelyanov/rigadsclub
Available download formats: zip (494849 bytes)
Dataset updated
Mar 29, 2021
Authors
Dmitry Yemelyanov
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Riga
Description
Context

Riga Data Science Club is a non-profit organisation for sharing ideas and experience and building machine learning projects together. A data science community should know its own data, so this is a dataset about ourselves: our website analytics, social media activity, Slack statistics and even meetup transcriptions!

Content

The dataset is split into several folders by context:
* linkedin - company page visitor, follower and post stats
* slack - messaging and member activity
* typeform - new member responses
* website - website visitors by country, language, device, operating system, screen resolution
* youtube - meetup transcriptions

Inspiration

Let's make Riga Data Science Club better! We expect this data to bring lots of insights on how to improve.

"Know your c̶u̶s̶t̶o̶m̶e̶r̶ member"
* Explore member interests by analysing sign-up survey (typeform) responses
* Explore messaging patterns in Slack to understand how members are retained and when they are lost

Social media intelligence
* Define LinkedIn posting strategy based on historical engagement data
* Define target user profile based on LinkedIn page attendance data

Website
* Define website localisation strategy based on data about visitor countries and languages
* Define website responsive design strategy based on data about visitor devices, operating systems and screen resolutions

Have some fun
* NLP analysis of meetup transcriptions: word frequencies, question answering, something else?
website_visit_webalizer
kaggle.com
zip
Updated Mar 24, 2024
Cite
Erin ÇOBAN (2024). website_visit_webalizer [Dataset]. https://www.kaggle.com/datasets/erinoban/website-visit-webalizer
Available download formats: zip (1082 bytes)
Dataset updated
Mar 24, 2024
Authors
Erin ÇOBAN
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset was obtained from real website visit data. It contains monthly visit information for the tr-metaverse.com website, which is hosted on Linux: 30 days of visit data, from the beginning of the month to the end of the month. It consists of the fields Day, Hit, Hit%, Files, Files%, Pages, Pages%, Visit, Visit%, Sites, Sites%, Kbytes and Kbytes%. Values with a % sign next to them are percentages.
Day: Day index number (which day of the month).
Hit: Overall number of accesses.
Hit%: Overall accesses, as a percentage.
Files: Number of accesses made as files.
Files%: File accesses, as a percentage.
Pages, Pages%
Visit: Number of unique visitors.
Visit%: Unique visitor rate.
Sites, Sites%
Kbytes: Amount of data downloaded.
Kbytes%: Data downloaded, as a percentage.
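The relationship between a count column and its % companion can be sketched as follows. This assumes that each percentage is the day's share of the monthly total for that column, which the description does not state explicitly; the daily counts below are hypothetical and `percent_shares` is an illustrative helper name.

```python
# Hypothetical daily counts (day of month -> hits); not taken from the dataset.
daily_hits = {1: 120, 2: 80, 3: 200}

def percent_shares(counts):
    """Return each day's share of the total, in percent, rounded to 2 places.

    Assumption: a '%' column holds the day's share of the monthly total.
    """
    total = sum(counts.values())
    return {day: round(100 * value / total, 2) for day, value in counts.items()}

hit_pct = percent_shares(daily_hits)
print(hit_pct)
```

If the recomputed shares do not match the Hit% column in the real file, the percentages are probably defined differently and the assumption above should be dropped.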
Surface Water Monitoring Sites
hub.arcgis.com
data-ndwr.hub.arcgis.com
Updated Nov 24, 2021
Cite
Nevada Division of Water Resources (2021). Surface Water Monitoring Sites [Dataset]. https://hub.arcgis.com/datasets/NDWR::surface-water-monitoring-sites/explore?showTable=true
Dataset updated
Nov 24, 2021
Dataset authored and provided by
Nevada Division of Water Resources
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered

Description
The WellNet database contains information related to sites for surface water measurements. These data are used by NDWR to assess the condition of the groundwater and surface water systems over time and are available to the public on NDWR’s website. Surface water measurement sites are chosen based on availability of dedicated measurement equipment, permit terms, and where additional flow information is required. This dataset is updated every day from a non-spatial SQL Server database using lat/long coordinates to display location. The feature class participates in a relationship class with a surface water measure table joined using the sitename field. This dataset contains both active and inactive sites. Measurement data is provided by reporting agencies and by regular site visits from NDWR staff. For website access, please see the Stream/Spring site at http://water.nv.gov/SpringAndStreamFlow.aspx.
