100+ datasets found

Website Traffic Dataset
gts.ai
json
Updated Aug 23, 2024
Cite
GTS (2024). Website Traffic Dataset [Dataset]. https://gts.ai/dataset-download/website-traffic-dataset/
Explore at:
Available download formats: json
Dataset updated
Aug 23, 2024
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Explore our detailed website traffic dataset featuring key metrics like page views, session duration, bounce rate, traffic source, and conversion rates.
Open Data Website Traffic
catalog.data.gov
data.lacity.org
+1 more
Updated Jun 21, 2025
Cite
data.lacity.org (2025). Open Data Website Traffic [Dataset]. https://catalog.data.gov/dataset/open-data-website-traffic
Explore at:
Dataset updated
Jun 21, 2025
Dataset provided by
data.lacity.org
Description
Daily utilization metrics for data.lacity.org and geohub.lacity.org. Updated monthly
Website Analytics
catalog.data.gov
data.brla.gov
+3 more
Updated Aug 11, 2025
Cite
data.brla.gov (2025). Website Analytics [Dataset]. https://catalog.data.gov/dataset/website-analytics-89ba5
Explore at:
Dataset updated
Aug 11, 2025
Dataset provided by
data.brla.gov
Description
Web traffic statistics for the several City-Parish websites, brla.gov, city.brla.gov, Red Stick Ready, GIS, Open Data etc. Information provided by Google Analytics.
Data from: Analysis of the Quantitative Impact of Social Networks General...
figshare.com
produccioncientifica.ucm.es
doc
Updated Oct 14, 2022
Cite
David Parra; Santiago Martínez Arias; Sergio Mena Muñoz (2022). Analysis of the Quantitative Impact of Social Networks General Data.doc [Dataset]. http://doi.org/10.6084/m9.figshare.21329421.v1
Explore at:
Available download formats: doc
Unique identifier
https://doi.org/10.6084/m9.figshare.21329421.v1
Dataset updated
Oct 14, 2022
Dataset provided by
Figshare (http://figshare.com/)
Authors
David Parra; Santiago Martínez Arias; Sergio Mena Muñoz
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
General data collected for the study "Analysis of the Quantitative Impact of Social Networks on Web Traffic of Cybermedia in the 27 Countries of the European Union". Four research questions are posed: What percentage of the total web traffic generated by cybermedia in the European Union comes from social networks? Is that percentage higher or lower than the share provided by direct traffic and by search engines via SEO positioning? Which social networks have the greatest impact? And is there any relationship between the weight of social networks in the web traffic of a cybermedia and factors such as the average duration of the user's visit, the number of page views, or the bounce rate, understood in its formal sense of not performing any interaction on the visited page beyond reading its content?
To answer these questions, we first selected the cybermedia with the highest web traffic in the 27 countries that are currently part of the European Union after the United Kingdom left on December 31, 2020. In each nation we selected five media using a combination of the global web traffic metrics provided by the tools Alexa (https://www.alexa.com/), which ceased to be operational on May 1, 2022, and SimilarWeb (https://www.similarweb.com/). We did not use local metrics by country, since the results obtained with these two tools were sufficiently significant and our objective is not to establish a ranking of cybermedia by nation but to examine the relevance of social networks in their web traffic. In all cases, cybermedia owned by a journalistic company were selected, ruling out those belonging to telecommunications portals or service providers. Some correspond to classic information companies (both newspapers and television), while others are digital natives; this distinction does not affect the nature of the proposed research.
We then examined the web traffic data of these cybermedia. The period from October 2021 through March 2022 was selected. We believe that this six-month stretch smooths out one-off variations in any single month, reinforcing the precision of the data obtained. To obtain this data, we used SimilarWeb, currently the most precise tool available for examining the web traffic of a portal, although it is limited to traffic coming from desktops and laptops and does not take into account traffic from mobile devices, which currently cannot be determined with the measurement tools on the market. It includes:

- Web traffic general data: average visit duration, pages per visit and bounce rate
- Web traffic origin by country
- Percentage of traffic generated from social media over total web traffic
- Distribution of web traffic generated from social networks
- Comparison of web traffic generated from social networks with direct and search procedures
Extended Wikipedia Web Traffic Daily Dataset (with Missing Values)
data.niaid.nih.gov
Updated Nov 28, 2022
+ more versions
Cite
Webb, Geoff (2022). Extended Wikipedia Web Traffic Daily Dataset (with Missing Values) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7370976
Explore at:
Dataset updated
Nov 28, 2022
Dataset provided by
Webb, Geoff
Hyndman, Rob
Montero-Manso, Pablo
Godahewa, Rakshitha
Bergmeir, Christoph
License
Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
License information was derived automatically
Description
This dataset contains 145063 time series representing the number of hits or web traffic for a set of Wikipedia pages from 2015-07-01 to 2022-06-30. This is an extended version of the dataset that was used in the Kaggle Wikipedia Web Traffic forecasting competition. For consistency, the same Wikipedia pages that were used in the competition have been used in this dataset as well. The colons (:) in article names have been replaced by dashes (-) to make the .tsf file readable using our data loaders.

The data were downloaded from the Wikimedia REST API. According to the conditions of the API, this dataset is licensed under CC-BY-SA 3.0 and GFDL licenses.
‘Popular Website Traffic Over Time ’ analyzed by Analyst-2
analyst-2.ai
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Popular Website Traffic Over Time ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-popular-website-traffic-over-time-62e4/62549059/?iid=003-357&v=presentation
Explore at:
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Popular Website Traffic Over Time ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/popular-website-traffice on 13 February 2022.

--- Dataset description provided by original source is as follows ---

About this dataset

Background

Have you ever been in a conversation where the question comes up: who uses Bing? This question comes up occasionally because people wonder if these sites have any views. For this research study, we are going to be exploring website traffic for many popular websites.

Methodology

The data collected originates from SimilarWeb.com.

Source

For the analysis and study, go to The Concept Center

This dataset was created by Chase Willden and contains around 0 samples along with 1/1/2017, Social Media, technical information and other features such as: - 12/1/2016 - 3/1/2017 - and more.

How to use this dataset

Analyze 11/1/2016 in relation to 2/1/2017

Study the influence of 4/1/2017 on 1/1/2017

Acknowledgements

If you use this dataset in your research, please credit Chase Willden

--- Original source retains full ownership of the source dataset ---

Network traffic datasets created by Single Flow Time Series Analysis

zenodo.org
explore.openaire.eu
+1 more

csv, pdf

Updated Jul 11, 2024

Cite

Josef Koumar; Karel Hynek; Tomáš Čejka (2024). Network traffic datasets created by Single Flow Time Series Analysis [Dataset]. http://doi.org/10.5281/zenodo.8035724

Explore at:

Available download formats: csv, pdf

Unique identifier

https://doi.org/10.5281/zenodo.8035724

Dataset updated

Jul 11, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Josef Koumar; Karel Hynek; Tomáš Čejka

License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

Network traffic datasets created by Single Flow Time Series Analysis

Datasets were created for the paper: Network Traffic Classification based on Single Flow Time Series Analysis -- Josef Koumar, Karel Hynek, Tomáš Čejka -- which was published at The 19th International Conference on Network and Service Management (CNSM) 2023. Please cite usage of our datasets as:

J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.

This Zenodo repository contains 23 datasets created from 15 well-known published datasets which are cited in the table below. Each dataset contains 69 features created by Time Series Analysis of Single Flow Time Series. The detailed description of features from datasets is in the file: feature_description.pdf

In the following table is a description of each dataset file:

File name	Detection problem	Citation of original raw dataset
botnet_binary.csv	Binary detection of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
botnet_multiclass.csv	Multi-class classification of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
cryptomining_design.csv	Binary detection of cryptomining; the design part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
cryptomining_evaluation.csv	Binary detection of cryptomining; the evaluation part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
dns_malware.csv	Binary detection of malware DNS	Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
doh_cic.csv	Binary detection of DoH	Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
doh_real_world.csv	Binary detection of DoH	Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
dos.csv	Binary detection of DoS	Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
edge_iiot_binary.csv	Binary detection of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
edge_iiot_multiclass.csv	Multi-class classification of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
https_brute_force.csv	Binary detection of HTTPS Brute Force	Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
ids_cic_binary.csv	Binary detection of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
ids_cic_multiclass.csv	Multi-class classification of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
ids_unsw_nb_15_binary.csv	Binary detection of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
ids_unsw_nb_15_multiclass.csv	Multi-class classification of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
iot_23.csv	Binary detection of IoT malware	Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
ton_iot_binary.csv	Binary detection of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
ton_iot_multiclass.csv	Multi-class classification of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
tor_binary.csv	Binary detection of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
tor_multiclass.csv	Multi-class classification of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
vpn_iscx_binary.csv	Binary detection of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_iscx_multiclass.csv	Multi-class classification of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_vnat_binary.csv	Binary detection of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
vpn_vnat_multiclass.csv	Multi-class classification of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022

Click Global Data | Web Traffic Data + Transaction Data | Consumer and B2B...
datarade.ai
.csv
Updated Mar 13, 2025
Cite
Consumer Edge (2025). Click Global Data | Web Traffic Data + Transaction Data | Consumer and B2B Shopper Insights | 59 Countries, 3-Day Lag, Daily Delivery [Dataset]. https://datarade.ai/data-products/click-global-data-web-traffic-data-transaction-data-con-consumer-edge
Explore at:
Available download formats: .csv
Dataset updated
Mar 13, 2025
Dataset authored and provided by
Consumer Edge
Area covered
Marshall Islands, Congo, Bermuda, Bosnia and Herzegovina, El Salvador, Sri Lanka, Nauru, Finland, Montserrat, South Africa
Description
Click Web Traffic Combined with Transaction Data: A New Dimension of Shopper Insights

Consumer Edge is a leader in alternative consumer data for public and private investors and corporate clients. Click enhances the unparalleled accuracy of CE Transact by allowing investors to delve deeper and browse further into global online web traffic for CE Transact companies and more. Leverage the unique fusion of web traffic and transaction datasets to understand the addressable market and understand spending behavior on consumer and B2B websites. See the impact of changes in marketing spend, search engine algorithms, and social media awareness on visits to a merchant’s website, and discover the extent to which product mix and pricing drive or hinder visits and dwell time. Plus, Click uncovers a more global view of traffic trends in geographies not covered by Transact. Doubleclick into better forecasting, with Click.

Consumer Edge’s Click is available in machine-readable file delivery and enables:
• Comprehensive Global Coverage: Insights across 620+ brands and 59 countries, including key markets in the US, Europe, Asia, and Latin America.
• Integrated Data Ecosystem: Click seamlessly maps web traffic data to CE entities and stock tickers, enabling a unified view across various business intelligence tools.
• Near Real-Time Insights: Daily data delivery with a 5-day lag ensures timely, actionable insights for agile decision-making.
• Enhanced Forecasting Capabilities: Combining web traffic indicators with transaction data helps identify patterns and predict revenue performance.

Use Case: Analyze Year Over Year Growth Rate by Region

Problem: A public investor wants to understand how a company’s year-over-year growth differs by region.

Solution: The firm leveraged Consumer Edge Click data to:
• Gain visibility into key metrics like views, bounce rate, visits, and addressable spend
• Analyze year-over-year growth rates for a time period
• Break out data by geographic region to see growth trends
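
As a rough illustration of the year-over-year comparison described in this use case, the following Python sketch computes a growth rate per region. The region names and visit counts are hypothetical placeholders, not Consumer Edge data, and the function name `yoy_growth` is an assumption made for this example.

```python
def yoy_growth(current, previous):
    """Return year-over-year growth as a fraction, e.g. 0.25 for +25%."""
    if previous == 0:
        raise ValueError("previous-year value must be non-zero")
    return (current - previous) / previous


# Hypothetical visit counts per region: (previous year, current year).
visits_by_region = {
    "North America": (1200, 1500),
    "Europe": (800, 760),
}

growth_by_region = {
    region: yoy_growth(current, previous)
    for region, (previous, current) in visits_by_region.items()
}

for region, growth in growth_by_region.items():
    print(f"{region}: {growth:+.1%}")
```

The same function could be applied to any of the listed metrics, for example spend or transactions, as long as both years cover the same period.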

Metrics Include:
• Spend
• Items
• Volume
• Transactions
• Price Per Volume

Inquire about a Click subscription to perform more complex, near real-time analyses on public tickers and private brands, as well as for industries beyond CPG, such as:
• Monitor web traffic as a leading indicator of stock performance and consumer demand
• Analyze customer interest and sentiment at the brand and sub-brand levels

Consumer Edge offers a variety of datasets covering the US, Europe (UK, Austria, France, Germany, Italy, Spain), and across the globe, with subscription options serving a wide range of business needs.

Consumer Edge is the Leader in Data-Driven Insights Focused on the Global Consumer
Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and...
ieee-dataport.org
Updated Oct 21, 2024
Cite
Mohamad Amar Irsyad Mohd Aminuddin (2024). Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and Mobile Webpages [Dataset]. https://ieee-dataport.org/documents/website-fingerprinting-dataset-browsing-network-traffic-desktop-and-mobile-webpages
Explore at:
Dataset updated
Oct 21, 2024
Authors
Mohamad Amar Irsyad Mohd Aminuddin
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This is a dataset of Tor cell files extracted from browsing simulations using Tor Browser. The simulations cover both desktop and mobile webpages. The data were collected using the WFP-Collector tool (https://github.com/irsyadpage/WFP-Collector). All the necessary configuration to perform the simulation is detailed in the tool repository. The webpage URLs were selected from the first 100 websites listed at https://dataforseo.com/free-seo-stats/top-1000-websites. Each webpage URL is visited 90 times in each of the desktop and mobile browsing modes.
Website traffic strategies by industry and size of enterprise
datasets.ai
www150.statcan.gc.ca
+3 more
21, 55, 8
Updated Aug 8, 2024
Cite
Statistics Canada | Statistique Canada (2024). Website traffic strategies by industry and size of enterprise [Dataset]. https://datasets.ai/datasets/a7882acc-a647-4fa6-a58b-6dae889de472
Explore at:
Available download formats: 8, 55, 21
Dataset updated
Aug 8, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Authors
Statistics Canada | Statistique Canada
Description
Digital technology and Internet use, website traffic strategies, by North American Industry Classification System (NAICS) and size of enterprise for Canada from 2012 to 2013.
Kaggle Wikipedia Web Traffic Weekly Dataset
data.niaid.nih.gov
zenodo.org
Updated Apr 2, 2021
Cite
Bergmeir, Christoph (2021). Kaggle Wikipedia Web Traffic Weekly Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3892976
Explore at:
Dataset updated
Apr 2, 2021
Dataset provided by
Webb, Geoff
Hyndman, Rob
Montero-Manso, Pablo
Godahewa, Rakshitha
Bergmeir, Christoph
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This is the aggregated version of the daily dataset used in the Kaggle Wikipedia Web Traffic forecasting competition. It contains 145063 time series representing the number of hits or web traffic for a set of Wikipedia pages from 2015-07-01 to 2017-09-05, aggregated to weekly frequency.

The original dataset contains missing values. They have been simply replaced by zeros before aggregation.
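
The zero-fill-then-aggregate step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: missing values are represented as `None`, weeks are taken as consecutive 7-day blocks starting at the first observation, and a trailing partial week is summed as-is. The function name `aggregate_weekly` and the sample values are hypothetical, and the dataset authors' actual procedure may differ.

```python
def aggregate_weekly(daily_hits):
    """Replace missing values (None) with 0, then sum consecutive 7-day blocks.

    A trailing block shorter than 7 days is summed as-is (an assumption).
    """
    filled = [0 if value is None else value for value in daily_hits]
    return [sum(filled[i:i + 7]) for i in range(0, len(filled), 7)]


# Hypothetical daily hit counts for one page, with gaps.
daily = [
    10, None, 12, 9, None, 11, 8,  # week 1
    7, 7, None, 7, 7, 7, 7,        # week 2
]
weekly = aggregate_weekly(daily)
print(weekly)
```

Note that replacing gaps with zeros lowers the weekly totals for affected weeks, which is worth keeping in mind when comparing this version with the daily dataset.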
Network Traffic Analysis: Data and Code
data.niaid.nih.gov
zenodo.org
Updated Jun 12, 2024
+ more versions
Cite
Honig, Joshua (2024). Network Traffic Analysis: Data and Code [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11479410
Explore at:
Dataset updated
Jun 12, 2024
Dataset provided by
Homan, Sophia
Soni, Shreena
Honig, Joshua
Moran, Madeline
Chan-Tin, Eric
Ferrell, Nathan
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Code:

Packet_Features_Generator.py & Features.py

To run this code:

pkt_features.py [-h] -i TXTFILE [-x X] [-y Y] [-z Z] [-ml] [-s S] -j

-h, --help    show this help message and exit
-i TXTFILE    input text file
-x X          Add first X number of total packets as features.
-y Y          Add first Y number of negative packets as features.
-z Z          Add first Z number of positive packets as features.
-ml           Output to text file all websites in the format of websiteNumber1,feature1,feature2,...
-s S          Generate samples using size s.
-j

Purpose:

Turns a text file containing lists of incoming and outgoing network packet sizes into separate website objects with associated features.

Uses Features.py to calculate the features.

startMachineLearning.sh & machineLearning.py

To run this code:

bash startMachineLearning.sh

This code then runs machineLearning.py in a tmux session with the necessary file paths and flags.

Options (to be edited within this file):

--evaluate-only to test 5 fold cross validation accuracy

--test-scaling-normalization to test 6 different combinations of scalers and normalizers

Note: once the best combination is determined, it should be added to the data_preprocessing function in machineLearning.py for future use

--grid-search to test the best grid search hyperparameters - note: the possible hyperparameters must be added to train_model under 'if not evaluateOnly:' - once best hyperparameters are determined, add them to train_model under 'if evaluateOnly:'

Purpose:

Using the .ml file generated by Packet_Features_Generator.py & Features.py, this program trains a RandomForest Classifier on the provided data and provides results using cross validation. These results include the best scaling and normalization options for each data set as well as the best grid search hyperparameters based on the provided ranges.

Data

Encrypted network traffic was collected on an isolated computer visiting different Wikipedia and New York Times articles, different Google search queries (collected in the form of their autocomplete results and their results page), and different actions taken on a Virtual Reality headset.

Data for this experiment was stored and analyzed in the form of one txt file per experiment, in which each line contains:

The first number is a classification number denoting which website, query, or VR action is taking place.

The remaining numbers in each line denote:

The size of a packet,

and the direction it is traveling.

negative numbers denote incoming packets

positive numbers denote outgoing packets
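
To make the line format above concrete, here is a minimal Python sketch that splits one line into its class label and its incoming and outgoing packet sizes. It assumes the numbers on a line are whitespace-separated integers, which the description does not state; the function name `parse_capture_line` and the sample line are hypothetical and are not taken from the repository's code.

```python
def parse_capture_line(line):
    """Split one line into (label, incoming_sizes, outgoing_sizes).

    Assumes whitespace-separated integers: the first is the class label,
    the rest are signed packet sizes (negative = incoming, positive = outgoing).
    """
    values = [int(token) for token in line.split()]
    label, packets = values[0], values[1:]
    incoming = [-size for size in packets if size < 0]
    outgoing = [size for size in packets if size > 0]
    return label, incoming, outgoing


# Hypothetical line: class 3, then four packets.
label, incoming, outgoing = parse_capture_line("3 -1500 120 -640 52")
print(label, incoming, outgoing)
```

If the real files use a different delimiter, such as commas, only the `line.split()` call would need to change.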

Figure 4 Data

This data uses specific lines from the Virtual Reality.txt file.

The action 'LongText Search' refers to a user searching for "Saint Basils Cathedral" with text in the Wander app.

The action 'ShortText Search' refers to a user searching for "Mexico" with text in the Wander app.

The .xlsx and .csv files are identical.

Each file includes (from right to left):

The original packet data,

each line of data organized from smallest to largest packet size in order to calculate the mean and standard deviation of each packet capture,

and the final Cumulative Distribution Function (CDF) calculation that generated the Figure 4 Graph.
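
The sort, mean, standard deviation, and CDF steps listed above can be sketched with the Python standard library. This is an illustrative empirical CDF, not the authors' spreadsheet calculation: the packet sizes are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population form is an assumption.

```python
import statistics


def empirical_cdf(packet_sizes):
    """Return (size, cumulative fraction) pairs, sorted from smallest to largest."""
    ordered = sorted(packet_sizes)
    n = len(ordered)
    return [(size, (rank + 1) / n) for rank, size in enumerate(ordered)]


# Hypothetical packet sizes from one capture.
sizes = [120, 52, 1500, 640]

mean_size = statistics.mean(sizes)
std_size = statistics.stdev(sizes)  # sample standard deviation (assumption)
cdf = empirical_cdf(sizes)

print(mean_size, round(std_size, 1))
for size, fraction in cdf:
    print(size, fraction)
```

Plotting the pairs in `cdf` with size on the x-axis and fraction on the y-axis would give a step curve of the kind a CDF figure shows.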
Data from: Web Traffic data set
kaggle.com
Updated Jul 31, 2020
Cite
Abhinaba Saha (2020). Web Traffic data set [Dataset]. https://www.kaggle.com/datasets/sahaabhi/web-traffic-data-set
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 31, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Abhinaba Saha
Description
Dataset

This dataset was created by Abhinaba Saha

Contents
A comprehensive dataset of website traffic
data.europa.eu
gimi9.com
zip
Updated Jul 12, 2024
Cite
Universitätsbibliothek der Technischen Universität München (2024). A comprehensive dataset of website traffic [Dataset]. https://data.europa.eu/data/datasets/https-open-bydata-de-api-hub-repo-datasets-https-mediatum-ub-tum-de-1700647-dataset?locale=cs
Explore at:
Available download formats: zip
Dataset updated
Jul 12, 2024
Dataset authored and provided by
Universitätsbibliothek der Technischen Universität München
License
http://dcat-ap.de/def/licenses/cc-by
Description
The dataset contains traffic collected for 96 websites located in
Internet Traffic Data Set
kaggle.com
Updated May 10, 2023
Cite
Asfand Yar (2023). Internet Traffic Data Set [Dataset]. http://doi.org/10.34740/kaggle/dsv/5658579
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34740/kaggle/dsv/5658579
Dataset updated
May 10, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Asfand Yar
Description
This data set contains internet traffic data captured by an Internet Service Provider (ISP) using Mikrotik SDN Controller and packet sniffer tools. The data set includes traffic from over 2000 customers who use Fibre to the Home (FTTH) and Gpon internet connections. The data was collected over a period of several months and contains all traffic in its original format with headers and packets.

The data set contains information on inbound and outbound traffic, including web browsing, email, file transfers, and more. The data set can be used for research in areas such as network security, traffic analysis, and machine learning.

**Data Collection Method:** The data was captured using Mikrotik SDN Controller and packet sniffer tools. These tools capture traffic data by monitoring network traffic in real-time. The data set contains all traffic data in its original format, including headers and packets.

**Data Set Content:** The data set is provided in a CSV format and includes the following fields:

Timestamp: The date and time the traffic was captured

Source IP Address: The IP address of the device that sent the traffic

Destination IP Address: The IP address of the device that received the traffic

Protocol: The network protocol used for the traffic (e.g. TCP, UDP)

Source Port: The port used by the source device for the traffic

Destination Port: The port used by the destination device for the traffic

Packet Size: The size of the packet in bytes

Payload: The payload data of the packet

The data set contains a large volume of traffic data from over 2000 customers. The data is organized by timestamp and includes all traffic data in its original format, including headers and packets. The data set contains both inbound and outbound traffic, and covers various types of internet traffic, including web browsing, email, file transfers, and more.

The protocol is one of the listed protocols:

ipsec-ah - IPsec AH protocol
ipsec-esp - IPsec ESP protocol
ddp - datagram delivery protocol
egp - exterior gateway protocol
ggp - gateway-gateway protocol
gre - general routing encapsulation
hmp - host monitoring protocol
idpr-cmtp - idpr control message transport
icmp - internet control message protocol
icmpv6 - internet control message protocol v6
igmp - internet group management protocol
ipencap - ip encapsulated in ip
ipip - ip encapsulation
encap - ip encapsulation
iso-tp4 - iso transport protocol class 4
ospf - open shortest path first
pup - parc universal packet protocol
pim - protocol independent multicast
rspf - radio shortest path first
rdp - reliable datagram protocol
st - st datagram mode
tcp - transmission control protocol
udp - user datagram protocol
vmtp - versatile message transport
vrrp - virtual router redundancy protocol
xns-idp - xerox xns idp
xtp - xpress transfer protocol
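
As a hedged sketch of how the CSV fields described above might be consumed, the following Python example totals packet bytes per protocol. The exact column headers in the real file are not given in the description, so the header names used here are assumptions, and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import Counter

# Invented sample rows; header names are assumed from the field descriptions.
SAMPLE = """Timestamp,Source IP Address,Destination IP Address,Protocol,Source Port,Destination Port,Packet Size
2023-05-01 10:00:00,10.0.0.1,93.184.216.34,tcp,51000,443,1500
2023-05-01 10:00:01,10.0.0.2,8.8.8.8,udp,53000,53,80
2023-05-01 10:00:02,10.0.0.1,93.184.216.34,tcp,51000,443,400
"""


def bytes_per_protocol(csv_file):
    """Sum the 'Packet Size' column for each value of the 'Protocol' column."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[row["Protocol"]] += int(row["Packet Size"])
    return totals


totals = bytes_per_protocol(io.StringIO(SAMPLE))
for protocol, total in totals.most_common():
    print(protocol, total)
```

For a file on disk, the same function could be called with an open file handle, for example `open(path, newline="")`, in place of the in-memory sample.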

MAC Protocol Examples

802.2 - 802.2 Frames (0x0004)
arp - Address Resolution Protocol (0x0806)
homeplug-av - HomePlug AV MME (0x88E1)
ip - Internet Protocol version 4 (0x0800)
ipv6 - Internet Protocol Version 6 (0x86DD)
ipx - Internetwork Packet Exchange (0x8137)
lldp - Link Layer Discovery Protocol (0x88CC)
loop-protect - Loop Protect Protocol (0x9003)
mpls-multicast - MPLS multicast (0x8848)
mpls-unicast - MPLS unicast (0x8847)
packing-compr - Encapsulated packets with compressed IP packing (0x9001)
packing-simple - Encapsulated packets with simple IP packing (0x9000)
pppoe - PPPoE Session Stage (0x8864)
pppoe-discovery - PPPoE Discovery Stage (0x8863)
rarp - Reverse Address Resolution Protocol (0x8035)
service-vlan - Provider Bridging (IEEE 802.1ad) & Shortest Path Bridging IEEE 802.1aq (0x88A8)
vlan - VLAN-tagged frame (IEEE 802.1Q) and Shortest Path Bridging IEEE 802.1aq with NNI compatibility (0x8100)

**Data Usage:** The data set can be used for research in areas such as network security, traffic analysis, and machine learning. Researchers can use the data to develop new algorithms for detecting and preventing cyber attacks, analyzing internet traffic patterns, and more.

**Data Availability:** If you are interested in using this data set for research purposes, please contact us at asfandyar250@gmail.com for more information and references. The data set is available for download on Kaggle and can be accessed by researchers who have obtained permission from the ISP.

We hope this data set will be useful for researchers in the field of network security and traffic analysis. If you have any questions or need further information, please do not hesitate to contact us. You can use Wireshark or other software to view the files.
Data from: Google Analytics & Twitter dataset from a movies, TV series and...
portalcientificovalencia.univeuropea.com
figshare.com
Updated 2024
Cite
Yeste, Víctor (2024). Google Analytics & Twitter dataset from a movies, TV series and videogames website [Dataset]. https://portalcientificovalencia.univeuropea.com/documentos/67321ed3aea56d4af0485dc8
Explore at:
Dataset updated
2024
Authors
Yeste, Víctor
Description
Author: Víctor Yeste. Universitat Politècnica de Valencia.

The object of this study is the design of a cybermetric methodology whose objectives are to measure the success of the content published in online media and, where possible, to predict the selected success variables.

In this case, because data from two separate areas had to be integrated (web publishing, and the analysis of its shares and related topics on Twitter), the data were collected programmatically through both the Google Analytics v4 reporting API and the Twitter Standard API, always respecting their limits.

The website analyzed is hellofriki.com. It is an online media outlet whose primary intention is to meet the need for information on topics that generate a vast amount of news every day, in the form of news items as well as analysis, reports, interviews, and many other information formats. All of this content falls under the sections of cinema, series, video games, literature, and comics.

This dataset has contributed to the elaboration of the PhD thesis:

Yeste Moreno, VM. (2021). Diseño de una metodología cibermétrica de cálculo del éxito para la optimización de contenidos web [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/176009

Data have been obtained from each last-minute news article published online according to the indicators described in the doctoral thesis. All related data are stored in a database, divided into the following tables:

tesis_followers: User ID list of media account followers.

tesis_hometimeline: data from tweets posted by the media account sharing breaking news from the web.
- status_id: Tweet ID
- created_at: date of publication
- text: content of the tweet
- path: URL extracted after processing the shortened URL in text
- post_shared: Article ID in WordPress that is being shared
- retweet_count: number of retweets
- favorite_count: number of favorites

tesis_hometimeline_other: data from tweets posted by the media account that do not share breaking news from the web (other typologies, automatic Facebook shares, custom tweets without a link to an article, etc.), with the same fields as tesis_hometimeline.

tesis_posts: data of articles published by the web and processed for some analysis.
- stats_id: Analysis ID
- post_id: Article ID in WordPress
- post_date: article publication date in WordPress
- post_title: title of the article
- path: URL of the article on the media website
- tags: Tags ID or WordPress tags related to the article
- uniquepageviews: unique page views
- entrancerate: entrance rate
- avgtimeonpage: average visit time
- exitrate: exit rate
- pageviewspersession: page views per session
- adsense_adunitsviewed: number of ads viewed by users
- adsense_viewableimpressionpercent: ad display ratio
- adsense_ctr: ad click ratio
- adsense_ecpm: estimated ad revenue per 1000 page views

tesis_stats: data from a particular analysis, performed at each published breaking news item. Fields with statistical values can be computed from the data in the other tables, but total and average calculations are saved for faster and easier further processing.
- id: ID of the analysis
- phase: phase of the thesis in which the analysis was carried out (currently all are 1)
- time: "0" if at the time of publication, "1" if 14 days later
- start_date: date and time of measurement on the day of publication
- end_date: date and time when the measurement is made 14 days later
- main_post_id: ID of the published article to be analysed
- main_post_theme: main section of the published article to be analysed
- superheroes_theme: "1" if about superheroes, "0" if not
- trailer_theme: "1" if trailer, "0" if not
- name: empty field, for adding a custom name manually
- notes: empty field, for adding custom notes manually, for example when a tag has been removed manually for being considered too generic even though the editor added it
- num_articles: number of articles analysed
- num_articles_with_traffic: number of articles analysed with traffic (which will be taken into account for traffic analysis)
- num_articles_with_tw_data: number of articles with data from when they were shared on the media’s Twitter account
- num_terms: number of terms analysed
- uniquepageviews_total: total page views
- uniquepageviews_mean: average page views
- entrancerate_mean: average entrance rate
- avgtimeonpage_mean: average duration of visits
- exitrate_mean: average exit rate
- pageviewspersession_mean: average page views per session
- total: total of ads viewed
- adsense_adunitsviewed_mean: average of ads viewed
- adsense_viewableimpressionpercent_mean: average ad display ratio
- adsense_ctr_mean: average ad click ratio
- adsense_ecpm_mean: estimated ad revenue per 1000 page views
- Total: total income
- retweet_count_mean: average income
- favorite_count_total: total of favorites
- favorite_count_mean: average of favorites
- terms_ini_num_tweets: total tweets on the terms on the day of publication
- terms_ini_retweet_count_total: total retweets on the terms on the day of publication
- terms_ini_retweet_count_mean: average retweets on the terms on the day of publication
- terms_ini_favorite_count_total: total of favorites on the terms on the day of publication
- terms_ini_favorite_count_mean: average of favorites on the terms on the day of publication
- terms_ini_followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the terms on the day of publication
- terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms on the day of publication
- terms_ini_user_num_tweets_mean: average number of tweets published by users who spoke about the terms on the day of publication
- terms_ini_user_age_mean: average age in days of users who have spoken of the terms on the day of publication
- terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms on the day of publication
- terms_end_num_tweets: total tweets on terms 14 days after publication
- terms_ini_retweet_count_total: total retweets on terms 14 days after publication
- terms_ini_retweet_count_mean: average retweets on terms 14 days after publication
- terms_ini_favorite_count_total: total bookmarks on terms 14 days after publication
- terms_ini_favorite_count_mean: average of favorites on terms 14 days after publication
- terms_ini_followers_talking_rate: ratio of media Twitter account followers who have recently posted a tweet talking about the terms 14 days after publication
- terms_ini_user_num_followers_mean: average followers of users who have spoken of the terms 14 days after publication
- terms_ini_user_num_tweets_mean: average number of tweets published by users who have spoken about the terms 14 days after publication
- terms_ini_user_age_mean: average age in days of users who have spoken of the terms 14 days after publication
- terms_ini_ur_inclusion_rate: URL inclusion ratio of tweets talking about terms 14 days after publication

tesis_terms: data of the terms (tags) related to the processed articles.
- stats_id: Analysis ID
- time: "0" if at the time of publication, "1" if 14 days later
- term_id: Term ID (tag) in WordPress
- name: name of the term
- slug: URL of the term
- num_tweets: number of tweets
- retweet_count_total: total retweets
- retweet_count_mean: average retweets
- favorite_count_total: total of favorites
- favorite_count_mean: average of favorites
- followers_talking_rate: ratio of followers of the media Twitter account who have recently published a tweet talking about the term
- user_num_followers_mean: average followers of users who were talking about the term
- user_num_tweets_mean: average number of tweets published by users who were talking about the term
- user_age_mean: average age in days of users who were talking about the term
- url_inclusion_rate: URL inclusion ratio
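The description notes that the statistical fields in tesis_stats can be recomputed from the other tables. A minimal sketch of that idea for the page-view columns is shown below. It assumes rows shaped like tesis_posts held in memory as dictionaries, and it assumes that an article "with traffic" is one with more than zero unique page views; neither assumption is confirmed by the dataset description, and the function name `summarize_pageviews` is hypothetical.

```python
from statistics import mean


def summarize_pageviews(posts):
    """Recompute totals and means in the style of the tesis_stats columns."""
    # Assumption: only articles with at least one unique page view count as traffic.
    with_traffic = [p for p in posts if p["uniquepageviews"] > 0]
    views = [p["uniquepageviews"] for p in with_traffic]
    return {
        "num_articles": len(posts),
        "num_articles_with_traffic": len(with_traffic),
        "uniquepageviews_total": sum(views),
        "uniquepageviews_mean": mean(views) if views else 0.0,
    }


# Made-up rows, for illustration only.
rows = [
    {"post_id": 1, "uniquepageviews": 120},
    {"post_id": 2, "uniquepageviews": 0},
    {"post_id": 3, "uniquepageviews": 80},
]
summary = summarize_pageviews(rows)
```

Storing such totals and means alongside the raw rows, as the dataset does, trades some redundancy for faster later processing.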
Kaggle Wikipedia Web Traffic Daily Dataset (with Missing Values)
zenodo.org
data.niaid.nih.gov
zip
Updated Apr 1, 2021
Cite
Rakshitha Godahewa; Christoph Bergmeir; Geoff Webb; Rob Hyndman; Pablo Montero-Manso (2021). Kaggle Wikipedia Web Traffic Daily Dataset (with Missing Values) [Dataset]. http://doi.org/10.5281/zenodo.4656080
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.4656080
Dataset updated
Apr 1, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Rakshitha Godahewa; Christoph Bergmeir; Geoff Webb; Rob Hyndman; Pablo Montero-Manso
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was used in the Kaggle Wikipedia Web Traffic forecasting competition. It contains 145063 daily time series representing the number of hits or web traffic for a set of Wikipedia pages from 2015-07-01 to 2017-09-10.
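To make the stated date range concrete, the sketch below builds the daily index implied by the two dates in the description and counts its length. It assumes both endpoints are included and that every series shares the same calendar; the helper name `daily_index` is hypothetical and not part of the dataset.

```python
from datetime import date, timedelta

# Date range quoted in the dataset description.
START = date(2015, 7, 1)
END = date(2017, 9, 10)


def daily_index(start: date, end: date) -> list:
    """Return every calendar day from start to end, both inclusive."""
    num_days = (end - start).days + 1
    return [start + timedelta(days=offset) for offset in range(num_days)]


days = daily_index(START, END)
print(len(days))  # 803 daily observations per series under these assumptions
```

With missing values present, a series may hold fewer than 803 recorded observations, so aligning each series to this index is one way to make the gaps explicit.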
Website Analytics
catalog.data.gov
data.nola.gov
+4more
Updated Jun 28, 2025
Cite
data.nola.gov (2025). Website Analytics [Dataset]. https://catalog.data.gov/dataset/website-analytics
Explore at:
Dataset updated
Jun 28, 2025
Dataset provided by
data.nola.gov
Description
This data about nola.gov provides a window into how people are interacting with the City of New Orleans online. The data comes from a unified Google Analytics account for New Orleans. We do not track individuals and we anonymize the IP addresses of all visitors.
Netflix
ieee-dataport.org
Updated Oct 1, 2021
Cite
Danil Shamsimukhametov (2021). Netflix [Dataset]. https://ieee-dataport.org/documents/youtube-netflix-web-dataset-encrypted-traffic-classification
Explore at:
Dataset updated
Oct 1, 2021
Authors
Danil Shamsimukhametov
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
YouTube flows
‘Website Analytics’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 13, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Website Analytics’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-website-analytics-e2f0/efa1110a/?iid=003-677&v=presentation
Explore at:
Dataset updated
Feb 13, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Website Analytics’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/ecee4df3-8149-4b74-8927-428ea920b758 on 13 February 2022.

--- Dataset description provided by original source is as follows ---

Web traffic statistics for the several City-Parish websites, brla.gov, city.brla.gov, Red Stick Ready, GIS, Open Data etc. Information provided by Google Analytics.

--- Original source retains full ownership of the source dataset ---
