As of the last quarter of 2023, 31.57 percent of web traffic in the United States originated from mobile devices, down from 49.51 percent in the fourth quarter of 2022. In comparison, over half of web traffic worldwide was generated via mobile in the last examined period.
Across popular online marketplace websites visited by users in Australia in February 2025, temu.com registered the highest growth in its website traffic compared to the previous year, with an annual growth of over 56 percent. In comparison, ebay.com.au saw a decrease in its website traffic compared to the previous year, with an annual decrease of around 11.9 percent.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General data collected for the study "Analysis of the Quantitative Impact of Social Networks on Web Traffic of Cybermedia in the 27 Countries of the European Union".
Four research questions are posed: what percentage of the total web traffic generated by cybermedia in the European Union comes from social networks? Is said percentage higher or lower than that provided through direct traffic and through the use of search engines via SEO positioning? Which social networks have a greater impact? And is there any degree of relationship between the specific weight of social networks in the web traffic of a cybermedia and circumstances such as the average duration of the user's visit, the number of page views or the bounce rate understood in its formal aspect of not performing any kind of interaction on the visited page beyond reading its content?
To answer these questions, we first selected the cybermedia with the highest web traffic in the 27 countries that are currently part of the European Union after the United Kingdom left on December 31, 2020. In each nation we selected five media using a combination of the global web traffic metrics provided by the tools Alexa (https://www.alexa.com/), which ceased to be operational on May 1, 2022, and SimilarWeb (https://www.similarweb.com/). We did not use local metrics by country, since the results obtained with these two tools were sufficiently significant and our objective is not to establish a ranking of cybermedia by nation but to examine the relevance of social networks in their web traffic.
In all cases, cybermedia whose property corresponds to a journalistic company have been selected, ruling out those belonging to telecommunications portals or service providers; in some cases they correspond to classic information companies (both newspapers and televisions) while in others they refer to digital natives, without this circumstance affecting the nature of the research proposed.
Next, we examined the web traffic data of these cybermedia. The period selected covers October, November and December 2021 and January, February and March 2022. We believe that this six-month stretch smooths out possible one-off variations in a single month, reinforcing the precision of the data obtained.
To secure this data, we have used the SimilarWeb tool, currently the most precise tool that exists when examining the web traffic of a portal, although it is limited to that coming from desktops and laptops, without taking into account those that come from mobile devices, currently impossible to determine with existing measurement tools on the market.
It includes:
* Web traffic general data: average visit duration, pages per visit and bounce rate
* Web traffic origin by country
* Percentage of traffic generated from social media over total web traffic
* Distribution of web traffic generated from social networks
* Comparison of web traffic generated from social networks with direct and search procedures
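The three general metrics listed above can be illustrated with a short sketch. It is not the procedure used by SimilarWeb or by the study's authors: the `Visit` record and the `summarize` function are hypothetical, the sample values are invented, and a bounce is approximated here as a visit with a single page view.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One visit to a site (hypothetical record layout)."""
    duration_seconds: float
    page_views: int


def summarize(visits):
    """Return average visit duration, pages per visit and bounce rate."""
    total = len(visits)
    avg_duration = sum(v.duration_seconds for v in visits) / total
    pages_per_visit = sum(v.page_views for v in visits) / total
    # Assumption: a visit counts as a bounce when only one page was viewed.
    bounces = sum(1 for v in visits if v.page_views == 1)
    return {
        "average_visit_duration": avg_duration,
        "pages_per_visit": pages_per_visit,
        "bounce_rate": bounces / total,
    }


# Invented sample data, for illustration only.
visits = [Visit(30, 1), Visit(120, 4), Visit(90, 1), Visit(240, 6)]
summary = summarize(visits)
print(summary)
```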
Mobile accounts for approximately half of web traffic worldwide. In the last quarter of 2024, mobile devices (excluding tablets) generated 62.54 percent of global website traffic. Mobiles and smartphones consistently hovered around the 50 percent mark from the beginning of 2017, before surpassing it in 2020.

Mobile traffic
Due to limited infrastructure and financial constraints, many emerging digital markets skipped the desktop internet phase entirely and moved straight onto mobile internet via smartphone and tablet devices. India is a prime example of a market with a significant mobile-first online population. Other countries with a significant share of mobile internet traffic include Nigeria, Ghana and Kenya. In most African markets, mobile accounts for more than half of the web traffic. By contrast, mobile only makes up around 45.49 percent of online traffic in the United States.

Mobile usage
The most popular mobile internet activities worldwide include watching movies or videos online, e-mail usage and accessing social media. Apps are a very popular way to watch video on the go, and the most-downloaded entertainment apps in the Apple App Store are Netflix, Tencent Video and Amazon Prime Video.
In June 2025, DoorDash's website, doordash.com, had just under 72 million visitors globally, recording a bounce rate of approximately 34.2 percent. For comparison, web traffic figures of UberEats show lower monthly visits.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Network traffic datasets created by Single Flow Time Series Analysis
Datasets were created for the paper: Network Traffic Classification based on Single Flow Time Series Analysis -- Josef Koumar, Karel Hynek, Tomáš Čejka -- which was published at The 19th International Conference on Network and Service Management (CNSM) 2023. Please cite usage of our datasets as:
J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.
This Zenodo repository contains 23 datasets created from 15 well-known published datasets which are cited in the table below. Each dataset contains 69 features created by Time Series Analysis of Single Flow Time Series. The detailed description of features from datasets is in the file: feature_description.pdf
The following table describes each dataset file:
File name | Detection problem | Citation of original raw dataset |
botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021. |
doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020 |
doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022 |
dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019. |
edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020 |
ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
ids_unsw_nb_15_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
ids_unsw_nb_15_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23 |
ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021 |
ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021 |
tor_binary.csv | Binary detection of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017. |
tor_multiclass.csv | Multi-class classification of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017. |
vpn_iscx_binary.csv | Binary detection of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016. |
vpn_iscx_multiclass.csv | Multi-class classification of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016. |
vpn_vnat_binary.csv | Binary detection of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022 |
vpn_vnat_multiclass.csv | Multi-class classification of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022 |
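As a rough illustration of how one of the files in the table might be read, the sketch below splits a CSV of per-flow features into feature rows and labels using only the Python standard library. The column name `label`, the helper `load_flow_dataset`, and the sample rows are assumptions made for this example; the actual column names are documented in feature_description.pdf.

```python
import csv
import io
from collections import Counter


def load_flow_dataset(source, label_column="label"):
    """Read a CSV of per-flow features and split it into features and labels.

    `label_column` is a hypothetical name; check feature_description.pdf
    for the real one before using this on the published files.
    """
    reader = csv.DictReader(source)
    features, labels = [], []
    for row in reader:
        labels.append(row.pop(label_column))
        features.append(row)
    return features, labels


# Tiny in-memory stand-in for a file such as botnet_binary.csv.
# Column names and values are invented for illustration.
sample = io.StringIO(
    "feature_a,feature_b,label\n"
    "0.12,3,benign\n"
    "0.98,17,botnet\n"
    "0.05,2,benign\n"
)
features, labels = load_flow_dataset(sample)
class_counts = Counter(labels)
print(class_counts)
```

To read a real file, an open file handle can be passed in place of the in-memory sample, for example `open("botnet_binary.csv", newline="")`.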
Árukereső was the most popular price comparison portal in Hungary in 2021, based on the traffic share measured by SimilarWeb. Árgép was the second most visited price comparison site over the same time period.
This statistic shows a comparison of webpage traffic sources of Slack and Salesforce in April 2019. According to data collected by GP Bullhound, ninety-six percent of Slack's webpage traffic during the measured period was direct, compared to Salesforce's more mixed traffic strategy.
In January 2025, Wolt's website, wolt.com, had just under 11 million visitors globally, recording a bounce rate of approximately 32 percent. Wolt was acquired by DoorDash in May 2022. For comparison, web traffic figures of DoorDash show nearly 72 million monthly visitors.
A. SUMMARY
The aircraft landing dataset contains data about aircraft landings at San Francisco International Airport (SFO), with monthly landing counts and landed weight by airline, region and aircraft model and type.

B. HOW THE DATASET IS CREATED
Data is self-reported by airlines and is only available at a monthly level.

C. UPDATE PROCESS
Data is available starting in July 1999 and will be updated monthly.

D. HOW TO USE THIS DATASET
Airport data is seasonal in nature; therefore, any comparative analyses should be done on a period-over-period basis (e.g. January 2010 vs. January 2009) as opposed to period-to-period (e.g. January 2010 vs. February 2010). It is also important to note that fact and attribute field relationships are not always 1-to-1. For example, Cargo Statistics belonging to United Airlines will appear in multiple attribute fields and are additive, which provides flexibility for the user to derive categorical Cargo Statistics as desired.

E. RELATED DATASETS
A summary of monthly comparative air-traffic statistics is also available on SFO’s internet site at https://www.flysfo.com/about/media/facts-statistics/air-traffic-statistics
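The period-over-period comparison recommended in section D can be sketched as follows. The function name, the `(year, month)` dictionary layout and the landing counts are hypothetical and chosen only for illustration; they are not real SFO figures or the dataset's actual schema.

```python
def period_over_period(monthly_counts, year, month):
    """Percent change for (year, month) versus the same month one year earlier."""
    current = monthly_counts[(year, month)]
    previous = monthly_counts[(year - 1, month)]
    return (current - previous) / previous * 100.0


# Invented landing counts keyed by (year, month); not real SFO data.
landings = {
    (2009, 1): 1000,
    (2010, 1): 1100,
    (2010, 2): 900,
}

# Compare January 2010 with January 2009 rather than with February 2010,
# so that seasonal effects do not distort the comparison.
change = period_over_period(landings, 2010, 1)
print(f"{change:.1f}%")
```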
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
What is a high quality website?
Over the years, the whole SEO industry has talked about the need to produce high quality content, and top experts came up with the clever quote ‘Content is king’, meaning that content is the success factor of any website. While this is true, does it mean that a website with good content is also a high quality website? The answer is no. Good content is not enough. It is one of the factors (the most important one) that separate low quality from high quality sites, but good content alone does not complete the puzzle of what Google considers a high quality website. Now you can get the high quality on high quality sites like Techtimes, Vanguardngr, Nytimes, Forbes etc. You can also buy a Techtimes guest post at a reasonable price from the best guest post service.

What is SEO
SEO is short for ‘Search Engine Optimization’. It refers to the process of increasing a website's traffic flow by optimizing several aspects of the website, such as on-page SEO, technical SEO and off-site SEO. Your SEO strategy should ideally be planned around your content strategy. For this you will require three elements to piece your content strategy together: 1.) keywords, 2.) links and 3.) substance. Guest posts on high quality sites can improve your SEO ranking. To improve and boost ranking, buy a guest post on Techtimes from the high quality guest post service.

Characteristics of a high quality website
A high quality website has the following characteristics:

Unique content
Content is unique both within the website itself (i.e. each page has unique content that is not similar to other pages) and compared to other websites.

Demonstrate Expertise
Content is produced by experts based on research and/or experience. If, for example, the subject is health related, then the advice should be provided by qualified authors who can professionally give advice on the particular subject.

Unbiased content
Content is detailed, describes both sides of a story and does not promote a single product, idea or service.

Accessibility
A high quality website has versions for non-PC users as well. It is important that mobile and tablet users can access the website without any usability issues.

Usability
Can the user navigate the website easily? Is the website user friendly?

Attention to detail
Content is easy to read, with images (if applicable), and free of spelling and grammar mistakes. Does it seem that the owner cares about what is published on the website, or is the content there only in order to run ads?

SEO Optimized
Optimizing a website for search engines has many benefits, but it is important not to overdo it. A good quality website needs to have non-optimized content as well. This is my opinion, and although some people may disagree, it is a fact that over-optimization can sometimes generate the opposite results. The reason is that algorithms can sometimes interpret over-optimization as an attempt to game the system, and they may take measures to prevent this from happening.

Balance between content and ads
It is not a bad thing for a website to have ads or promotions, but these should not distract users from finding the information they need.

Speed
A high quality website loads fast. A fast website will rank higher and create more conversions, sales and loyal readers.

Social
Social media changed our lives and the way we communicate, but also the way we assess quality. A good product is expected to have good reviews, Facebook likes and Tweets. Before you decide whether or not to buy, you may examine these social factors as well. Likewise, a good website is expected to be socially accepted and recognized, i.e. to have Facebook followers, RSS subscribers etc.

User Engagement and Interaction
Do users spend enough time on the site and read more than one page before they leave? Do they interact with the content by adding comments, making suggestions, getting into conversations etc.?

Better than the competition
When you take a specific keyword, is your website better than your competitors? Does it deserve one of the top positions if judged without bias?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is available on Brisbane City Council’s open data website – data.brisbane.qld.gov.au. The site provides additional features for viewing and interacting with the data and for downloading the data in various formats.
Traffic Volume for Key Brisbane Corridors. Includes traffic volumes, travel times and incidents.
This dataset will no longer be updated. Data is being published in a new format in a new dataset called Traffic Management — Key Corridor — Monthly Performance Report.
Information on Traffic Management is available on the Brisbane City Council website.
This dataset contains the following resources:
1. Traffic Volume for Key Brisbane Corridors.

Excel file containing:
* 6-Month Average Daily, AM & PM Peak Traffic Volume
* Network Daily Traffic Volume Comparison
* 6-Month Average AM & PM Peak Travel Time
* Network Travel Time Comparison
* Incident Data
* Note: volume day of the week and TT day of week was discontinued and is not included from Jul-Dec 2015

Excel file containing:
* 6-Month Average Daily, AM & PM Peak Traffic Volume
* Network Daily Traffic Volume Comparison
* 6-Month Average AM & PM Peak Travel Time
* Network Travel Time Comparison
* Incident Data
* Average daily traffic volume for each day of the week (veh/day)
* Travel time per kilometre by day of the week (mm:ss/km)
Important Note: This item is in mature support as of June 2023 and will be retired in December 2025.

This map shows traffic counts in the United States, collected through 2022, in a multiscale map. Traffic counts are widely used for site selection by real estate firms and franchises. Traffic counts are also used by departments of transportation for highway funding. This map is best viewed at large scales, where you can click on each point to access up to five different traffic counts over time. At medium to small scales, comparisons along major roads are possible. The Business Basemap has been added to provide context at medium and small scales. It shows the location of businesses in the United States and helps to understand where and why traffic counts are collected and used.

The pop-up is configured to display the following information:
* The most recent traffic count
* The street name where the count was collected
* The type of count that was taken. See the methodology document for definitions of count types such as AADT - Average Annual Daily Traffic. Traffic counts are seasonally adjusted to represent the average day of the year. AADT counts represent counts taken Sunday through Saturday.
* A graph displaying up to five traffic counts taken at the same location over time.

Permitted use of this data is covered in the DATA section of the Esri Master Agreement (E204CW) and these supplemental terms.
https://www.sci-tech-today.com/privacy-policy
Landing Page Statistics: Landing pages are dedicated web pages designed to convert visitors into leads or customers by focusing on a single, clear call to action. In 2024, the median landing page conversion rate across industries is 6.6%, with top-performing pages exceeding 20%. Email-driven traffic achieves the highest average conversion rate at 19.3%, outperforming paid search (10.9%) and paid social (12%).
Mobile devices account for 82.9% of landing page traffic, yet desktop users exhibit a higher average conversion rate of 12.1% compared to 11.2% for mobile users. Speed is crucial; a one-second delay in page load time can reduce conversions by 7%. Incorporating videos can boost conversions by 86%, and personalized landing pages can convert 202% better than generic ones.
Design elements significantly impact performance. Landing pages with five or fewer form fields convert 120% better than those with more fields. Pages with a single, clear call to action achieve a 13.5% conversion rate, compared to 11.9% for pages with multiple CTAs. Additionally, 38.6% of marketers report that videos enhance landing page conversion rates more than any other element.
Let us check out some of the Landing page statistics concerning landing page performance and the secrets of landing page success.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
What is a high quality website?
Over the years, the whole SEO industry has talked about the need to produce high quality content, and top experts came up with the clever quote ‘Content is king’, meaning that content is the success factor of any website. While this is true, does it mean that a website with good content is also a high quality website? The answer is no. Good content is not enough. It is one of the factors (the most important one) that separate low quality from high quality sites, but good content alone does not complete the puzzle of what Google considers a high quality website. Now you can get the high quality on high quality sites like Nytimes, Jpost, Huffpost and Forbes etc. You can also buy a Jpost guest post at a reasonable price from the best guest post service.

What is SEO
SEO is short for ‘Search Engine Optimization’. It refers to the process of increasing a website's traffic flow by optimizing several aspects of the website, such as on-page SEO, technical SEO and off-site SEO. Your SEO strategy should ideally be planned around your content strategy. For this you will require three elements to piece your content strategy together: 1.) keywords, 2.) links and 3.) substance. Guest posts on high quality sites can improve your SEO ranking. To improve and boost ranking, buy a guest post on Jpost from the high quality guest post service.

Characteristics of a high quality website
A high quality website has the following characteristics:

Unique content
Content is unique both within the website itself (i.e. each page has unique content that is not similar to other pages) and compared to other websites.

Demonstrate Expertise
Content is produced by experts based on research and/or experience. If, for example, the subject is health related, then the advice should be provided by qualified authors who can professionally give advice on the particular subject.

Unbiased content
Content is detailed, describes both sides of a story and does not promote a single product, idea or service.

Accessibility
A high quality website has versions for non-PC users as well. It is important that mobile and tablet users can access the website without any usability issues.

Usability
Can the user navigate the website easily? Is the website user friendly?

Attention to detail
Content is easy to read, with images (if applicable), and free of spelling and grammar mistakes. Does it seem that the owner cares about what is published on the website, or is the content there only in order to run ads?

SEO Optimized
Optimizing a website for search engines has many benefits, but it is important not to overdo it. A good quality website needs to have non-optimized content as well. This is my opinion, and although some people may disagree, it is a fact that over-optimization can sometimes generate the opposite results. The reason is that algorithms can sometimes interpret over-optimization as an attempt to game the system, and they may take measures to prevent this from happening.

Balance between content and ads
It is not a bad thing for a website to have ads or promotions, but these should not distract users from finding the information they need.

Speed
A high quality website loads fast. A fast website will rank higher and create more conversions, sales and loyal readers.

Social
Social media changed our lives and the way we communicate, but also the way we assess quality. A good product is expected to have good reviews, Facebook likes and Tweets. Before you decide whether or not to buy, you may examine these social factors as well. Likewise, a good website is expected to be socially accepted and recognized, i.e. to have Facebook followers, RSS subscribers etc.

User Engagement and Interaction
Do users spend enough time on the site and read more than one page before they leave? Do they interact with the content by adding comments, making suggestions, getting into conversations etc.?

Better than the competition
When you take a specific keyword, is your website better than your competitors? Does it deserve one of the top positions if judged without bias?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Valuation Office Website Traffic and Stats. Published by Tailte Éireann – Surveying. Available under the license Creative Commons Attribution 4.0 (CC-BY-4.0). This dataset provides the number of users to the Valuation Office website....
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Puff Bar, a disposable electronic nicotine delivery system (ENDS), was the ENDS brand most commonly used by U.S. youth in 2021. We explored whether Puff Bar’s rise in marketplace prominence was detectable through advertising, retail sales, social media, and web traffic data sources. We retrospectively documented potential signals of interest in and uptake of Puff Bar in the United States using metrics based on advertising (Numerator and Comperemedia), retail sales (NielsenIQ), social media (Twitter, via Sprinklr), and web traffic (Similarweb) data from January 2019 to June 2022. We selected metrics based on (1) data availability, (2) potential to graph metric longitudinally, and (3) variability in metric. We graphed metrics and assessed data patterns compared to data for Vuse, a comparator product, and in the context of regulatory events significant to Puff Bar. The number of Twitter posts that contained a Puff Bar term (social media), Puff Bar product sales measured in dollars (sales), and the number of visits to the Puff Bar website (web traffic) exhibited potential for surveilling Puff Bar due to ease of calculation, comprehensibility, and responsiveness to events. Advertising tracked through Numerator and Comperemedia did not appear to capture marketing from Puff Bar’s manufacturer or drive change in marketplace prominence. This study demonstrates how quantitative changes in metrics developed using advertising, retail sales, social media, and web traffic data sources detected changes in Puff Bar’s marketplace prominence. We conclude that low-effort, scalable, rapid signal detection capabilities can be an important part of a multi-component tobacco surveillance program.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Traffic Volume for Key Brisbane Corridors. Includes traffic volumes, travel times and incidents.

This dataset will no longer be updated. Data is being published in a new format in a new dataset called Traffic Management — Key Corridor — Monthly Performance Report.

Information on Traffic Management is available on the Brisbane City Council website.

This dataset contains the following resources:

Traffic Volume for Key Brisbane Corridors.
Excel file containing:
* 6-Month Average Daily, AM & PM Peak Traffic Volume
* Network Daily Traffic Volume Comparison
* 6-Month Average AM & PM Peak Travel Time
* Network Travel Time Comparison
* Incident Data
* Note: volume day of the week and TT day of week was discontinued and is not included from Jul-Dec 2015

Traffic Volume for Key Brisbane Corridors.
Excel file containing:
* 6-Month Average Daily, AM & PM Peak Traffic Volume
* Network Daily Traffic Volume Comparison
* 6-Month Average AM & PM Peak Travel Time
* Network Travel Time Comparison
* Incident Data
* Average daily traffic volume for each day of the week (veh/day)
* Travel time per kilometre by day of the week (mm:ss/km)
As of December 2024, around ** percent of the web traffic in Indonesia was accessed with mobile phones. By comparison, around ** percent of the web traffic in the country came from laptop and desktop computers that year.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DNS over HTTPS (DoH) is becoming a default option for domain resolution in modern privacy-aware software. Research has therefore already focused on various aspects of it; however, a comprehensive dataset from an actual production network is still missing. In this paper, we present a novel dataset, which comprises multiple PCAP files of DoH traffic. The captured traffic is generated towards various DoH providers to cover differences between DoH server implementations and configurations. In addition to generated traffic, we also provide real network traffic captured on high-speed backbone lines of a large Internet Service Provider with around half a million users. Network identifiers (excluding network identifiers of DoH resolvers) in the real network traffic (e.g., IP addresses and transmitted content) were anonymized, but the important characteristics of the traffic can still be obtained from the data, which can be used, e.g., for network traffic classification research. The real network traffic dataset contains DoH and also non-DoH HTTPS traffic as observed at the collection points in the network.
This repository provides supplementary files for the "Collection of Datasets with DNS over HTTPS Traffic":

supplementary_files - Directory with supplementary files (scripts, DoH resolver list) used for dataset creation
├── chrome - Generation scripts for Chrome browser and visited websites during generation
├── doh_resolvers - The list of DoH resolvers used for filter creation during ISP backbone capture
├── firefox - Generation scripts for Firefox browser and visited websites during generation
└── pcap-anonymizer - Anonymization script of real backbone captures
Collection of datasets:
DoH-Gen-F-AABBC --- https://doi.org/10.5281/zenodo.5957277
DoH-Gen-F-FGHOQS --- https://doi.org/10.5281/zenodo.5957121
DoH-Gen-F-CCDDD --- https://doi.org/10.5281/zenodo.5957420
DoH-Gen-C-AABBCC --- https://doi.org/10.5281/zenodo.5957465
DoH-Gen-C-DDD --- https://doi.org/10.5281/zenodo.5957676
DoH-Gen-C-CFGHOQS --- https://doi.org/10.5281/zenodo.5957659
DoH-Real-world --- https://doi.org/10.5281/zenodo.5956043