100+ datasets found

Network Traffic Analysis: Data and Code
data.niaid.nih.gov
data-staging.niaid.nih.gov
Updated Jun 12, 2024
Moran, Madeline; Honig, Joshua; Ferrell, Nathan; Soni, Shreena; Homan, Sophia; Chan-Tin, Eric (2024). Network Traffic Analysis: Data and Code [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11479410
Dataset updated
Jun 12, 2024
Dataset provided by
Loyola University Chicago
Authors
Moran, Madeline; Honig, Joshua; Ferrell, Nathan; Soni, Shreena; Homan, Sophia; Chan-Tin, Eric
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Code:

Packet_Features_Generator.py & Features.py

To run this code:

pkt_features.py [-h] -i TXTFILE [-x X] [-y Y] [-z Z] [-ml] [-s S] -j

-h, --help   show this help message and exit
-i TXTFILE   input text file
-x X         Add first X number of total packets as features.
-y Y         Add first Y number of negative packets as features.
-z Z         Add first Z number of positive packets as features.
-ml          Output to text file all websites in the format of websiteNumber1,feature1,feature2,...
-s S         Generate samples using size s.
-j

Purpose:

Turns a text file containing lists of incoming and outgoing network packet sizes into separate website objects with associated features.

Uses Features.py to calculate the features.
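As a rough illustration of the -x, -y, and -z options, the sketch below builds a feature list from the first X packets overall, the first Y negative packets, and the first Z positive packets, then formats a row in the websiteNumber,feature1,feature2,... layout. The function names and the zero-padding of short traces are assumptions made for this example; they are not taken from Packet_Features_Generator.py or Features.py, which may compute additional features.

```python
def first_n_features(packets, x=0, y=0, z=0):
    """Build a feature list from signed packet sizes.

    x: first x packets overall; y: first y negative; z: first z positive.
    """
    negatives = [p for p in packets if p < 0]
    positives = [p for p in packets if p > 0]

    def take(values, n):
        # Assumption: pad with zeros so every trace yields the same length.
        head = values[:n]
        return head + [0] * (n - len(head))

    return take(packets, x) + take(negatives, y) + take(positives, z)


def to_ml_row(label, features):
    """Format one row as websiteNumber,feature1,feature2,..."""
    return ",".join(str(v) for v in [label] + features)


if __name__ == "__main__":
    features = first_n_features([-1500, 60, -1500, 52], x=3, y=1, z=3)
    print(to_ml_row(3, features))
```

Fixed-length rows matter because most classifiers expect every sample to have the same number of features.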

startMachineLearning.sh & machineLearning.py

To run this code:

bash startMachineLearning.sh

This code then runs machineLearning.py in a tmux session with the necessary file paths and flags.

Options (to be edited within this file):

--evaluate-only to test 5-fold cross-validation accuracy

--test-scaling-normalization to test 6 different combinations of scalers and normalizers

Note: once the best combination is determined, it should be added to the data_preprocessing function in machineLearning.py for future use

--grid-search to find the best hyperparameters using grid search. Note: the candidate hyperparameters must be added to train_model under 'if not evaluateOnly:'. Once the best hyperparameters are determined, add them to train_model under 'if evaluateOnly:'.

Purpose:

Using the .ml file generated by Packet_Features_Generator.py & Features.py, this program trains a RandomForest classifier on the provided data and reports results using cross-validation. These results include the best scaling and normalization options for each data set as well as the best grid search hyperparameters based on the provided ranges.
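To make the 5-fold cross-validation step concrete, here is a dependency-free sketch of how accuracy can be averaged over five held-out folds. It deliberately substitutes a majority-class baseline for the RandomForest classifier so that it runs with the standard library alone; all function names are hypothetical and none of this is taken from machineLearning.py. In practice the samples would normally be shuffled or stratified before splitting, which this sketch omits.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def majority_class_predict(train_labels, n_test):
    """Stand-in model: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_test


def cross_val_accuracy(labels, k=5):
    """Mean accuracy of the stand-in model over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        train_labels = [labels[i] for i in train_idx]
        predictions = majority_class_predict(train_labels, len(test_idx))
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(cross_val_accuracy([0, 0, 0, 0, 0, 0, 0, 0, 1, 1]))
```

Each sample is used for testing exactly once, so the averaged score reflects performance on data the model did not see during training.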

Data

Encrypted network traffic was collected on an isolated computer visiting different Wikipedia and New York Times articles, different Google search queries (collected in the form of their autocomplete results and their results page), and different actions taken on a Virtual Reality headset.

Data for each experiment was stored and analyzed as a txt file in which each line contains:

The first number, a classification number that denotes which website, query, or VR action is taking place.

The remaining numbers, each of which denotes:

The size of a packet,

and the direction it is traveling.

Negative numbers denote incoming packets.

Positive numbers denote outgoing packets.
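The line format described above can be parsed along these lines. This is a minimal sketch: the function name is hypothetical, and the field separator (commas and/or whitespace) is an assumption, since the description does not state how the numbers are delimited.

```python
import re


def parse_trace(line):
    """Split one line into (label, packets).

    packets is a list of (size, direction) pairs, where direction is
    "in" for negative values and "out" for positive values.
    """
    # Assumption: fields are separated by commas and/or whitespace.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    label = int(fields[0])  # classification number
    packets = []
    for field in fields[1:]:
        value = int(field)
        direction = "in" if value < 0 else "out"
        packets.append((abs(value), direction))
    return label, packets


if __name__ == "__main__":
    label, packets = parse_trace("3 -1500 60 -1500 52")
    print(label, packets)
```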

Figure 4 Data

This data uses specific lines from the Virtual Reality.txt file.

The action 'LongText Search' refers to a user searching for "Saint Basils Cathedral" with text in the Wander app.

The action 'ShortText Search' refers to a user searching for "Mexico" with text in the Wander app.

The .xlsx and .csv files are identical.

Each file includes (from right to left):

The original packet data,

each line of data organized from smallest to largest packet size in order to calculate the mean and standard deviation of each packet capture,

and the final Cumulative Distribution Function (CDF) calculation that generated the Figure 4 graph.
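The description does not say whether the Figure 4 CDF is an empirical CDF or a normal-distribution CDF derived from the mean and standard deviation, so the sketch below shows both under that stated uncertainty. The function names are hypothetical and the calculation is not taken from the spreadsheet itself.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, stdev


def summarize(sizes):
    """Return (mean, sample standard deviation) of one packet capture."""
    return mean(sizes), stdev(sizes)


def empirical_cdf(sizes):
    """Pair each distinct size with the fraction of packets <= that size."""
    ordered = sorted(sizes)  # smallest to largest
    n = len(ordered)
    return [(v, bisect_right(ordered, v) / n) for v in sorted(set(ordered))]


def normal_cdf(sizes, x):
    """Alternative: CDF at x of a normal distribution fitted to the capture."""
    mu, sigma = summarize(sizes)
    return NormalDist(mu, sigma).cdf(x)


if __name__ == "__main__":
    capture = [60, 1500, 60, 52]
    print(summarize(capture))
    print(empirical_cdf(capture))
    print(normal_cdf(capture, 418))
```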
Impact of AI on website traffic anticipated by digital marketers worldwide...
statista.com
Statista, Impact of AI on website traffic anticipated by digital marketers worldwide 2023 [Dataset]. https://www.statista.com/statistics/1410386/impact-ai-website-traffic-worldwide/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Worldwide
Description
According to the results of a survey conducted worldwide in 2023, nearly **** of responding digital marketers believed artificial intelligence (AI) would have a positive impact on website search traffic in the next five years. Some ** percent stated AI would have a neutral effect, while ** percent agreed that the technology would negatively impact search traffic.
Website Visitor Tracking Software Report
marketresearchforecast.com
doc, pdf, ppt
Updated Mar 5, 2025
Market Research Forecast (2025). Website Visitor Tracking Software Report [Dataset]. https://www.marketresearchforecast.com/reports/website-visitor-tracking-software-27553
Available download formats: doc, pdf, ppt
Dataset updated
Mar 5, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
Discover the booming website visitor tracking software market! Our analysis reveals a $5 billion market in 2025, projected to reach $15 billion by 2033, driven by digital marketing, data-driven decisions, and AI-powered analytics. Learn about key players, market trends, and regional insights.
Leading K12 and test preparation platforms in India 2022, by website traffic...
statista.com
Statista, Leading K12 and test preparation platforms in India 2022, by website traffic [Dataset]. https://www.statista.com/statistics/1413860/india-k12-and-test-preparation-platforms-by-website-traffic/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jul 2022 - Sep 2022
Area covered
India
Description
Between July and September 2022, BYJU's emerged as the top Ed Tech platform for K12 and test preparation in India. It recorded approximately *** million website visits. Following closely behind was Toppr.com, with around *** million visits during the same period.
Total global visitor traffic to Google.com 2024
statista.com
Updated Aug 20, 2025
Statista (2025). Total global visitor traffic to Google.com 2024 [Dataset]. https://www.statista.com/statistics/268252/web-visitor-traffic-to-googlecom/
Dataset updated
Aug 20, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Oct 2023 - Mar 2024
Area covered
Worldwide
Description
In March 2024, search platform Google.com generated approximately 85.5 billion visits, down from 87 billion platform visits in October 2023. Google is a global search platform and one of the biggest online companies worldwide.
testing-library.com Traffic Analytics Data
analytics.explodingtopics.com
Updated Sep 1, 2025
(2025). testing-library.com Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/testing-library.com
Dataset updated
Sep 1, 2025
Variables measured
Global Rank, Monthly Visits, Authority Score, US Country Rank
Description
Traffic analytics, rankings, and competitive metrics for testing-library.com as of September 2025
test-velocidad.com Website Traffic, Ranking, Analytics [October 2025]
semrush.ebundletools.com
Updated Nov 11, 2025
Semrush (2025). test-velocidad.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/test-velocidad.com/overview/
Dataset updated
Nov 11, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://semrush.ebundletools.com/company/legal/terms-of-service/
Time period covered
Nov 11, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
test-velocidad.com is ranked #27469 in ES with 62.33K Traffic. Categories: Information Technology, Telecom. Learn more about website traffic, market share, and more!
Swash Web Browsing Clickstream Data - 1.5M Worldwide Users - GDPR Compliant
datarade.ai
.csv, .xls
Updated Jun 27, 2023
Swash (2023). Swash Web Browsing Clickstream Data - 1.5M Worldwide Users - GDPR Compliant [Dataset]. https://datarade.ai/data-products/swash-blockchain-bitcoin-and-web3-enthusiasts-swash
Available download formats: .csv, .xls
Dataset updated
Jun 27, 2023
Dataset authored and provided by
Swash
Area covered
Jordan, Saint Vincent and the Grenadines, Latvia, Belarus, Jamaica, India, Uzbekistan, Liechtenstein, Russian Federation, Monaco
Description
Unlock the Power of Behavioural Data with GDPR-Compliant Clickstream Insights.

Swash clickstream data offers a comprehensive and GDPR-compliant dataset sourced from users worldwide, encompassing both desktop and mobile browsing behaviour. Here's an in-depth look at what sets us apart and how our data can benefit your organisation.

User-Centric Approach: Unlike traditional data collection methods, we take a user-centric approach by rewarding users for the data they willingly provide. This unique methodology ensures transparent data collection practices, encourages user participation, and establishes trust between data providers and consumers.

Wide Coverage and Varied Categories: Our clickstream data covers diverse categories, including search, shopping, and URL visits. Whether you are interested in understanding user preferences in e-commerce, analysing search behaviour across different industries, or tracking website visits, our data provides a rich and multi-dimensional view of user activities.

GDPR Compliance and Privacy: We prioritise data privacy and strictly adhere to GDPR guidelines. Our data collection methods are fully compliant, ensuring the protection of user identities and personal information. You can confidently leverage our clickstream data without compromising privacy or facing regulatory challenges.

Market Intelligence and Consumer Behaviour: Gain deep insights into market intelligence and consumer behaviour using our clickstream data. Understand trends, preferences, and user behaviour patterns by analysing the comprehensive user-level, time-stamped raw or processed data feed. Uncover valuable information about user journeys, search funnels, and paths to purchase to enhance your marketing strategies and drive business growth.

High-Frequency Updates and Consistency: We provide high-frequency updates and consistent user participation, offering both historical data and ongoing daily delivery. This ensures you have access to up-to-date insights and a continuous data feed for comprehensive analysis. Our reliable and consistent data empowers you to make accurate and timely decisions.

Custom Reporting and Analysis: We understand that every organisation has unique requirements. That's why we offer customisable reporting options, allowing you to tailor the analysis and reporting of clickstream data to your specific needs. Whether you need detailed metrics, visualisations, or in-depth analytics, we provide the flexibility to meet your reporting requirements.

Data Quality and Credibility: We take data quality seriously. Our data sourcing practices are designed to ensure responsible and reliable data collection. We implement rigorous data cleaning, validation, and verification processes, guaranteeing the accuracy and reliability of our clickstream data. You can confidently rely on our data to drive your decision-making processes.
testing.com Website Traffic, Ranking, Analytics [October 2025]
semrush.ebundletools.com
Updated Nov 11, 2025
Semrush (2025). testing.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/testing.com/overview/
Dataset updated
Nov 11, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://semrush.ebundletools.com/company/legal/terms-of-service/
Time period covered
Nov 11, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
testing.com is ranked #23656 in US with 496.78K Traffic. Categories: Healthcare, Wellness. Learn more about website traffic, market share, and more!
e-verify.gov Traffic Analytics Data
analytics.explodingtopics.com
Updated Sep 1, 2025
(2025). e-verify.gov Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/e-verify.gov
Dataset updated
Sep 1, 2025
Variables measured
Global Rank, Monthly Visits, Authority Score, US Country Rank, Government Category Rank
Description
Traffic analytics, rankings, and competitive metrics for e-verify.gov as of September 2025
Leading websites worldwide 2025, by monthly visits
statista.com
boostndoto.org
Updated Oct 29, 2025
Statista (2025). Leading websites worldwide 2025, by monthly visits [Dataset]. https://www.statista.com/statistics/1201880/most-visited-websites-worldwide/
Dataset updated
Oct 29, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Aug 2025
Area covered
Worldwide
Description
In August 2025, Google.com was the most visited website worldwide, with an average of 98.2 billion monthly visits. The platform has maintained its leading position since June 2010, when it surpassed Yahoo to take first place. YouTube ranked second during the same period, recording over 48 billion monthly visits.

The internet leaders: search, social, and e-commerce

Social networks, search engines, and e-commerce websites shape the online experience as we know it. While Google leads the global online search market by far, YouTube and Facebook have become the world’s most popular websites for user generated content, solidifying Alphabet’s and Meta’s leadership over the online landscape. Meanwhile, websites such as Amazon and eBay generate millions in profits from the sale and distribution of goods, making the e-market sector an integral part of the global retail scene.

What is next for online content?

Powering social media and websites like Reddit and Wikipedia, user-generated content keeps moving the internet’s engines. However, the rise of generative artificial intelligence will bring significant changes to how online content is produced and handled. ChatGPT is already transforming how online search is performed, and news of Google's 2024 deal for licensing Reddit content to train large language models (LLMs) signals that the internet is likely to go through a new revolution. While AI's impact on the online market might bring both opportunities and challenges, effective content management will remain crucial for profitability on the web.
Google Search: The Most-visited Website in the World
scoop.market.us
Updated May 31, 2024
Market.us Scoop (2024). Google Search: The Most-visited Website in the World [Dataset]. https://scoop.market.us/google-search-the-most-visited-website-in-the-world/
Dataset updated
May 31, 2024
Dataset authored and provided by
Market.us Scoop
License
https://scoop.market.us/privacy-policy
Time period covered
2022 - 2032
Area covered
Global, World
Description
Google Search Statistics 2023

Google is the most searched website in the World.

Google receives more visitors than any other site. Google is accessed 89.3 trillion times per month.

Google is used by billions of people every day to conduct their searches. Google is much more than a simple search engine.

Google provides many other services. Google Shopping and Google News also feature. Google Mail, Google's popular email service, is included.

Google organic search traffic is 16.3% of the total US searches.
gpc-check.com Website Traffic, Ranking, Analytics [October 2025]
semrush.ebundletools.com
Updated Nov 12, 2025
Semrush (2025). gpc-check.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/gpc-check.com/overview/
Dataset updated
Nov 12, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://semrush.ebundletools.com/company/legal/terms-of-service/
Time period covered
Nov 12, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
gpc-check.com is ranked #5072 in JP with 608.47K Traffic. Categories: Online Services. Learn more about website traffic, market share, and more!
Share of global mobile website traffic 2015-2025
statista.com
Updated Nov 19, 2025
Statista (2025). Share of global mobile website traffic 2015-2025 [Dataset]. https://www.statista.com/statistics/277125/share-of-website-traffic-coming-from-mobile-devices/
Dataset updated
Nov 19, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In the second quarter of 2025, mobile devices (excluding tablets) accounted for 62.54 percent of global website traffic. Since consistently maintaining a share of around 50 percent beginning in 2017, mobile usage surpassed this threshold in 2020 and has demonstrated steady growth in its dominance of global web access.

Mobile traffic

Due to low infrastructure and financial restraints, many emerging digital markets skipped the desktop internet phase entirely and moved straight onto mobile internet via smartphone and tablet devices. India is a prime example of a market with a significant mobile-first online population. Other countries with a significant share of mobile internet traffic include Nigeria, Ghana and Kenya. In most African markets, mobile accounts for more than half of the web traffic. By contrast, mobile only makes up around 45.49 percent of online traffic in the United States.

Mobile usage

The most popular mobile internet activities worldwide include watching movies or videos online, e-mail usage and accessing social media. Apps are a very popular way to watch video on the go and the most-downloaded entertainment apps in the Apple App Store are Netflix, Tencent Video and Amazon Prime Video.
search.yahoo.com Website Traffic, Ranking, Analytics [October 2025]
sam1.toolsspider.com
Updated Nov 12, 2025
Semrush (2025). search.yahoo.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://sam1.toolsspider.com/website/search.yahoo.com/overview/
Dataset updated
Nov 12, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://sam1.toolsspider.com/company/legal/terms-of-service/
Time period covered
Nov 12, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
search.yahoo.com is ranked #6 in US with 1.64B Traffic. Categories: . Learn more about website traffic, market share, and more!
bigfive-test.com Website Traffic, Ranking, Analytics [October 2025]
semrush.ebundletools.com
Updated Nov 12, 2025
Semrush (2025). bigfive-test.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/bigfive-test.com/overview/
Dataset updated
Nov 12, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://semrush.ebundletools.com/company/legal/terms-of-service/
Time period covered
Nov 12, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
bigfive-test.com is ranked #53497 in US with 472.26K Traffic. Categories: Human Resources. Learn more about website traffic, market share, and more!
av-test.org Traffic Analytics Data
analytics.explodingtopics.com
Updated Sep 1, 2025
(2025). av-test.org Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/av-test.org
Dataset updated
Sep 1, 2025
Variables measured
Global Rank, Monthly Visits, Authority Score, US Country Rank
Description
Traffic analytics, rankings, and competitive metrics for av-test.org as of September 2025
search-owl.com Website Traffic, Ranking, Analytics [September 2025]
sem3.heaventechit.com
Updated Oct 12, 2025
Semrush (2025). search-owl.com Website Traffic, Ranking, Analytics [September 2025] [Dataset]. https://sem3.heaventechit.com/website/search-owl.com/overview/
Dataset updated
Oct 12, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://sem3.heaventechit.com/company/legal/terms-of-service/
Time period covered
Oct 12, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
search-owl.com is ranked #2259 in DE with 1.61M Traffic. Categories: . Learn more about website traffic, market share, and more!
test-ipv6.com Traffic Analytics Data
analytics.explodingtopics.com
Updated Sep 1, 2025
(2025). test-ipv6.com Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/test-ipv6.com
Dataset updated
Sep 1, 2025
Variables measured
Global Rank, Monthly Visits, Authority Score, US Country Rank
Description
Traffic analytics, rankings, and competitive metrics for test-ipv6.com as of September 2025
himera-search.app Website Traffic, Ranking, Analytics [October 2025]
sem3.heaventechit.com
Updated Nov 12, 2025
Semrush (2025). himera-search.app Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://sem3.heaventechit.com/website/himera-search.app/overview/
Dataset updated
Nov 12, 2025
Dataset authored and provided by
Semrush (https://fr.semrush.com/)
License
https://sem3.heaventechit.com/company/legal/terms-of-service/
Time period covered
Nov 12, 2025
Area covered
Worldwide
Variables measured
visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
Measurement technique
Semrush Traffic Analytics; Click-stream data
Description
himera-search.app is ranked #179714 in RU with 11.46K Traffic. Categories: . Learn more about website traffic, market share, and more!
