Google.com was the website with the most page views per day in Bolivia in February 2022, according to ranking by Alexa. The website had more than 18.49 daily page views and was followed by Unitel.bo, with 11 page views per day that month. Within Latin America, Mexico was the country where Amazon Alexa contained the largest number of skills.
Google.com, youtube.com, and facebook.com were the most visited websites in Ukraine in December 2021. Furthermore, Google's website on the Ukrainian domain, google.com.ua, ranked in the top 10 during that time.
This dataset is composed of the URLs of the top 1 million websites. The domains are ranked using the Alexa traffic ranking, which is determined from a combination of the browsing behavior of users on the website, the number of unique visitors, and the number of pageviews. In more detail, unique visitors are the number of unique users who visit a website on a given day, and pageviews are the total number of user URL requests for the website. However, multiple requests for the same website on the same day are counted as a single pageview. The website with the highest combination of unique visitors and pageviews is ranked the highest.
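The ranking inputs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the description does not say how unique visitors and pageviews are combined, so the geometric mean used here is an assumption, as is the reading that repeat requests are deduplicated per user and URL. The function name `rank_sites` and the sample log are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def rank_sites(requests):
    """Rank sites from one day of (user, site, url) request records.

    Unique visitors: distinct users per site.
    Pageviews: distinct (user, url) pairs per site, so repeated requests
    by the same user on the same day count once (assumed interpretation).
    """
    visitors = defaultdict(set)
    views = defaultdict(set)
    for user, site, url in requests:
        visitors[site].add(user)
        views[site].add((user, url))

    # ASSUMPTION: combine the two counts with a geometric mean.
    scores = {
        site: sqrt(len(visitors[site]) * len(views[site])) for site in visitors
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda site: (-scores[site], site))


# Hypothetical one-day request log, for illustration only.
sample_log = [
    ("u1", "a.example", "/"),
    ("u1", "a.example", "/"),   # repeat request, counted once
    ("u2", "a.example", "/x"),
    ("u1", "b.example", "/"),
]
ranking = rank_sites(sample_log)
```

Any other monotone combination of the two counts would fit the description equally well; the geometric mean is chosen here only because it penalizes sites that are strong on one metric and weak on the other.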
In 2019, the Chinese marketplace Alibaba was the leading B2B e-commerce platform worldwide in terms of online traffic. The Alexa tool for assessing the online traffic of websites put it at the top of the ranking, with a score of 177. The Russian Rosfirm and the U.S. platform Vinsuite followed in the ranking with scores of 1,047 and 1,137, respectively.
This dataset was created by DNS_dataset
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General data collected for the study "Analysis of the Quantitative Impact of Social Networks on Web Traffic of Cybermedia in the 27 Countries of the European Union".
Four research questions are posed: what percentage of the total web traffic generated by cybermedia in the European Union comes from social networks? Is said percentage higher or lower than that provided through direct traffic and through the use of search engines via SEO positioning? Which social networks have a greater impact? And is there any degree of relationship between the specific weight of social networks in the web traffic of a cybermedia and circumstances such as the average duration of the user's visit, the number of page views or the bounce rate understood in its formal aspect of not performing any kind of interaction on the visited page beyond reading its content?
To answer these questions, we have first proceeded to a selection of the cybermedia with the highest web traffic of the 27 countries that are currently part of the European Union after the United Kingdom left on December 31, 2020. In each nation we have selected five media using a combination of the global web traffic metrics provided by the tools Alexa (https://www.alexa.com/), which ceased to be operational on May 1, 2022, and SimilarWeb (https://www.similarweb.com/). We have not used local metrics by country since the results obtained with these first two tools were sufficiently significant and our objective is not to establish a ranking of cybermedia by nation but to examine the relevance of social networks in their web traffic.
In all cases, cybermedia whose property corresponds to a journalistic company have been selected, ruling out those belonging to telecommunications portals or service providers; in some cases they correspond to classic information companies (both newspapers and televisions) while in others they refer to digital natives, without this circumstance affecting the nature of the research proposed.
Next, we examined the web traffic data of these cybermedia. The period selected covers the months of October, November and December 2021 and January, February and March 2022. We believe that this six-month stretch smooths out possible one-off variations in any single month, reinforcing the precision of the data obtained.
To secure this data, we have used the SimilarWeb tool, currently the most precise tool that exists when examining the web traffic of a portal, although it is limited to that coming from desktops and laptops, without taking into account those that come from mobile devices, currently impossible to determine with existing measurement tools on the market.
It includes:
- Web traffic general data: average visit duration, pages per visit and bounce rate
- Web traffic origin by country
- Percentage of traffic generated from social media over total web traffic
- Distribution of web traffic generated from social networks
- Comparison of web traffic generated from social networks with direct and search procedures
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We are publishing a dataset we created for HTTPS traffic classification.
Since the data were captured mainly in a real backbone network, we omitted IP addresses and ports. The datasets consist of features calculated from bidirectional flows exported with the flow probe ipfixprobe. This exporter can export a sequence of packet lengths and times and a sequence of packet bursts and times. For more information, please visit the ipfixprobe repository.
During our research, we divided HTTPS traffic into the following categories: L -- Live Video Streaming, P -- Video Player, M -- Music Player, U -- File Upload, D -- File Download, W -- Website, and other traffic.
We have chosen the service representatives known for particular traffic types based on the Alexa Top 1M list and Moz's list of the most popular 500 websites for each category. We also used several popular websites that primarily focus on the audience in our country. The identified traffic classes and their representatives are provided below:
Taken from chapter 5 of Machine learning cookbook for cyber security
number in brackets states the number of described features
"..." - stands for multiple optional names that match the given pattern
..._ip, ..._port
(4): IP and port of client / server
packets_...
(3): Number of packets sent by client / server / both
ack_...
(3): Number of ACK packets sent by client / server / both
packets_A_B_ratio
(1): Ratio between packets sent by client and sent by server
asn_...
(2): Number of autonomous systems served as client, server
push_...
(3): Number of packets with PSH flag sent by client / server / both
bytes_...
(3): Number of bytes sent by client / server / both
reset_...
(3): Number of packets with RST flag sent by client / server / both
bytes_A_B_ratio
(1): Ratio between number of bytes sent and number of bytes received
ssl_count_certificates
(1): Number of SSL certificates
cap_date
(1): date of data capturing start
ssl_count_client_...
(6): Client: Number of supported SSL cipher algorithms / ciphersuites / compressions / elliptic curves / key exchange algorithms / MAC algorithms
country_...
(2): Number of countries systems served as client / server
ssl_count_server_...
(6): Server: Number of supported SSL cipher algorithms / ciphersuites / compressions / elliptic curves / key exchange algorithms / MAC algorithms
daysTime
(1): When during the day communication was established
ssl_dom_server_ciphersuite
(1): Number of SSL versions
dns_alexaRank
(1): DNS response server Alexa rank
ssl_dom_server_compression
(5): Dominated SSL ciphersuite / elliptic curve / server name / server rank / version
dns_count_addresses
(4): Number of addresses / answer / authoritative / additional fields in DNS response
ssl_handshake_duration_...
(10): SSL handshake duration: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
dns_count_canon_names
(1): Number of canonical names in DNS response
ssl_ratio_...
(7): Ratio between SSL sessions and: expired certificates / client cipher algorithms / ciphersuites / elliptic curves / client key exchange algorithms / client MAC algorithms / server names
dns_flag
(1): DNS response flags combinations
ssl_req_bytes_...
(10): Number of request bytes: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
dns_host_name
(1): DNS host name
ssl_resp_bytes_...
(10): Number of response bytes: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
dns_min_ttl
(1): DNS response minimal time-to-live
start
(1): session start (date-time)
dns_pre_bad_requests
(1): Number of preceding bad DNS responses
tcp_analysis_...
(6): TCP: Number of Keep-Alive packets / lost segments / packets received out of order / retransmitted packets / reused ports / duplicate ACKs
dns_time
(1): Time taken to receive DNS response
ttl_A_...
(10): TCP packet time-to-live sent by client: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
ds_field_...
(2): Differentiated Services (DS) field sent by client / server
ttl_...
(10): TCP packet time-to-live: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
duration
(1): Session duration
ttl_B_...
(10): TCP packet time-to-live sent by server: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
http_bytes_...
(10): Number of bytes sent by client over HTTP: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy.
urg_...
(3): Packets with URG flag sent by client / server / both
http_cookie_count
(1): Total number of cookie values
weekDay
(1): day of week (Sunday, Monday, …)
http_cookie_values_...
(10): Number of cookie values: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
domain / subdomain / suffix
(3): Dominated host's URL: domain / subdomain / suffix
http_count_...
(6): HTTP: Number of hosts / unique content types used in request / unique response codes / unique response content types / transactions / unique user agents
is_ad_http
(1): subdomain of HTTP dominated host includes ad-related keywords
http_dom_...
(8): Dominated HTTP: browser / browser version / host / host Alexa rank / operating system / operating system version / request content type / response code / response content type
is_cdn_http
(1): subdomain of HTTP dominated host includes CDN-related keywords
http_dom_is_bot
(1): Is most of HTTP connections created by known bot
is_cloud_http
(1): subdomain of HTTP dominated host includes cloud-related keywords
http_GET
(1): Number of HTTP requests submitted with GET method
is_...
(3): session protocol is DNS / HTTP / SSL
http_has_...
(4): Does HTTP request have a location / referrer / content type / user agent field
is_g_http
(1): subdomain of HTTP dominated host includes g-related keywords
http_has_resp_content_type
(1): Does HTTP response have a content type field
is_img_http
(1): subdomain of HTTP dominated host includes image-related keywords
http_inter_arrivel_...
(10): HTTP request-response inter arrival time: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
is_m_http
(1): subdomain of HTTP dominated host includes mobile-related keywords
http_POST
(1): Number of HTTP requests submitted with POST method
is_maker_site_http
(1): subdomain of HTTP dominated host is of a maker's site
http_req_bytes_...
(10): HTTP request bytes: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
is_media_http
(1): subdomain of HTTP dominated host includes media-related keywords
http_resp_bytes_...
(10): HTTP response bytes: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
is_numeric_url_http
(1): subdomain of HTTP dominated host is numeric
http_time_...
(10): Time taken by the HTTP server to return a response: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
is_numeric_url_with_port_http
(1): subdomain of HTTP dominated host is numeric plus port name
label
(1): malware label
is_tv_http
(1): HTTP dominated host has TV-related keywords
labelSS
(1): malware label
B_is_system_port
(1): destination port is in the range of [1, 1023]
packet_inter_arrivel_A_...
(10): Client packets inter-arrival time: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
B_is_user_port
(1): destination port is in the range of [1024, 49151]
packet_inter_arrivel_...
(10): Packets inter-arrival time: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
B_is_dynamic_and_or_private_port
(1): destination port is in the range of [49152, 65535]
packet_inter_arrivel_B_...
(10): Server packets inter-arrival time: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
B_port_is_...
(10): Destination port is one of recent top 10 most frequent: 80, 23, etc.
packet_size_A_...
(10): Client packets size: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
subdomain_is_...
(10): subdomain of HTTP dominated host is one of recent top 10 most frequent
packet_size_...
(10): Packets size: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
domain_is_...
(10): domain of HTTP dominated host is one of recent top 10 most frequent
packet_size_B_...
(10): Server packets size: Minimum value, quartile 1, average, median (quartile 2), sum, quartile 3, maximum value, standard deviation, variance, entropy
suffix_is_...
(4): suffix of HTTP dominated host is one of recent top 4 most frequent: com, net, etc.
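Many of the feature groups above expand one per-session series (packet sizes, TTLs, inter-arrival times, and so on) into the same ten statistics, and three features bucket the destination port into fixed ranges. The sketch below shows one way such values could be computed. It is not the code used to build the dataset: the function names `summarize` and `port_class` are hypothetical, and the choices of population variance, inclusive quartiles, and Shannon entropy over the empirical value distribution are assumptions, since the feature list does not define them.

```python
import math
import statistics
from collections import Counter


def summarize(values):
    """Return the ten statistics named in the feature list for one series."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    if len(data) >= 2:
        # ASSUMPTION: quartiles use the 'inclusive' interpolation method.
        q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    else:
        q1 = q3 = float(data[0])
    n = len(data)
    # ASSUMPTION: entropy is the Shannon entropy (bits) of the value frequencies.
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
    return {
        "min": data[0],
        "q1": q1,
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sum": sum(data),
        "q3": q3,
        "max": data[-1],
        "std": statistics.pstdev(data),    # ASSUMPTION: population std. dev.
        "var": statistics.pvariance(data), # ASSUMPTION: population variance
        "entropy": entropy,
    }


def port_class(port):
    """Map a destination port to the three ranges used by the B_is_* features."""
    if 1 <= port <= 1023:
        return "system"
    if 1024 <= port <= 49151:
        return "user"
    if 49152 <= port <= 65535:
        return "dynamic_or_private"
    return None  # port 0 or out of range: none of the three flags applies


stats = summarize([1, 2, 3, 4, 5])
```

If the original pipeline used sample variance or a different quartile convention, the numbers would differ slightly for short series, so these functions should be treated as a reading aid rather than a way to reproduce the published columns.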
As of 2022, google.com was the most visited website in Morocco, registering an average of 371 million visits per month. The second most visited webpage was youtube.com, with over 200 million monthly visits. Furthermore, the most popular web browser in the country was Chrome.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
With this voice skill, Alexa reports the status of a particular swimming pool or of all Viennese baths. The skill has already been created as a proof of concept and is available at the link below. Example: "Alexa, ask Viennese baths about the status of Ottakringerbad."
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are similarity matrices of countries based on different modalities of web use: Alexa website traffic, trending videos on YouTube, and Twitter trends. Each matrix is one month of aggregated data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We present a dataset targeting a large set of popular pages (Alexa top-500), from probes from several ISP networks, browser software (Chrome, Firefox) and viewport combinations, for over 200,000 experiments realized in 2019.

We purposely collect two distinct sets with two different tools, namely Web Page Test (WPT) and Web View (WV), varying a number of relevant parameters and conditions, for a total of 200K+ web sessions, roughly equally split among WV and WPT. Our dataset comprises variations in terms of geographical coverage, scale, diversity and representativeness (location, targets, protocol, browser, viewports, metrics).

For Web Page Test, we used the online service www.webpagetest.org at different locations worldwide (Europe, Asia, USA) and private WPT instances in three locations in China (Beijing, Shanghai, Dongguan). The list of target URLs comprised the main pages and five random subpages from Alexa top-500 worldwide and China. We varied network conditions: native connections and 4G, FIOS, 3GFast, DSL, and custom shaping/loss conditions. The other elements in the configuration were fixed: Chrome browser on desktop with a fixed screen resolution, HTTP/2 protocol and IPv4.

For Web View, we collected experiments from three machines located in France. We selected two versions of two browser families (Chrome 75/77, Firefox 63/68), two screen sizes (1920x1080, 1440x900), and employed different browser configurations (one half of the experiments activate the AdBlock plugin) from two different access technologies (fiber and ADSL). From a protocol standpoint, we used both IPv4 and IPv6, with HTTP/2 and QUIC, and performed repeated experiments with cached objects/DNS. Given the settings diversity, we restricted the number of websites to about 50 among the Alexa top-500 websites, to ensure statistical relevance of the collected samples for each page.

The two archives IFIPNetworking2020_WebViewOrange.zip and IFIPNetworking2020_Webpagetest.zip correspond respectively to the Web View experiments and to the Web Page Test experiments.

Each archive contains three files and a folder:
- config.csv: Description of parameters and conditions for each run,
- metrics.csv: Value of different metrics collected by the browser,
- progressionCurves.csv: Progression curves of the bytes progress as seen by the network, from 0 to 10 seconds by steps of 100 milliseconds,
- listUrl folder: Indexes the sets of urls.

Regarding config.csv, the columns are:
- index: Index for this set of conditions,
- location: Location of the machine,
- listUrl: List of urls, located in the folder listUrl
- browserUsed: Internet browser and version
- terminal: Desktop or Mobile
- collectionEnvironment: Identification of the collection environment
- networkConditionsTrafficShaping (WPT only): Whether native condition or traffic shaping (4G, FIOS, 3GFast, DSL, or custom Emulator conditions)
- networkConditionsBandwidth (WPT only): Bandwidth of the network
- networkConditionsDelay (WPT only): Delay in the network
- networkConditions (WV only): network conditions
- ipMode (WV only): requested L3 protocol,
- requestedProtocol (WV only): requested L7 protocol
- adBlocker (WV only): Whether adBlocker is used or not
- winSize (WV only): Window size

Regarding metrics.csv, the columns are:
- id: Unique identification of an experiment (consisting of an index 'set of conditions' and an index 'current page')
- DOM Content Loaded Event End (ms): DOM time,
- First Paint (ms) (WV only): First paint time,
- Load Event End (ms): Page Load Time from W3C,
- RUM Speed Index (ms) (WV only): RUM Speed Index,
- Speed Index (ms) (WPT only): Speed Index,
- Time for Full Visual Rendering (ms) (WV only): Time for Full Visual Rendering
- Visible portion (%) (WV only): Visible portion,
- Time to First Byte (ms) (WPT only): Time to First Byte,
- Visually Complete (ms) (WPT only): Visually Complete used to compute the Speed Index,
- aatf: aatf using ATF-chrome-plugin
- bi_aatf: bi_aatf using ATF-chrome-plugin
- bi_plt: bi_plt using ATF-chrome-plugin
- dom: dom using ATF-chrome-plugin
- ii_aatf: ii_aatf using ATF-chrome-plugin
- ii_plt: ii_plt using ATF-chrome-plugin
- last_css: last_css using ATF-chrome-plugin
- last_img: last_img using ATF-chrome-plugin
- last_js: last_js using ATF-chrome-plugin
- nb_ress_css: nb_ress_css using ATF-chrome-plugin
- nb_ress_img: nb_ress_img using ATF-chrome-plugin
- nb_ress_js: nb_ress_js using ATF-chrome-plugin
- num_origins: num_origins using ATF-chrome-plugin
- num_ressources: num_ressources using ATF-chrome-plugin
- oi_aatf: oi_aatf using ATF-chrome-plugin
- oi_plt: oi_plt using ATF-chrome-plugin
- plt: plt using ATF-chrome-plugin

Regarding progressionCurves.csv, the columns are:
- id: Unique identification of an experiment (consisting of an index 'set of conditions' and an index 'current page')
- url: Url of the current page. SUBPAGE stands for a path.
- run: Current run (linked with index of the config for WPT)
- filename: Filename of the pcap
- fullname: Fullname of the pcap
- har_size: Size of the HAR for this experiment,
- pagedata_size: Size of the page data report
- pcap_size: Size of the pcap
- App Byte Index (ms): Application Byte Index as computed from the har file (in the browser)
- bytesIn_APP: Total bytes in as seen in the browser,
- bytesIn_NET: Total bytes in as seen in the network,
- X_BI_net: Network Byte Index computed from the pcap file (in the network)
- X_bin_0_for_B_completion to X_bin_99_for_B_completion: X_bin_k_for_B_completion is the bytes progress reached after k*100 milliseconds

If you use these datasets in your research, you can reference the appropriate paper:

@inproceedings{qoeNetworking2020,
  title={Revealing QoE of Web Users from Encrypted Network Traffic},
  author={Huet, Alexis and Saverimoutou, Antoine and Ben Houidi, Zied and Shi, Hao and Cai, Shengming and Xu, Jinchun and Mathieu, Bertrand and Rossi, Dario},
  booktitle={2020 IFIP Networking Conference (IFIP Networking)},
  year={2020},
  organization={IEEE}
}
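As an illustration of how the X_bin_k_for_B_completion columns of progressionCurves.csv might be used, the sketch below approximates a byte index from one row as the area above the byte-progress curve. Several things here are assumptions rather than facts from the description: that progress values are fractions between 0 and 1 (pass `scale=100` if they turn out to be percentages), that a left Riemann sum over the 100 ms bins is an acceptable approximation, and that the result is comparable to the X_BI_net column. The function name `byte_index_ms` and the sample row are hypothetical and synthetic.

```python
import csv
import io

BIN_MS = 100   # bin width stated in the description (steps of 100 ms)
N_BINS = 100   # columns X_bin_0 ... X_bin_99, covering 0 to 10 seconds


def byte_index_ms(row, scale=1.0):
    """Approximate the area above the byte-progress curve, in milliseconds.

    row   : mapping from column name to value, e.g. a csv.DictReader row
    scale : ASSUMPTION, 1.0 if progress is a fraction, 100.0 if a percentage
    """
    total = 0.0
    for k in range(N_BINS):
        progress = float(row[f"X_bin_{k}_for_B_completion"]) / scale
        total += (1.0 - progress) * BIN_MS  # left Riemann sum over each bin
    return total


# Synthetic row for illustration only: progress rises linearly and
# reaches 1.0 after one second (bin 10), then stays complete.
header = ["id"] + [f"X_bin_{k}_for_B_completion" for k in range(N_BINS)]
values = ["demo"] + [str(min(1.0, k / 10)) for k in range(N_BINS)]
sample = io.StringIO(",".join(header) + "\n" + ",".join(values) + "\n")

rows = list(csv.DictReader(sample))
demo_bi = byte_index_ms(rows[0])
```

With the real file, the `io.StringIO` object would be replaced by an opened progressionCurves.csv; comparing the result against X_BI_net on a few rows is a quick way to check whether the scale and summation assumptions hold.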
The traffic ranking of Shopify stores indicated that colourpop.com was the most visited site, scoring 3,311 on the Alexa traffic rank. As of November 2022, the second-most visited e-commerce site built on Shopify software was jeffreestarcosmetics.com, while the online store of Fashion Nova ranked third.
In 2019, the North American marketplace Uship was the leading platform for B2B e-commerce in the logistics and transport sector in terms of online traffic. The online traffic ranking provided by Alexa showed that Flexport and Convoy were the second- and third-most visited marketplaces by other businesses in the logistics and transport industry.
In the ranking of B2B marketplaces for medical supplies, the platform Net32 registered the highest online traffic worldwide as of 2019. According to the Alexa ranking, the North American e-commerce was followed by Openmarkets, the second-most visited marketplace by businesses operating in the sector of medical supplies.
In 2019, the North American marketplace Vinsuite was the leading platform for B2B e-commerce in the food industry in terms of worldwide online traffic. The online traffic ranking provided by Alexa showed that the Chinese e-commerce 21Food was the second-most visited marketplace by businesses operating in the food industry.
In 2019, the North American Car-Part was the leading platform for B2B e-commerce in the automotive industry in terms of worldwide online traffic. According to the Alexa traffic ranking, the Germany-based Tyre24.alzura was the second-most visited marketplace by businesses purchasing vehicle parts.