20 datasets found
  1. alexa.com Traffic Analytics Data

    • analytics.explodingtopics.com
    Updated Sep 1, 2025
    Cite
    (2025). alexa.com Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/alexa.com
    Explore at:
    Dataset updated
    Sep 1, 2025
    Variables measured
    Global Rank, Monthly Visits, Authority Score, US Country Rank
    Description

    Traffic analytics, rankings, and competitive metrics for alexa.com as of September 2025

  2. alexa.com Website Traffic, Ranking, Analytics [September 2025]

    • semrush.ebundletools.com
    Updated Nov 12, 2025
    Cite
    Semrush (2025). alexa.com Website Traffic, Ranking, Analytics [September 2025] [Dataset]. https://semrush.ebundletools.com/website/alexa.com/overview/
    Explore at:
    Dataset updated
    Nov 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://semrush.ebundletools.com/company/legal/terms-of-service/

    Time period covered
    Nov 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    alexa.com is ranked #317632 in US with 31.23K Traffic. Categories: Advertising and Marketing, Computer Software and Development, Information Technology, Online Services. Learn more about website traffic, market share, and more!

  3. Top websites by average monthly traffic according to Alexa in Ukraine 2021

    • statista.com
    Updated Feb 15, 2022
    Cite
    Statista (2022). Top websites by average monthly traffic according to Alexa in Ukraine 2021 [Dataset]. https://www.statista.com/statistics/1278726/ukraine-most-visited-websites-by-monthly-traffic/
    Explore at:
    Dataset updated
    Feb 15, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Dec 2021
    Area covered
    Ukraine
    Description

    Google.com, youtube.com, and facebook.com were the most visited websites in Ukraine in December 2021. Furthermore, Google's website on the Ukrainian domain, google.com.ua, ranked in the top 10 during that time.

  4. alexa.amazon.com Website Traffic, Ranking, Analytics [October 2025]

    • semrush.ebundletools.com
    Updated Nov 12, 2025
    Cite
    Semrush (2025). alexa.amazon.com Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/alexa.amazon.com/overview/
    Explore at:
    Dataset updated
    Nov 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://semrush.ebundletools.com/company/legal/terms-of-service/

    Time period covered
    Nov 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    alexa.amazon.com is ranked #4 in US with 680.95K Traffic. Categories: . Learn more about website traffic, market share, and more!

  5. Effective SEO parameters for all types of websites

    • kaggle.com
    zip
    Updated Jun 16, 2020
    Cite
    Ashkan Goharfar (2020). Effective SEO parameters for all types of websites [Dataset]. https://www.kaggle.com/ashkangoharfar/sites-information-data-from-alexacom-dataset
    Explore at:
    Available download formats: zip (3434959 bytes)
    Dataset updated
    Jun 16, 2020
    Authors
    Ashkan Goharfar
    Description

    Context

    Alexa Internet ranks websites primarily by tracking a sample set of Internet traffic: users of its toolbar for the Internet Explorer, Firefox and Google Chrome web browsers. The Alexa Toolbar includes a popup blocker (which stops unwanted ads), a search box, links to Amazon.com and the Alexa homepage, and the Alexa ranking of the website that the user is visiting. It also allows the user to rate the website and view links to external, relevant websites. Alexa also compiles information about each site so that it can be compared and ranked against similar sites.

    This dataset is a record of all information on the top websites in each category of the Alexa ranking. Source: https://github.com/AshkanGoharfar/Crawler_for_alexa.com

    Content

    This dataset includes site data obtained from "alexa.com/siteinfo" (for example, alexa.com/siteinfo/facebook.com). It covers the top 50 websites in each of 550 categories in the Alexa ranking. (In total, data was obtained for about 22000 sites.) The data also includes keyword opportunities breakdown fields, which vary between categories. Each site also has important parameters such as all_topics_top_keywords_search_traffic_parameter, which represents search traffic for this site's competitor websites. For more detail, find a site's name and information in the dataset and open the alexa.com/siteinfo/SiteName link to understand each parameter and column.

    Acknowledgements

    This dataset was collected by crawling alexa.com in Python with the Selenium library and the Chrome web driver.

    Provider: Ashkan Goharfar, ashkan_goharfar@aut.ac.ir, Department of Computer Engineering and Information Technology, Amirkabir University of Technology

    Citation Request

    A. Risheh, A. Goharfar, and N. T. Javan, "Clustering Alexa Internet Data using Auto Encoder Network and Affinity Propagation," 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 2020, pp. 437-443, doi: 10.1109/ICCKE50421.2020.9303705.

    Inspiration

    Possible uses for this dataset could include:

    Sentiment analysis in a variety of forms. Categorizing websites based on their competitor websites, daily time on the website, and keyword opportunities.

    Analyzing which factors affect comparison metrics search traffic, comparison metrics data, audience overlap sites overlap scores, top keywords share of voice, top keywords search traffic, optimization opportunities organic share of voice, optimization opportunities search popularity, buyer keywords organic competition, buyer keywords avg traffic, easy to rank keywords search pop, easy to rank keywords relevance to site, keyword gaps search popularity, keyword gaps avg traffic, and keywords search traffic.

    Training ML algorithms like RNNs to estimate, for each site in each category, the probability that it is optimized for Google search (SEO).

    Using NLP on columns like keyword gaps name, easy to rank keywords name, buyer keywords name, optimization opportunities name, top keywords name, and audience overlap similar sites to this site.

  6. Ranking of the most popular web shops in Sweden 2019, by Alexa rank

    • statista.com
    Updated Sep 30, 2025
    Cite
    Statista (2025). Ranking of the most popular web shops in Sweden 2019, by Alexa rank [Dataset]. https://www.statista.com/statistics/856860/most-popular-web-shops-by-alexa-rank-sweden/
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2019
    Area covered
    Sweden
    Description

    This statistic shows the ten most popular web shops in Sweden in 2017, by Alexa global traffic rank. In first place was zalando.se, ranked ***** by Alexa, followed by adlibris.com, which was ranked ******. Komplett.se came in third place at ******.

  7. QAP regressions for popular websites (Alexa)/ videos (YouTube)/ topics...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Yee Man Margaret Ng; Harsh Taneja (2023). QAP regressions for popular websites (Alexa)/ videos (YouTube)/ topics (Twitter) similarity across countries (Final block, September). [Dataset]. http://doi.org/10.1371/journal.pone.0278594.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yee Man Margaret Ng; Harsh Taneja
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    QAP regressions for popular websites (Alexa)/ videos (YouTube)/ topics (Twitter) similarity across countries (Final block, September).

  8. Bolivia: Alexa ranking of leading websites 2022, by daily page views

    • statista.com
    Cite
    Statista, Bolivia: Alexa ranking of leading websites 2022, by daily page views [Dataset]. https://www.statista.com/statistics/1204245/alexa-leading-websites-bolivia-daily-page-views/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Bolivia
    Description

    Google.com was the website with the most page views per day in Bolivia in February 2022, according to ranking by Alexa. The website had more than ***** daily page views and was followed by Unitel.bo, with ** page views per day that month. Within Latin America, Mexico was the country where Amazon Alexa contained the largest number of skills.

  9. Data from: Analysis of the Quantitative Impact of Social Networks General...

    • produccioncientifica.ucm.es
    • figshare.com
    Updated 2022
    Cite
    Parra, David; Martínez Arias, Santiago; Mena Muñoz, Sergio (2022). Analysis of the Quantitative Impact of Social Networks General Data.doc [Dataset]. https://produccioncientifica.ucm.es/documentos/668fc409b9e7c03b01bd31e7
    Dataset updated
    2022
    Authors
    Parra, David; Martínez Arias, Santiago; Mena Muñoz, Sergio
    Description

    General data collected for the study "Analysis of the Quantitative Impact of Social Networks on Web Traffic of Cybermedia in the 27 Countries of the European Union". Four research questions are posed: What percentage of the total web traffic generated by cybermedia in the European Union comes from social networks? Is that percentage higher or lower than that provided through direct traffic and through the use of search engines via SEO positioning? Which social networks have a greater impact? And is there any relationship between the specific weight of social networks in the web traffic of a cybermedia and circumstances such as the average duration of the user's visit, the number of page views, or the bounce rate, understood in its formal sense of not performing any kind of interaction on the visited page beyond reading its content?

    To answer these questions, we first selected the cybermedia with the highest web traffic in the 27 countries that are currently part of the European Union after the United Kingdom left on December 31, 2020. In each nation we selected five media using a combination of the global web traffic metrics provided by the tools Alexa (https://www.alexa.com/), which ceased to be operational on May 1, 2022, and SimilarWeb (https://www.similarweb.com/). We have not used local metrics by country, since the results obtained with these two tools were sufficiently significant and our objective is not to establish a ranking of cybermedia by nation but to examine the relevance of social networks in their web traffic. In all cases, we selected cybermedia owned by a journalistic company, ruling out those belonging to telecommunications portals or service providers; in some cases they correspond to classic information companies (both newspapers and television stations), while in others they are digital natives, without this circumstance affecting the nature of the proposed research.

    We then examined the web traffic data of these cybermedia. The period selected covers October, November and December 2021 and January, February and March 2022. We believe that this six-month stretch smooths out possible one-off variations in a single month, reinforcing the precision of the data obtained. To obtain this data, we used the SimilarWeb tool, currently the most precise tool available for examining the web traffic of a portal, although it is limited to traffic coming from desktops and laptops and does not take into account traffic from mobile devices, which is currently impossible to determine with the measurement tools on the market. It includes:

    • Web traffic general data: average visit duration, pages per visit and bounce rate
    • Web traffic origin by country
    • Percentage of traffic generated from social media over total web traffic
    • Distribution of web traffic generated from social networks
    • Comparison of web traffic generated from social networks with direct and search procedures

  10. Descriptive statistics for matrices of Alexa, YouTube, and Twitter...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Yee Man Margaret Ng; Harsh Taneja (2023). Descriptive statistics for matrices of Alexa, YouTube, and Twitter (September). [Dataset]. http://doi.org/10.1371/journal.pone.0278594.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yee Man Margaret Ng; Harsh Taneja
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    Descriptive statistics for matrices of Alexa, YouTube, and Twitter (September).

  11. alexia.fr Website Traffic, Ranking, Analytics [October 2025]

    • semrush.ebundletools.com
    Updated Nov 12, 2025
    Cite
    Semrush (2025). alexia.fr Website Traffic, Ranking, Analytics [October 2025] [Dataset]. https://semrush.ebundletools.com/website/alexia.fr/overview/
    Explore at:
    Dataset updated
    Nov 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://semrush.ebundletools.com/company/legal/terms-of-service/

    Time period covered
    Nov 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    alexia.fr is ranked #3711 in FR with 443.58K Traffic. Categories: . Learn more about website traffic, market share, and more!

  12. Traces captured by visiting the top 1500 website

    • kaggle.com
    zip
    Updated Aug 25, 2021
    Cite
    DNS_dataset (2021). Traces captured by visiting the top 1500 website [Dataset]. https://www.kaggle.com/jacksontang16/traces-captured-by-visiting-the-top-1500-website
    Explore at:
    Available download formats: zip (5852806 bytes)
    Dataset updated
    Aug 25, 2021
    Authors
    DNS_dataset
    Description

    Dataset

    This dataset was created by DNS_dataset

    Contents

  13. Data from: HTTPS traffic classification

    • kaggle.com
    zip
    Updated Mar 11, 2024
    Cite
    Đinh Ngọc Ân (2024). HTTPS traffic classification [Dataset]. https://www.kaggle.com/datasets/inhngcn/https-traffic-classification/data
    Explore at:
    Available download formats: zip (36287490 bytes)
    Dataset updated
    Mar 11, 2024
    Authors
    Đinh Ngọc Ân
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Researchers from the Czech Republic published a dataset for HTTPS traffic classification.

    Since the data were captured mainly in a real backbone network, they omitted IP addresses and ports. The datasets consist of features calculated from bidirectional flows exported with the flow probe ipfixprobe. This exporter can export a sequence of packet lengths and times and a sequence of packet bursts and times. For more information, please visit the ipfixprobe repository.

    During research, they divided HTTPS into five categories: L -- Live Video Streaming, P -- Video Player, M -- Music Player, U -- File Upload, D -- File Download, W -- Website, and other traffic.

    They chose the service representatives known for particular traffic types based on the Alexa Top 1M list and Moz's list of the most popular 500 websites for each category. They also used several popular websites that primarily focus on the audience in the Czech Republic. The identified traffic classes and their representatives are provided below:

    Live Video Stream: Twitch, Czech TV, YouTube Live

    Video Player: DailyMotion, Stream.cz, Vimeo, YouTube

    Music Player: AppleMusic, Spotify, SoundCloud

    File Upload/Download: FileSender, OwnCloud, OneDrive, Google Drive

    Website and Other Traffic: Websites from Alexa Top 1M list

  14. Dataset used for HTTPS traffic classification using packet burst statistics

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Apr 11, 2022
    Cite
    Tropkova Zdena; Hynek Karel; Cejka Tomas (2022). Dataset used for HTTPS traffic classification using packet burst statistics [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4911550
    Explore at:
    Dataset updated
    Apr 11, 2022
    Dataset provided by
    CESNET (http://www.cesnet.cz/)
    FIT CTU
    Authors
    Tropkova Zdena; Hynek Karel; Cejka Tomas
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We are publishing a dataset we created for the HTTPS traffic classification.

    Since the data were captured mainly in a real backbone network, we omitted IP addresses and ports. The datasets consist of features calculated from bidirectional flows exported with the flow probe ipfixprobe. This exporter can export a sequence of packet lengths and times and a sequence of packet bursts and times. For more information, please visit the ipfixprobe repository.

    During our research, we divided HTTPS into five categories: L -- Live Video Streaming, P -- Video Player, M -- Music Player, U -- File Upload, D -- File Download, W -- Website, and other traffic.

    We have chosen the service representatives known for particular traffic types based on the Alexa Top 1M list and Moz's list of the most popular 500 websites for each category. We also used several popular websites that primarily focus on the audience in our country. The identified traffic classes and their representatives are provided below:

    Live Video Stream Twitch, Czech TV, YouTube Live

    Video Player DailyMotion, Stream.cz, Vimeo, YouTube

    Music Player AppleMusic, Spotify, SoundCloud

    File Upload/Download FileSender, OwnCloud, OneDrive, Google Drive

    Website and Other Traffic Websites from Alexa Top 1M list

  15. Global Web Use Similarity

    • aws-databank-alb.library.illinois.edu
    • databank.illinois.edu
    Updated Dec 5, 2022
    Cite
    Yee Man Margaret Ng; Harsh Taneja (2022). Global Web Use Similarity [Dataset]. http://doi.org/10.13012/B2IDB-3150928_V1
    Explore at:
    Dataset updated
    Dec 5, 2022
    Authors
    Yee Man Margaret Ng; Harsh Taneja
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are similarity matrices of countries based on different modalities of web use: Alexa website traffic, trending videos on YouTube, and Twitter trends. Each matrix is a month of data aggregated

  16. Popular websites across the globe

    • kaggle.com
    zip
    Updated May 27, 2017
    Cite
    bpali26 (2017). Popular websites across the globe [Dataset]. https://www.kaggle.com/bpali26/popular-websites-across-the-globe
    Explore at:
    Available download formats: zip (639485 bytes)
    Dataset updated
    May 27, 2017
    Authors
    bpali26
    Description

    Context

    This dataset includes some basic information about the websites we use daily. While scraping this information, I learned quite a lot about R programming, system speed, memory usage, etc., and developed my niche in web scraping. It took about 4-5 hours to scrape this data on my system (4 GB RAM) and about 4-5 days to work out my idea for this project.

    Content

    The dataset contains the top 50 ranked sites from each of 191 countries, along with their global traffic rank. Here, country_rank represents the traffic rank of the site within the country, and traffic_rank represents the global traffic rank of the site.

    Since the meaning of most columns can be derived from their names, the dataset is fairly straightforward to understand. However, a few points may cause confusion:

    1) Most of the numeric values are stored as character strings and contain spaces, which you may need to clean.

    2) The same website can appear multiple times. For example, Yahoo.com is present in 179 rows of this dataset, because it has a different country rank in each country.

    3) The information in this dataset covers the top 50 websites in 191 countries as of 25th May 2017 and is subject to change over time because rankings are dynamic.

    4) The dataset actually contains 9540 rows instead of 9550 (50*191 rows), because information was unavailable for 10 websites.

    PS: if there are any more queries, comment on this and I'll add an answer to the list above.
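Points 1 and 4 above can be illustrated with a minimal Python sketch. The helper name `clean_numeric` and the choice of separators to strip (spaces, non-breaking spaces, commas) are assumptions made for illustration; the dataset description only says that numeric values are stored as characters and contain spaces.

```python
from typing import Optional


def clean_numeric(value: str) -> Optional[float]:
    """Convert a numeric string that may contain stray spaces to a float.

    Hypothetical helper: the separators removed here are an assumption,
    not something the dataset description specifies.
    """
    cleaned = value.replace("\u00a0", "").replace(" ", "").replace(",", "")
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        # Non-numeric placeholders are reported as missing.
        return None


# Row count stated in the description: 50 sites x 191 countries, minus 10 missing.
expected_rows = 50 * 191 - 10
```

Returning `None` for unparseable values keeps missing entries explicit instead of silently turning them into zeros.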

    Acknowledgements

    I couldn't have done this without the help of others. I scraped this information from publicly available (open to all) websites, namely: 1) http://data.danetsoft.com/ 2) http://www.alexa.com/topsites , for which I'm highly grateful. I truly appreciate and thank the owners of these sites for providing the information that I included in this dataset.

    Inspiration

    I feel that there is a lot of scope for exploring and visualizing this dataset to find trends in the attributes of these websites across countries. One could also try predicting the global traffic rank as a dependent variable from the other attributes of the website. In any case, this dataset will help you find the popular sites in your area.

  17. Top 50 Most Popular Websites by Countries

    • kaggle.com
    zip
    Updated Feb 1, 2021
    Cite
    Yamac Eren Ay (2021). Top 50 Most Popular Websites by Countries [Dataset]. https://www.kaggle.com/yamaerenay/top50websites
    Explore at:
    Available download formats: zip (91920 bytes)
    Dataset updated
    Feb 1, 2021
    Authors
    Yamac Eren Ay
    License

    https://cdla.io/sharing-1-0/

    Description

    Content

    I collected data from here by country and with the help of a little bit of data wrangling, I could convert the data into the JSON and CSV format. The dataset contains 2 files:

    countries.json: The top 50 most popular websites for each country; the ranking order is stored by index.

    sites.csv: General information about every website on the list, such as:
    • Daily Minutes on Site: Estimated daily minutes on site per visitor to the site
    • Daily Pageviews per Visitor: Estimated daily unique pageviews per visitor on the site
    • Ratio of Traffic From Search: The ratio of all referrals that came from search engines over the trailing month
    • Total Sites Linking In: The total number of sites that link to this website
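As a rough illustration of what "the ranking order is stored by index" could mean, the Python sketch below turns list positions into 1-based ranks. The JSON layout (a mapping from country name to an ordered list of sites) and the sample values are assumptions; the actual structure of countries.json is not shown in the description.

```python
import json
from typing import Dict, List


def site_ranks(countries: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Map each country to {site: rank}, where rank is list position + 1."""
    return {
        country: {site: index + 1 for index, site in enumerate(sites)}
        for country, sites in countries.items()
    }


# Hypothetical sample in the assumed layout, not taken from the real file.
sample = json.loads('{"Exampleland": ["a.example", "b.example", "c.example"]}')
ranks = site_ranks(sample)
```

If the real file nests the data differently, only the unpacking inside `site_ranks` would need to change.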

    Acknowledgements

    Source: Alexa.com.

  18. Most visited Shopify stores 2022

    • statista.com
    Updated Dec 5, 2022
    Cite
    Statista (2022). Most visited Shopify stores 2022 [Dataset]. https://www.statista.com/statistics/1350323/most-visited-shopify-stores/
    Explore at:
    Dataset updated
    Dec 5, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    The traffic ranking of Shopify stores indicated that colourpop.com was the most visited site, scoring ***** on the Alexa traffic rank. As of November 2022, the second-most visited e-commerce site built on Shopify software was jeffreestarcosmetics.com, while the online store of Fashion Nova ranked third.

  19. HTTPS Tunneling Predictive Analysis Dataset

    • kaggle.com
    zip
    Updated Oct 28, 2024
    Cite
    archie2023 (2024). HTTPS Tunneling Predictive Analysis Dataset [Dataset]. https://www.kaggle.com/archie2023/https-tunneling-predictive-analysis-dataset
    Explore at:
    Available download formats: zip (271612753 bytes)
    Dataset updated
    Oct 28, 2024
    Authors
    archie2023
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    HTPA Dataset

    Dataset Name: HTPA(HTTPS Tunneling Predictive Analysis Dataset)

    Table of Contents

    1. Overview
    2. Dataset Structure
    3. Data Collection Methodology
    4. Privacy Considerations
    5. Feature Descriptions
    6. Usage Instructions
    7. Contact

    Overview

    The HTPA dataset is designed for research and development in detecting HTTPS tunnel traffic versus normal HTTPS traffic. It is especially suitable for knowledge-graph-based algorithms, such as the HINT method, due to its inclusion of multi-dimensional traffic features and large-scale network interactions.

    Key Features

    • HTTPS tunnel traffic, VPN.

    Dataset Structure

    The HTPA dataset is provided as a compressed file with the following structure:

    HTPA/
    ├── tunnel_traffic.pickle 
    ├── normal_traffic.pickle 
    ├── load_data.py 
    └── splited_data/                    
      ├── test_data_split_by_date.pickle 
      ├── test_data_split_by_service.pickle 
      ├── train_data_split_by_date.pickle 
      └── train_data_split_by_service.pickle 
    

    Data Collection Methodology

    HTPA was generated to capture diverse traffic interactions in a server-client setup. Tunnel traffic data was gathered from popular VPN services (Hotspot Shield Free, Browsec VPN, ZenMate VPN, Hoxx VPN, and ShadowsocksR), all of which utilize HTTPS tunneling for data transfer. To collect HTTPS tunnel traffic, we developed a crawler script that automated the process of visiting websites via these VPNs. To simulate realistic user behavior patterns, the crawler script was designed to browse at preset intervals with random pauses, closely mimicking human interaction habits. Specifically, clients (computers and mobile devices) were connected to VPN servers with configured client software. The crawler then launched a Chrome browser to visit randomly chosen sites from the Alexa Top 10,000 websites. All traffic was routed through a configured router that mirrored it to a storage server, archiving it in pcap format, and this process was repeated multiple times to ensure dataset diversity.

    For non-VPN (normal HTTPS) traffic, data was collected passively from a corporate network environment with over one hundred users. Five volunteers logged their regular online activities without VPNs or proxies over a month, allowing us to record their service IPs and ports as non-VPN traffic.

    Privacy Considerations

    Due to the dataset's scale and privacy concerns, we provide extracted features rather than raw network packets. All IP addresses and domain names have been hashed to preserve anonymity.

    Feature Descriptions

    The feature data files (tunnel_traffic.pickle and normal_traffic.pickle) include:

    Field         Description
    StartTime     The timestamp (second) marking the beginning of a traffic flow.
    EndTime       The timestamp (second) marking the end of a traffic flow.
    ServerIP      The hashed IP address of source.
    ServerPort    The port number of source.
    ClientIP      The hashed IP address of destination.
    ClientPort    The port number of destination.
    Domain        The hashed domain name.
    SizeSeq       The sequence of packet sizes (byte) for each packet within the flow.
    TimeSeq       The sequence of relative timestamps (second) for each packet within the flow.
    UpBytes       The total number of bytes sent from the source to the destination during the flow.
    DownBytes     The total number of bytes received by the source from the destination during the flow.
    UpPackets     The total number of packets sent from the source to the destination during the flow.
    DownPackets   The total number of packets received by the source from the destination during the flow.
    TcpRtt        The round-trip time (second) of TCP packets during the three-way handshake.
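To make the field list more concrete, here is a minimal Python sketch of one flow record and a few quantities that can be derived from it. The `Flow` class, the field types, and the `summarize` helper are assumptions for illustration only; the description does not state how records are laid out inside the pickle files, and only a subset of the fields is modelled here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Flow:
    # Field names follow the table above; the types are assumed.
    StartTime: float
    EndTime: float
    SizeSeq: List[int]
    TimeSeq: List[float]
    UpBytes: int
    DownBytes: int


def summarize(flow: Flow) -> Dict[str, float]:
    """Derive simple per-flow statistics from the raw fields."""
    duration = flow.EndTime - flow.StartTime
    total_bytes = flow.UpBytes + flow.DownBytes
    return {
        "duration_s": duration,
        "total_bytes": total_bytes,
        # Guard against zero-length flows to avoid division by zero.
        "bytes_per_s": total_bytes / duration if duration > 0 else 0.0,
        "n_packets": len(flow.SizeSeq),
    }


# Synthetic example values, not taken from the dataset.
example = Flow(10.0, 12.0, [100, 200, 300], [0.0, 0.5, 2.0], 400, 200)
stats = summarize(example)
```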

    Usage Instructions

    In past experiments, using a simple random sampling to split train and test data led to very high accuracy (around 0.99) for some baseline models like Random Forest. However, in real-world scenarios, models face greater challenges, including unknown data and concept drift. To better mimic open-world scenarios, we suggest using the following two data split strategies:

    By Service: Here, we used 'ServerIP' and 'ServerPort' as unique identifiers for each service, ensuring the same service wasn’t present in both train and test data. This approach prevent...
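The "By Service" strategy can be sketched in Python as a group-wise split in which every (ServerIP, ServerPort) pair lands entirely in either the train set or the test set. This is an illustrative sketch, not the dataset's own load_data.py: the function name, the dict-based flow representation, the `test_fraction` default, and the seeded shuffle are all assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def split_by_service(
    flows: List[Dict], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Dict], List[Dict]]:
    """Split flows so that no service appears in both train and test."""
    groups = defaultdict(list)
    for flow in flows:
        # A service is identified by its (ServerIP, ServerPort) pair.
        groups[(flow["ServerIP"], flow["ServerPort"])].append(flow)

    services = sorted(groups)
    random.Random(seed).shuffle(services)  # seeded for reproducibility
    n_test = max(1, round(len(services) * test_fraction))

    test = [f for s in services[:n_test] for f in groups[s]]
    train = [f for s in services[n_test:] for f in groups[s]]
    return train, test


# Synthetic flows: 10 services with 3 flows each (hashed IPs are made up).
flows = [
    {"ServerIP": f"hash{i}", "ServerPort": 443, "flow_id": (i, j)}
    for i in range(10)
    for j in range(3)
]
train, test = split_by_service(flows)
```

Splitting at the group level rather than the flow level is what keeps a model from being evaluated on services it already saw during training.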

  20. Ranking of the largest web shops in Sweden 2018, by revenue

    • statista.com
    Updated Sep 30, 2025
    Cite
    Statista (2025). Ranking of the largest web shops in Sweden 2018, by revenue [Dataset]. https://www.statista.com/statistics/698968/largest-webshops-in-sweden-by-revenue/
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2018
    Area covered
    Sweden
    Description

    The re-seller of IT-products and additional services Dustin and Dustin Home led in the ranking of largest web shops in Sweden in 2018, by revenue. Dustin had a revenue of roughly 8.7 billion Swedish kronor that year. It was followed by Cdon.com which is a web shop with a variety of products within the sector of sport, fashion, electronics, groceries and other. Its revenue was 1.7 billion Swedish kronor. In contrast, the fastest growing web-shop in Sweden that year was Bright 123. It had a revenue of 13.6 million Swedish kronor.

    What does Alexa say?

    Another survey, conducted in 2018, ranked the ten most popular web-shops in Sweden by Alexa global traffic rank. Zalando.se came first in this list with a rank of 15,296, followed by adlibris.com and komplett.se.

    What the consumer values in web-shops

    In the first quarter of 2018, 71 percent of the interviewed Swedish consumers said that the right product selection was the reason they repeated their purchases in web-shops. The reasons that followed were low prices, free shipping and fast shipping. Product specifications and pictures were the two most important information features in online shops, according to Swedish consumers in the same period. Both features were picked by 95 percent of the interviewed online shoppers.
