9 datasets found
  1. Alexa Top 1 Million Sites

    • kaggle.com
    Updated Jan 16, 2018
    Cite
    Sid Ghodke (2018). Alexa Top 1 Million Sites [Dataset]. https://www.kaggle.com/cheedcheed/top1m/tasks
    Dataset updated
    Jan 16, 2018
    Dataset provided by
    Kaggle
    Authors
    Sid Ghodke
    Description

    Context

    A listing of the top 1-million websites according to Alexa.com.

    Content

    Rank and Site name

    Acknowledgements

    Fork from: https://github.com/mozilla/cipherscan/tree/master/top1m

  2. Alexa Top 1 Million Dataset - Dataset - LDM

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). Alexa Top 1 Million Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/alexa-top-1-million-dataset
    Dataset updated
    Dec 16, 2024
    Description

    The Alexa Top 1 Million dataset is used to generate realistic domain samples.

  3. Alexa Domains

    • opendatalab.com
    zip
    Updated Sep 30, 2022
    Cite
    Booz Allen Hamilton (2022). Alexa Domains [Dataset]. https://opendatalab.com/OpenDataLab/Alexa_Domains
    Available download formats: zip (20248769 bytes)
    Dataset updated
    Sep 30, 2022
    Dataset provided by
    Booz Allen Hamilton
    Description

    This dataset is composed of the URLs of the top 1 million websites. The domains are ranked using the Alexa traffic ranking, which is determined using a combination of the browsing behavior of users on the website, the number of unique visitors, and the number of pageviews. In more detail, unique visitors are the number of unique users who visit a website on a given day, and pageviews are the total number of user URL requests for the website. However, multiple requests for the same website on the same day are counted as a single pageview. The website with the highest combination of unique visitors and pageviews is ranked the highest.

  4. Top 1 Million Site Scans

    • impactcybertrust.org
    • search.datacite.org
    Updated Jan 19, 2019
    Cite
    External Data Source (2019). Top 1 Million Site Scans [Dataset]. http://doi.org/10.23721/100/1478956
    Dataset updated
    Jan 19, 2019
    Authors
    External Data Source
    Description

    This data is obtained daily by crawling the Alexa Top 1 Million sites and will soon include the Cisco Umbrella Top 1 Million. It includes data on the presence and configuration of various HTTP response headers, details on the TLS configuration, certificates, protocol, cipher and keys used, and much more. These crawls were originally conducted every 6 months and the data published on my blog, but I'm now crawling daily and making the data available to the wider community for further analysis. Contact: mail@scotthelme.co.uk

  5. dga-detection

    • huggingface.co
    Updated Sep 16, 2018
    Cite
    Carlos A. Catania (Harpo) (2018). dga-detection [Dataset]. https://huggingface.co/datasets/harpomaxx/dga-detection
    Dataset updated
    Sep 16, 2018
    Authors
    Carlos A. Catania (Harpo)
    License

    Attribution 2.0 (CC BY 2.0): https://creativecommons.org/licenses/by/2.0/
    License information was derived automatically

    Description

    A dataset containing both DGA and normal domain names. The normal domain names were taken from the Alexa top one million domains. An additional 3,161 normal domains, provided by the Bambenek Consulting feed, were included in the dataset. This latter group is particularly interesting since it consists of suspicious domain names that were not generated by DGA. Therefore, the total number of normal domains in the dataset is 1,003,161. DGA domains were obtained from the repositories of DGA domains of Andrey Abakumov and John Bambenek. The total number of DGA domains is 1,915,335, and they correspond to 51 different malware families. About 55% of the DGA portion of the dataset is composed of samples from the Banjori, Post, Timba, Cryptolocker, Ramdo and Conficker malware families.

  6. DNS Threats Dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Mar 23, 2023
    Cite
    Franco Palau; Catania; Guerra; Garcia; Rigaki (2023). DNS Threats Dataset [Dataset]. http://doi.org/10.5281/zenodo.6508640
    Available download formats: application/gzip
    Dataset updated
    Mar 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Franco Palau; Catania; Guerra; Garcia; Rigaki
    Description

    The dataset contains Normal, DGA and Tunneling domain names: i. the normal domains comprise the Alexa top one million domains, 3,161 normal domains provided by the Bambenek Consulting feed, and another 177,017 normal domains; ii. the DGA domains were obtained from the repositories of DGA domains of Andrey Abakumov and John Bambenek, corresponding to 51 different malware families; iii. the DNS tunneling domains consist of 8,000 tunnel domains generated under laboratory conditions using a set of well-known DNS tunneling tools: iodine, dnscat2 and dnsExfiltrator.

    The dataset is described in the paper:
    Palau, F., Catania, C., Guerra, J., García, S. J., & Rigaki, M. (2019). Detecting DNS threats: A deep learning model to rule them all. In XX Simposio Argentino de Inteligencia Artificial (ASAI 2019)-JAIIO 48 (Salta).

  7. Alexa rank of nykaa.com 2021

    • statista.com
    Updated Jan 10, 2024
    Cite
    Statista Research Department (2024). Alexa rank of nykaa.com 2021 [Dataset]. https://www.statista.com/topics/8128/nykaa/
    Dataset updated
    Jan 10, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    The Alexa rank of nykaa.com moved from 4.8 thousand to more than seven thousand between January and March 2021. This indicated a decrease in the popularity of the website. In comparison, purplle.com, one of Nykaa's competitors, ranked at around 25 thousand, while Myntra ranked at about one thousand.

  8. Number of digital voice assistants in use worldwide 2019-2024

    • statista.com
    Updated Jun 26, 2025
    Cite
    Statista (2025). Number of digital voice assistants in use worldwide 2019-2024 [Dataset]. https://www.statista.com/statistics/973815/worldwide-digital-voice-assistant-in-use/
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2020, there will be *** billion digital voice assistants being used in devices around the world. Forecasts suggest that by 2024, the number of digital voice assistants will reach *** billion units – a number higher than the world’s population.

    Virtual assistants

    Virtual assistants, an increasingly commonplace feature of many consumer electronics devices, can respond to commands, provide users with information, and assist in the control of other connected electronics. There are over *** million virtual assistant users in the United States alone, and the software is especially common in smartphones and smart speakers. As of 2019, Amazon’s Alexa was supported on around ****** different smart home devices around the world, providing an excellent example of just how popular the software has become.

    “Smart” everything

    Virtual assistants have become a key component of the smart device industry, being absolutely integral to the way that consumers interact with their devices. As the industry grows and its technology becomes more advanced, companies are increasingly searching for bigger and better uses of “smart” technology. Tech savvy consumers can now communicate with their connected homes and vehicles in much the same way that they can with their smartphones.

  9. Global smart speaker market share 2016-2022, by vendor

    • statista.com
    Updated Jun 26, 2025
    Cite
    Statista (2025). Global smart speaker market share 2016-2022, by vendor [Dataset]. https://www.statista.com/statistics/792604/worldwide-smart-speaker-market-share/
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    Amazon is the leading vendor in the global smart speaker market, having a market share of **. percent in the first quarter of 2022. Google is Amazon’s closest competitor, with a share of **** percent in the same quarter. Chinese vendors Baidu, Alibaba and Xiaomi have become strong players in recent quarters, thanks to growing demand in the Chinese domestic market.

    Smart speaker

    A wireless speaker with an integrated virtual voice assistant, a smart speaker performs tasks such as seeking information, playing music, making a shopping list, etc., upon receiving voice commands from users. Since Amazon introduced its pioneering Echo speaker into the consumer market in 2015, the smart speaker has gained increasing popularity among consumers. The market became more dynamic after Google entered the competition with its Google Home speaker and shipment figures went up dramatically, from **** million in 2016 to a projected *** million in 2021.
