100+ datasets found
  1. Global share of human and bot web traffic 2013-2024

    • statista.com
    Updated Jul 21, 2025
    Cite
    Statista (2025). Global share of human and bot web traffic 2013-2024 [Dataset]. https://www.statista.com/statistics/1264226/human-and-bot-web-traffic-share/
    Dataset updated
    Jul 21, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, most global website traffic was still generated by humans, but bot traffic is growing steadily. Fraudulent traffic from bad bot actors accounted for 37 percent of global web traffic in the most recently measured period, representing an increase of 12 percent from the previous year.

    Sophistication of bad bots on the rise

    The complexity of malicious bot activity has increased dramatically in recent years. Advanced bad bots have doubled in prevalence over the past two years, indicating a surge in the sophistication of cyber threats. Simultaneously, the share of simple bad bots has increased drastically in recent years, suggesting a shift in the landscape of automated threats. Meanwhile, areas like food and groceries, sports, gambling, and entertainment faced the highest share of advanced bad bots, with more than 70 percent of their bot traffic affected by evasive applications.

    Good and bad bots across industries

    The impact of bot traffic varies across sectors. Bad bots accounted for over 50 percent of web traffic in the telecom and ISPs, community and society, and computing and IT segments. However, not all bot traffic is considered bad. Some of these applications help index websites for search engines or monitor website performance, assisting users throughout their online search. Therefore, areas like entertainment, food and groceries, and even areas targeted by bad bots themselves experienced notable levels of good bot traffic, demonstrating the diverse applications of benign automated systems across different sectors.

  2. daily-papers-stats

    • huggingface.co
    Updated Jul 28, 2024
    + more versions
    Cite
    hysts-bot-data (2024). daily-papers-stats [Dataset]. https://huggingface.co/datasets/hysts-bot-data/daily-papers-stats
    Dataset updated
    Jul 28, 2024
    Dataset authored and provided by
    hysts-bot-data
    Description

    hysts-bot-data/daily-papers-stats dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Global share of human and bot web traffic 2023, by industry

    • statista.com
    Updated Dec 10, 2024
    Cite
    Statista (2024). Global share of human and bot web traffic 2023, by industry [Dataset]. https://www.statista.com/statistics/1264540/human-and-bot-web-traffic-share-industry/
    Dataset updated
    Dec 10, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    Worldwide
    Description

    In 2023, the majority of website traffic was still generated by humans but bot traffic is constantly increasing. Fraudulent traffic through bad bot actors accounted for 57.2 percent of web traffic in the gaming industry, a stark contrast to the mere 16.5 percent of bad bot traffic in the marketing segment. On the other hand, entertainment, food and groceries, and financial services were also categories with notable percentages of good bot traffic.

  4. YoY change in human-initiated and bot attacks volume worldwide 2023, by...

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). YoY change in human-initiated and bot attacks volume worldwide 2023, by region [Dataset]. https://www.statista.com/statistics/1180124/human-initiated-automated-bot-attacks-volume-worldwide-region/
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    Worldwide
    Description

    In 2023, North America saw the most significant year-over-year increase in human-initiated attack volume, at over ** percent. The highest spike in automated bot attacks was seen in Latin America (LATAM), at ** percent.

  5. OpenML R Bot Benchmark Data

    • figshare.com
    txt
    Updated Dec 21, 2017
    Cite
    Philipp Probst; Daniel Kühn (2017). OpenML R Bot Benchmark Data [Dataset]. http://doi.org/10.6084/m9.figshare.5727073.v1
    Available download formats: txt
    Dataset updated
    Dec 21, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Philipp Probst; Daniel Kühn
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the data created by the OpenML R Bot, which executed benchmark experiments on the dataset collection OpenML100 with six R algorithms: glmnet, rpart, kknn, svm, ranger and xgboost. The hyperparameters of these algorithms were drawn randomly. In total it contains more than 5 million benchmark experiments and can be used by other researchers. Each file is a table that contains, for each benchmark experiment: OpenML-Task ID, hyperparameter values, performance measures (AUC, accuracy, Brier score), runtime, scimark (runtime reference of the machine), and some meta features of the dataset.

  6. Kaggle Bot Account Detection

    • kaggle.com
    Updated Feb 7, 2023
    Cite
    Shriyash Jagtap (2023). Kaggle Bot Account Detection [Dataset]. https://www.kaggle.com/datasets/shriyashjagtap/kaggle-bot-account-detection/code
    Dataset updated
    Feb 7, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shriyash Jagtap
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The data in question was generated using the Faker library and is not authentic real-world data. In recent years, there have been numerous reports suggesting the presence of bot voting practices that have resulted in manipulated outcomes within data science competitions. As a result of this, the idea for creating a simulated dataset arose. Although this is the first time that this dataset has been created, it is open to feedback and constructive criticism in order to improve its overall quality and significance.

    NAME: The name of the individual.
    GENDER: The gender of the individual, either male or female.
    EMAIL_ID: The email address of the individual.
    IS_GLOGIN: A boolean indicating whether the individual used Google login to register or not.
    FOLLOWER_COUNT: The number of followers the individual has.
    FOLLOWING_COUNT: The number of individuals the individual is following.
    DATASET_COUNT: The number of datasets the individual has created.
    CODE_COUNT: The number of notebooks the individual has created.
    DISCUSSION_COUNT: The number of discussions the individual has participated in.
    AVG_NB_READ_TIME_MIN: The average time spent reading notebooks in minutes.
    REGISTRATION_IPV4: The IP address used to register.
    REGISTRATION_LOCATION: The location from where the individual registered.
    TOTAL_VOTES_GAVE_NB: The total number of votes the individual has given to notebooks.
    TOTAL_VOTES_GAVE_DS: The total number of votes the individual has given to datasets.
    TOTAL_VOTES_GAVE_DC: The total number of votes the individual has given to discussion comments.
    ISBOT: A boolean indicating whether the individual is a bot or not.
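The field list can double as a quick sanity check when the file is first opened. The following is a minimal sketch, assuming the dataset ships as a CSV whose header uses exactly these column names and that ISBOT is encoded as the strings "True"/"False"; neither detail is confirmed by the listing, and the two sample rows are invented for illustration.

```python
import csv
import io

# Column names as given in the dataset description.
EXPECTED_COLUMNS = [
    "NAME", "GENDER", "EMAIL_ID", "IS_GLOGIN", "FOLLOWER_COUNT",
    "FOLLOWING_COUNT", "DATASET_COUNT", "CODE_COUNT", "DISCUSSION_COUNT",
    "AVG_NB_READ_TIME_MIN", "REGISTRATION_IPV4", "REGISTRATION_LOCATION",
    "TOTAL_VOTES_GAVE_NB", "TOTAL_VOTES_GAVE_DS", "TOTAL_VOTES_GAVE_DC",
    "ISBOT",
]


def load_rows(text):
    """Parse CSV text and return (header, list of row dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    return reader.fieldnames, list(reader)


def missing_columns(header):
    """Return the expected columns that are absent from the header."""
    return [c for c in EXPECTED_COLUMNS if c not in header]


def bot_share(rows):
    """Fraction of rows labelled as bots (assumes 'True'/'False' strings)."""
    labels = [r["ISBOT"].strip().lower() == "true" for r in rows]
    return sum(labels) / len(labels) if labels else 0.0


# Invented sample data; not taken from the real dataset.
SAMPLE = "\n".join([
    ",".join(EXPECTED_COLUMNS),
    "A Person,Male,a@example.com,True,10,5,1,2,3,4.5,192.0.2.1,Exampleland,7,2,1,False",
    "B Person,Female,b@example.com,False,0,0,0,0,0,0.2,192.0.2.2,Exampleland,90,80,70,True",
])

header, rows = load_rows(SAMPLE)
print("missing columns:", missing_columns(header))
print("bot share:", bot_share(rows))
```

Checking the header first keeps later feature code from failing silently if the real file names a column differently.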

  7. Twitter bot profiling

    • figshare.com
    • researchdata.smu.edu.sg
    • +1more
    pdf
    Updated May 31, 2023
    + more versions
    Cite
    Living Analytics Research Centre (2023). Twitter bot profiling [Dataset]. http://doi.org/10.25440/smu.12062706.v1
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    SMU Research Data Repository (RDR)
    Authors
    Living Analytics Research Centre
    License

    http://rightsstatements.org/vocab/InC/1.0/

    Description

    This dataset comprises a set of Twitter accounts in Singapore that are used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates contents and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:

    Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
    Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.
    Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.

    This categorization is general enough to cater for new, emerging types of bot (e.g., chatbots can be viewed as a special type of broadcast bots).

    The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users having many followers), their follow, retweet, and user mention links were crawled. The data collection proceeds by adding those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. Using this procedure, a total of 159,724 accounts have been collected.

    To identify bots, the first step is to check active accounts who tweeted at least 15 times within the month of April 2014. These accounts were then manually checked and labelled, of which 589 bots were found. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.

    Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
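The activity filter and the label counts in the description can be checked with a few lines. This is only an illustrative sketch: the threshold of 15 tweets and the counts 589 and 1,024 come from the description, while the function names and the example tweet counts are hypothetical.

```python
# Counts stated in the dataset description.
BOTS = 589
HUMANS = 1024
ACTIVE_THRESHOLD = 15  # minimum tweets in April 2014 to count as "active"


def is_active(tweets_in_april, threshold=ACTIVE_THRESHOLD):
    """True if the account meets the activity threshold from the description."""
    return tweets_in_april >= threshold


def class_shares(bots, humans):
    """Return the share of each class among the labelled accounts."""
    total = bots + humans
    return {"bot": bots / total, "human": humans / total}


total = BOTS + HUMANS
print("labelled accounts:", total)  # 589 + 1024 = 1613
print("class shares:", class_shares(BOTS, HUMANS))

# Hypothetical per-account tweet counts, only to show the filter in use.
example_counts = {"acct_a": 3, "acct_b": 15, "acct_c": 240}
print({name: is_active(n) for name, n in example_counts.items()})
```

The class shares matter in practice because the labelled set holds far more humans than bots, which affects how a classifier trained on it should be evaluated.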

  8. OpenML R Bot Benchmark Data (final subset)

    • figshare.com
    application/gzip
    Updated May 18, 2018
    Cite
    Daniel Kühn; Philipp Probst; Janek Thomas; Bernd Bischl (2018). OpenML R Bot Benchmark Data (final subset) [Dataset]. http://doi.org/10.6084/m9.figshare.5882230.v2
    Available download formats: application/gzip
    Dataset updated
    May 18, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Kühn; Philipp Probst; Janek Thomas; Bernd Bischl
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a clean subset of the data created by the OpenML R Bot, which executed benchmark experiments on the binary classification tasks of the OpenML100 benchmarking suite with six R algorithms: glmnet, rpart, kknn, svm, ranger and xgboost. The hyperparameters of these algorithms were drawn randomly. In total it contains more than 2.6 million benchmark experiments and can be used by other researchers. The subset was created by taking 500000 results of each learner (except for kknn, for which only 1140 results are available). The csv file for each learner is a table with one row per benchmark experiment, containing: OpenML-Data ID, hyperparameter values, performance measures (AUC, accuracy, Brier score), runtime, scimark (runtime reference of the machine), and some meta features of the dataset. OpenMLRandomBotResults.RData (format for R) contains all data in separate tables for the results, the hyperparameters, the meta features, the runtime, the scimark results and reference results.

  9. Grant Giving Statistics for Invent-A-Bot Learning Center

    • instrumentl.com
    Updated Jul 7, 2021
    Cite
    (2021). Grant Giving Statistics for Invent-A-Bot Learning Center [Dataset]. https://www.instrumentl.com/990-report/invent-a-bot-learning-center
    Dataset updated
    Jul 7, 2021
    Description

    Financial overview and grant giving statistics of Invent-A-Bot Learning Center

  10. Data from: Discovery and classification of Twitter bots

    • zenodo.org
    • data.europa.eu
    doc, txt
    Updated Apr 24, 2021
    Cite
    Alexander Shevtsov; Maria Oikonomidou; Despoina Antonakaki; Polyvios Pratikakis; Alexandros Kanterakis; Sotiris Ioannidis; Paraskevi Fragopoulou (2021). Discovery and classification of Twitter bots [Dataset]. http://doi.org/10.5281/zenodo.4715885
    Available download formats: doc, txt
    Dataset updated
    Apr 24, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Shevtsov; Maria Oikonomidou; Despoina Antonakaki; Polyvios Pratikakis; Alexandros Kanterakis; Sotiris Ioannidis; Paraskevi Fragopoulou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Online Social Networks (OSN) are used by millions of users daily. This user base shares and discovers different opinions on popular topics. Large groups can exert social influence on users' beliefs or attract interest in particular news or products. A large number of users, gathered in a single group or as followers, increases the probability of influencing OSN users. Botnets, collections of automated accounts controlled by a single agent, are a common mechanism for exerting maximum influence. Botnets may be used to better infiltrate the social graph over time and create an illusion of community behaviour, amplifying their message and increasing persuasion.

    This paper investigates Twitter botnets, their behavior, their interaction with user communities and their evolution over time. We analyze a dense crawl of a subset of Twitter traffic, amounting to nearly all interactions by Greek-speaking Twitter users for a period of 36 months.

    The collected users are labeled as botnets, based on long term and frequent content similarity events. We detect over a million events, where seemingly unrelated accounts tweeted nearly identical content, at almost the same time. We filter these concurrent content injection events and detect a set of 1,850 accounts that repeatedly exhibit this pattern of behavior, suggesting that they are fully or in part controlled and orchestrated by the same entity. We find botnets that appear for brief intervals and disappear, as well as botnets that evolve and grow, spanning the duration of our dataset. We analyze statistical differences between the bot accounts and human users, as well as the botnet interactions with the user communities and the Twitter trending topics.
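The detection idea described above (unrelated accounts repeatedly posting nearly identical content at almost the same time) can be illustrated with a toy example. This is a sketch under stated assumptions, not the authors' pipeline: "nearly identical" is approximated by exact match after simple normalization, "almost the same time" by a fixed 60-second window, and "repeatedly" by a minimum of two shared events. All three choices, the function names, and the sample tweets are hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def normalize(text):
    """Lowercase, drop URLs, collapse whitespace (assumed similarity proxy)."""
    text = re.sub(r"https?://\S+", "", text.lower())
    return " ".join(text.split())


def co_tweet_pairs(tweets, window_s=60):
    """Count, per account pair, the distinct texts both posted within window_s.

    tweets: iterable of (account, unix_time, text) tuples.
    """
    by_text = defaultdict(list)
    for account, ts, text in tweets:
        by_text[normalize(text)].append((ts, account))

    pair_counts = Counter()
    for posts in by_text.values():
        posts.sort()  # chronological order, so t2 >= t1 below
        pairs_for_text = set()
        for (t1, a1), (t2, a2) in combinations(posts, 2):
            if a1 != a2 and t2 - t1 <= window_s:
                pairs_for_text.add(tuple(sorted((a1, a2))))
        pair_counts.update(pairs_for_text)
    return pair_counts


def suspected_accounts(pair_counts, min_events=2):
    """Accounts that appear in a pair with at least min_events shared events."""
    flagged = set()
    for (a, b), n in pair_counts.items():
        if n >= min_events:
            flagged.update((a, b))
    return flagged


# Invented sample tweets: (account, unix_time, text).
SAMPLE_TWEETS = [
    ("a", 0, "Buy now http://x.co/1"),
    ("b", 10, "buy now http://x.co/2"),
    ("a", 1000, "Great deal"),
    ("b", 1030, "great  deal"),
    ("c", 5000, "Great deal"),
]

pairs = co_tweet_pairs(SAMPLE_TWEETS)
print(pairs)
print(suspected_accounts(pairs))
```

Account "c" posts the same text as the others but far outside the window, so it is not flagged; only pairs that coincide repeatedly are treated as suspicious. The pairwise comparison is quadratic per text and would need a sliding window or bucketing at the scale the description mentions.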

  11. bot-fight-data

    • hf.qhduan.com
    • huggingface.co
    Updated Jul 13, 2024
    + more versions
    Cite
    Huggingface Projects (2024). bot-fight-data [Dataset]. https://hf.qhduan.com/datasets/huggingface-projects/bot-fight-data
    Dataset updated
    Jul 13, 2024
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Huggingface Projects
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    huggingface-projects/bot-fight-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. Sophistication level of bad bots 2020-2024

    • statista.com
    Updated Apr 15, 2025
    Cite
    Statista (2025). Sophistication level of bad bots 2020-2024 [Dataset]. https://www.statista.com/statistics/1264631/bad-bots-sophistication/
    Dataset updated
    Apr 15, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, most worldwide website traffic was generated by humans, but bot traffic is constantly increasing. Fraudulent traffic from bad bot actors exists at various levels of sophistication. Over the last two years, the number of advanced bad bots surged, doubling what was registered in previous years. Simple bad bots also increased by almost 6 percent compared to the previous year, suggesting a decrease in the number of moderate bad bots.

  13. Data from: CAS Botany (BOT)

    • gbif.org
    Updated Aug 13, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Emily Magnaghi (2025). CAS Botany (BOT) [Dataset]. http://doi.org/10.15468/7gudyo
    Dataset updated
    Aug 13, 2025
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    California Academy of Sciences
    Authors
    Emily Magnaghi
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The electronic catalog of the botanical collection at the California Academy of Sciences, San Francisco.

  14. Bot Cooperation Experiment Dataset

    • data.mendeley.com
    • narcis.nl
    Updated Jul 21, 2020
    Cite
    Hirokazu Shirado (2020). Bot Cooperation Experiment Dataset [Dataset]. http://doi.org/10.17632/t963ktp6ft.1
    Dataset updated
    Jul 21, 2020
    Authors
    Hirokazu Shirado
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the code and data of experiments involving networks of humans (1,024 subjects in 64 networks) playing a public-goods game to which we sometimes added autonomous agents (bots) programmed to use only local knowledge. We show that cooperation can not only be stabilized, but even promoted, when the bots intervene in the partner selections made by the humans themselves, re-shaping social connections locally within a larger group. The "code" directory has the R code to analyze the experiment data, at both group and individual level, found in the "data" directory. The dataset also has the raw data of each experiment session in JSON format in the "data/raw" directory. The subdirectory "exp1" has the data of the experiment with 6 bot treatments and 1 additional bot visibility treatment. The subdirectory "exp2" has the data of the supplementary experiment with 1 single network-engineering bot having 5 ties.
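For readers who prefer Python to the bundled R code, the raw session files could be read along these lines. This is a minimal sketch that assumes the layout is data/raw/exp1/*.json and data/raw/exp2/*.json with one JSON document per session file; the description names the directories but does not spell out file names or the JSON structure, so `load_sessions` and the path layout are assumptions.

```python
import json
from pathlib import Path


def load_sessions(root, experiment="exp1"):
    """Load every JSON session file under <root>/data/raw/<experiment>.

    Returns a dict mapping the file stem to the parsed JSON content.
    The directory layout is an assumption based on the description.
    """
    raw_dir = Path(root) / "data" / "raw" / experiment
    if not raw_dir.is_dir():
        return {}
    sessions = {}
    for path in sorted(raw_dir.glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            sessions[path.stem] = json.load(fh)
    return sessions


if __name__ == "__main__":
    # "." is a placeholder for wherever the dataset archive was unpacked.
    for name, content in load_sessions(".").items():
        print(name, type(content).__name__)
```

Returning an empty dict for a missing directory keeps the helper usable before the real layout has been confirmed.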

  15. Bot_IoT

    • kaggle.com
    Updated Mar 14, 2023
    Cite
    Vignesh Venkateswaran (2023). Bot_IoT [Dataset]. https://www.kaggle.com/datasets/vigneshvenkateswaran/bot-iot/discussion
    Dataset updated
    Mar 14, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vignesh Venkateswaran
    Description

    INFO ABOUT THE BOT-IOT DATASET, NOTE: only the csv files stated in the description are used

    The BoT-IoT dataset can be downloaded from HERE. You can also use our new datasets: the TON_IoT and UNSW-NB15.

    --------------------------------------------------------------------------

    The BoT-IoT dataset was created by designing a realistic network environment in the Cyber Range Lab of UNSW Canberra. The network environment incorporated a combination of normal and botnet traffic. The dataset’s source files are provided in different formats, including the original pcap files, the generated argus files and csv files. The files were separated, based on attack category and subcategory, to better assist in labeling process.

    The captured pcap files are 69.3 GB in size, with more than 72,000,000 records. The extracted flow traffic, in csv format, is 16.7 GB in size. The dataset includes DDoS, DoS, OS and Service Scan, Keylogging and Data exfiltration attacks, with the DDoS and DoS attacks further organized based on the protocol used.

    To ease the handling of the dataset, we extracted 5% of the original dataset via the use of select MySQL queries. The extracted 5%, is comprised of 4 files of approximately 1.07 GB total size, and about 3 million records.

    --------------------------------------------------------------------------

    Free use of the Bot-IoT dataset for academic research purposes is hereby granted in perpetuity. Use for commercial purposes should be agreed with the authors. The authors have asserted their rights under the Copyright. Anyone who intends to use the Bot-IoT dataset has to cite the following papers, which contain the dataset’s details:

    Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Benjamin Turnbull. "Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset." Future Generation Computer Systems 100 (2019): 779-796. Public Access Here.

    Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Jill Slay. "Towards developing network forensic mechanism for botnet activities in the iot based on machine learning techniques." In International Conference on Mobile Networks and Management, pp. 30-44. Springer, Cham, 2017.

    Koroniotis, Nickolaos, Nour Moustafa, and Elena Sitnikova. "A new network forensic framework based on deep learning for Internet of Things networks: A particle deep framework." Future Generation Computer Systems 110 (2020): 91-106.

    Koroniotis, Nickolaos, and Nour Moustafa. "Enhancing network forensics with particle swarm and deep learning: The particle deep framework." arXiv preprint arXiv:2005.00722 (2020).

    Koroniotis, Nickolaos, Nour Moustafa, Francesco Schiliro, Praveen Gauravaram, and Helge Janicke. "A Holistic Review of Cybersecurity and Reliability Perspectives in Smart Airports." IEEE Access (2020).

    Koroniotis, Nickolaos. "Designing an effective network forensic framework for the investigation of botnets in the Internet of Things." PhD diss., The University of New South Wales Australia, 2020.

    --------------------------------------------------------------------------

  16. Global Bot Detection And Mitigation Software Market Size By Type of...

    • verifiedmarketresearch.com
    Updated Jul 22, 2024
    Cite
    VERIFIED MARKET RESEARCH (2024). Global Bot Detection And Mitigation Software Market Size By Type of Solution, By Deployment Mode, By Application, By Industry Vertical, By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/bot-detection-and-mitigation-software-market/
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2024 - 2031
    Area covered
    Global
    Description

    Bot Detection And Mitigation Software Market size was valued at USD 20.5 Billion in 2023 and is projected to reach USD 35.2 Billion by 2031, growing at a CAGR of 8.32% during the forecast period 2024-2031.
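The relationship between the two valuations and the growth rate follows the usual compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below applies it to the figures quoted above. The listing does not say which base year or period length the publisher used, so both a 7-year and an 8-year span are shown as assumptions, and the result need not match the quoted rate exactly.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate over the given number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after compounding start_value at rate for the given years."""
    return start_value * (1.0 + rate) ** years


# Figures from the description, in USD billion.
START_2023 = 20.5
END_2031 = 35.2

# The period length is an assumption: 8 years from the 2023 valuation,
# or 7 compounding steps across the 2024-2031 forecast period.
for years in (7, 8):
    rate = cagr(START_2023, END_2031, years)
    print(f"{years} years: implied CAGR = {rate * 100:.2f}%")
```

Running the check with both spans makes it easy to see how sensitive the implied rate is to the choice of base year.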

    Global Bot Detection And Mitigation Software Market Drivers

    The market drivers for the Bot Detection And Mitigation Software Market can be influenced by various factors. These may include:

    Rising Incidence of Cyber Attacks: As the number and sophistication of cyber attacks increase, organizations are more aware of the threats posed by malicious bots. These bots can perpetrate a variety of harmful activities such as data theft, DDoS (Distributed Denial-of-Service) attacks, and fraudulent transactions. Therefore, there's a heightened demand for software solutions that can detect and mitigate these bot-related threats.

    Expansion of E-commerce and Online Services: The growth of e-commerce platforms and online services has led to an increased volume of online activities that can be targeted by bots. For instance, bots can be used for price scraping, inventory hoarding, and performing fraudulent transactions. To safeguard the integrity and performance of their platforms, businesses invest in bot detection and mitigation solutions.

    Increased Adoption of APIs: APIs (Application Programming Interfaces) are increasingly being used to enable interconnectivity between different software services and applications. This widespread use makes them susceptible to bot attacks that can exploit vulnerabilities or abuse the API functionality. Consequently, there's a rising need for bot detection solutions specifically designed to protect APIs.

    Regulatory Compliance and Data Protection: With stringent regulations like GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act) around data protection and privacy, companies are required to implement robust security measures to protect user data. Bot detection and mitigation software can help organizations comply with these regulations by preventing unwanted access and data breaches through malicious bots.

    Advancements in Machine Learning and AI: Advances in machine learning (ML) and artificial intelligence (AI) have enhanced the capabilities of bot detection solutions. These technologies enable the development of more sophisticated and accurate systems that can identify and adapt to the evolving behaviors of bots. As a result, companies are more inclined to adopt these cutting-edge solutions for better protection.

    Growing Concerns Over Ad Fraud: In the digital advertising industry, ad fraud perpetrated by bots is a significant concern. This includes fraudulent clicks, impressions, and conversions generated by bots to deceive advertisers and drain their advertising budgets. To combat this, advertisers and ad networks are increasingly relying on bot detection software to ensure the authenticity of their ad traffic.

    Increase in Online Transactions: The surge in online transactions, particularly due to the rise of digital payment methods and mobile banking, has made financial services a primary target for bot attacks. Bots can be used for credential stuffing, account takeover, and transaction fraud. Thus, financial institutions are investing heavily in bot mitigation solutions to secure their online platforms.

    Enhanced User Experience: Bots can significantly degrade user experience by slowing down website performance, causing downtime, and making it difficult for legitimate users to access services. Companies aim to maintain a seamless and efficient user experience by implementing bot detection and mitigation solutions to keep their platforms running smoothly.

    Increasing Awareness and Education: There is a growing awareness among businesses about the potential risks associated with bot activities and the importance of having robust defenses in place. As more organizations understand the impact of bot attacks, they are more likely to invest in comprehensive bot detection and mitigation solutions.

    Global Digital Transformation: As businesses and governments around the world undergo digital transformation, the importance of securing digital infrastructure becomes paramount. Bots pose a significant threat to these digital ecosystems, necessitating the need for effective bot detection and mitigation measures to protect critical infrastructure and services.

  17. Data from: NF-BoT-IoT

    • kaggle.com
    Updated Jan 13, 2023
    Cite
    StrGenIx | Laurens D'hooge (2023). NF-BoT-IoT [Dataset]. https://www.kaggle.com/datasets/dhoogla/nfbotiot
    Dataset updated
    Jan 13, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    StrGenIx | Laurens D'hooge
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    NF-BoT-IoT is the NetFlow version of the UNSW-Bot-IoT dataset. This is one dataset in the NF-collection by the University of Queensland, aimed at standardizing network-security datasets to achieve interoperability and enable larger analyses.

    All credit goes to the original authors: Dr. Mohanad Sarhan, Dr. Siamak Layeghy, Dr. Nour Moustafa & Dr. Marius Portmann. Please cite their original conference article when using this dataset.

    V1: Base dataset in CSV format as downloaded from here
    V2: Cleaning -> parquet files

    In the parquet files all data types are already set correctly, there are 0 records with missing information and 0 duplicate records.

  18. Nesting Bot Usage in Jeskai Decks Dataset

    • mtg-standard.com
    Updated Jul 29, 2025
    Cite
    MTG-Standard.com (2025). Nesting Bot Usage in Jeskai Decks Dataset [Dataset]. https://mtg-standard.com/deckmaincards/C22/2396eeab-94ff-5384-b9e5-c5e1acc413d6
    Dataset updated
    Jul 29, 2025
    Dataset provided by
    MTG-Standard.com
    Description

    Statistical analysis of Nesting Bot usage patterns in Jeskai Standard format decks

  19. Thailand BOT: No of Job Vacancies

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2025). Thailand BOT: No of Job Vacancies [Dataset]. https://www.ceicdata.com/en/thailand/employment-indicators/bot-no-of-job-vacancies
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 1, 2017 - Jun 1, 2018
    Area covered
    Thailand
    Variables measured
    Employment
    Description

    Thailand BOT: Number of Job Vacancies data was reported at 29,033.000 Person in Sep 2018. This records an increase from the previous number of 26,027.000 Person for Aug 2018. Thailand BOT: Number of Job Vacancies data is updated monthly, averaging 39,095.000 Person from Jan 1995 (Median) to Sep 2018, with 285 observations. The data reached an all-time high of 115,636.000 Person in Feb 2004 and a record low of 12,620.000 Person in Mar 2007. Thailand BOT: Number of Job Vacancies data remains in active status in CEIC and is reported by Bank of Thailand. The data is categorized under Global Database’s Thailand – Table TH.G012: Employment Indicators.

  20. Data from: Incentivizing news consumption on social media platforms using...

    • zenodo.org
    • datadryad.org
    bin, csv
    Updated Jun 13, 2024
    Cite
    Hadi Askari; Michael Heseltine; Anshuman Chhabra; Bernhard Clemm von Hohenberg; Magdalena Wojcieszak (2024). Incentivizing news consumption on social media platforms using large language models and realistic bot accounts [Dataset]. http://doi.org/10.5061/dryad.7sqv9s50w
    Available download formats: csv, bin
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hadi Askari; Michael Heseltine; Anshuman Chhabra; Bernhard Clemm von Hohenberg; Magdalena Wojcieszak
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Measurement technique
    Collected via the Twitter API and the Python Tweepy library. Contains raw files from our pre and post metrics and also contains our final metrics after all of the classifications (politics and news).
    Description

    This project examines how to enhance users' exposure to and engagement with verified and ideologically balanced news in an ecologically valid setting. We rely on a large-scale, two-week-long field experiment on 28,457 Twitter users. We created 28 bots utilizing GPT-2 that replied to users tweeting about sports, entertainment, or lifestyle with a contextual reply containing two hardcoded elements: a URL to the topic-relevant section of a quality news organization and an encouragement to follow its Twitter account. Treated users were randomly assigned to receive responses by bots presented as female or male. We examine whether our intervention enhances the following of news media organizations, the sharing/liking of news content, and the tweeting/liking of political content. We find that the treated users followed more news accounts and that users in the female bot treatment were more likely to like news content than the control.

