In 2024, most global website traffic was still generated by humans, but bot traffic is growing steadily. Fraudulent traffic from bad bot actors accounted for 37 percent of global web traffic in the most recently measured period, an increase of 12 percent over the previous year.

Sophistication of Bad Bots on the rise

The complexity of malicious bot activity has increased dramatically in recent years. Advanced bad bots have doubled in prevalence over the past two years, indicating a surge in the sophistication of cyber threats. At the same time, the share of simple bad bots has also risen sharply, suggesting a shift in the landscape of automated threats. Sectors such as food and groceries, sports, gambling, and entertainment faced the highest share of advanced bad bots, with more than 70 percent of their bot traffic coming from evasive applications.

Good and bad bots across industries

The impact of bot traffic varies across sectors. Bad bots accounted for over 50 percent of web traffic in the telecom and ISPs, community and society, and computing and IT segments. Not all bot traffic is harmful, however: some automated applications index websites for search engines or monitor website performance, assisting users in their online searches. Accordingly, sectors such as entertainment and food and groceries, including some heavily targeted by bad bots, also saw notable levels of good bot traffic, reflecting the diverse applications of benign automated systems.
hysts-bot-data/daily-papers-stats dataset hosted on Hugging Face and contributed by the HF Datasets community
In 2023, the majority of website traffic was still generated by humans, but bot traffic is constantly increasing. Fraudulent traffic from bad bot actors accounted for 57.2 percent of web traffic in the gaming industry, a stark contrast to the mere 16.5 percent of bad bot traffic in the marketing segment. Entertainment, food and groceries, and financial services, meanwhile, were categories with notable shares of good bot traffic.
In 2023, the North America region saw the most significant year-over-year increase in human-initiated attack volume, at over ** percent. The highest spike in automated bot attacks was seen in Latin America (LATAM), at ** percent.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the data created by the OpenML R Bot, which executed benchmark experiments on the OpenML100 dataset collection with six R algorithms: glmnet, rpart, kknn, svm, ranger, and xgboost. The hyperparameters of these algorithms were drawn randomly. In total it contains more than 5 million benchmark experiments and can be reused by other researchers. Each file is a table in which each row corresponds to one benchmark experiment and contains: the OpenML task ID, hyperparameter values, performance measures (AUC, accuracy, Brier score), runtime, SciMark score (a runtime reference for the machine), and some meta-features of the dataset.
http://opendatacommons.org/licenses/dbcl/1.0/
The data was generated using the Faker library and is not authentic real-world data. In recent years, there have been numerous reports of bot-voting practices that manipulated outcomes in data science competitions; this motivated the creation of a simulated dataset. As this is the first release of the dataset, feedback and constructive criticism to improve its quality and usefulness are welcome.
NAME: The name of the individual.
GENDER: The gender of the individual, either male or female.
EMAIL_ID: The email address of the individual.
IS_GLOGIN: A boolean indicating whether the individual used Google login to register or not.
FOLLOWER_COUNT: The number of followers the individual has.
FOLLOWING_COUNT: The number of individuals the individual is following.
DATASET_COUNT: The number of datasets the individual has created.
CODE_COUNT: The number of notebooks the individual has created.
DISCUSSION_COUNT: The number of discussions the individual has participated in.
AVG_NB_READ_TIME_MIN: The average time spent reading notebooks, in minutes.
REGISTRATION_IPV4: The IP address used to register.
REGISTRATION_LOCATION: The location from where the individual registered.
TOTAL_VOTES_GAVE_NB: The total number of votes the individual has given to notebooks.
TOTAL_VOTES_GAVE_DS: The total number of votes the individual has given to datasets.
TOTAL_VOTES_GAVE_DC: The total number of votes the individual has given to discussion comments.
ISBOT: A boolean indicating whether the individual is a bot or not.
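As a rough illustration of how records matching this schema could be synthesized (the description says Faker was used; here the standard library stands in for it, and all value ranges and choices are made-up assumptions, not the dataset's actual generation logic):

```python
import random

def make_user(is_bot: bool) -> dict:
    """Generate one synthetic user record matching the schema above.
    Value ranges are illustrative assumptions only."""
    name = random.choice(["Alex", "Sam", "Priya", "Chen"])
    return {
        "NAME": name,
        "GENDER": random.choice(["male", "female"]),
        "EMAIL_ID": f"{name.lower()}{random.randint(1, 9999)}@example.com",
        "IS_GLOGIN": random.random() < 0.5,
        "FOLLOWER_COUNT": random.randint(0, 5000),
        "FOLLOWING_COUNT": random.randint(0, 5000),
        "DATASET_COUNT": random.randint(0, 50),
        "CODE_COUNT": random.randint(0, 100),
        "DISCUSSION_COUNT": random.randint(0, 200),
        "AVG_NB_READ_TIME_MIN": round(random.uniform(0, 30), 2),
        "REGISTRATION_IPV4": ".".join(str(random.randint(0, 255)) for _ in range(4)),
        "REGISTRATION_LOCATION": random.choice(["IN", "US", "BR", "DE"]),
        "TOTAL_VOTES_GAVE_NB": random.randint(0, 500),
        "TOTAL_VOTES_GAVE_DS": random.randint(0, 500),
        "TOTAL_VOTES_GAVE_DC": random.randint(0, 500),
        "ISBOT": is_bot,
    }

record = make_user(is_bot=True)
```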
http://rightsstatements.org/vocab/InC/1.0/
This dataset comprises a set of Twitter accounts in Singapore used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates content and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:
Broadcast bot. This bot aims to disseminate information to a general audience by providing, e.g., benign links to news, blogs, or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope readings, weather updates) for personal consumption or use.
Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.
This categorization is general enough to cover new, emerging types of bots (e.g., chatbots can be viewed as a special type of broadcast bot). The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and Streaming APIs. Starting from popular seed users (i.e., users with many followers), their follow, retweet, and user-mention links were crawled. Data collection proceeded by adding those followers/followees, retweet sources, and mentioned users who stated Singapore in their profile location. Using this procedure, a total of 159,724 accounts were collected. To identify bots, the first step was to select active accounts that tweeted at least 15 times within April 2014. These accounts were then manually checked and labelled, yielding 589 bots. Since many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked, identifying 1,024 human accounts. In total, this yields 1,613 labelled accounts. Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo'16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
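The first labelling step described above (keeping only accounts that tweeted at least 15 times within April 2014) amounts to a simple activity filter. A minimal sketch, assuming a hypothetical layout of account IDs mapped to tweet timestamps:

```python
from datetime import datetime

# Hypothetical layout: account id -> list of tweet timestamps.
tweets = {
    "acct_a": [datetime(2014, 4, d) for d in range(1, 21)],  # 20 tweets in April 2014
    "acct_b": [datetime(2014, 4, 1), datetime(2014, 4, 2)],  # only 2 tweets
    "acct_c": [datetime(2014, 3, d) for d in range(1, 31)],  # active, but in March
}

def active_in_april_2014(timestamps, min_tweets=15):
    """Keep accounts that tweeted at least `min_tweets` times in April 2014."""
    april = [t for t in timestamps if t.year == 2014 and t.month == 4]
    return len(april) >= min_tweets

active = {acct for acct, ts in tweets.items() if active_in_april_2014(ts)}
```

Only the accounts passing this filter would then go on to manual checking and labelling.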
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a clean subset of the data created by the OpenML R Bot, which executed benchmark experiments on the binary classification tasks of the OpenML100 benchmarking suite with six R algorithms: glmnet, rpart, kknn, svm, ranger, and xgboost. The hyperparameters of these algorithms were drawn randomly. In total it contains more than 2.6 million benchmark experiments and can be reused by other researchers. The subset was created by taking 500,000 results for each learner (except kknn, for which only 1,140 results are available). The CSV file for each learner is a table with one row per benchmark experiment, containing: the OpenML data ID, hyperparameter values, performance measures (AUC, accuracy, Brier score), runtime, SciMark score (a runtime reference for the machine), and some meta-features of the dataset. OpenMLRandomBotResults.RData (R format) contains all data in separate tables for the results, hyperparameters, meta-features, runtimes, SciMark results, and reference results.
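The subsetting rule described above (cap each learner at 500,000 randomly drawn results, keeping all rows for learners with fewer) could be sketched as follows; the row format and field names are placeholders, not the dataset's actual columns:

```python
import random
from collections import defaultdict

def cap_per_learner(rows, cap, seed=0):
    """rows: iterable of dicts with a 'learner' key.
    Returns at most `cap` randomly sampled rows per learner;
    learners with fewer than `cap` rows keep all of them."""
    by_learner = defaultdict(list)
    for row in rows:
        by_learner[row["learner"]].append(row)
    rng = random.Random(seed)
    subset = []
    for learner, group in by_learner.items():
        if len(group) > cap:
            group = rng.sample(group, cap)
        subset.extend(group)
    return subset

# Toy demonstration with a cap of 3 instead of 500,000.
rows = ([{"learner": "ranger", "auc": i} for i in range(10)]
        + [{"learner": "kknn", "auc": i} for i in range(2)])
subset = cap_per_learner(rows, cap=3)
```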
Financial overview and grant giving statistics of Invent-A-Bot Learning Center
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Online Social Networks (OSNs) are used by millions of users daily. This user base shares and discovers differing opinions on popular topics. Large groups can exert social influence, swaying user beliefs or attracting interest in particular news or products: a large number of accounts, gathered into a single group or follower base, increases the probability of influencing OSN users. Botnets, collections of automated accounts controlled by a single agent, are a common mechanism for exerting maximum influence. Botnets may be used to gradually infiltrate the social graph and create an illusion of community behaviour, amplifying their message and increasing persuasion.
This paper investigates Twitter botnets: their behavior, their interaction with user communities, and their evolution over time. We analyze a dense crawl of a subset of Twitter traffic, amounting to nearly all interactions by Greek-speaking Twitter users over a period of 36 months.
Collected accounts are labeled as botnet members based on long-term, frequent content-similarity events. We detect over a million events in which seemingly unrelated accounts tweeted nearly identical content at almost the same time. Filtering these concurrent content-injection events, we identify a set of 1,850 accounts that repeatedly exhibit this pattern of behavior, suggesting that they are fully or partly controlled and orchestrated by the same entity. We find botnets that appear for brief intervals and disappear, as well as botnets that evolve and grow over the full duration of our dataset. We analyze statistical differences between bot accounts and human users, as well as botnet interactions with user communities and Twitter trending topics.
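The core detection signal, near-identical content posted by different accounts at almost the same time, can be sketched by bucketing tweets on normalized text and a coarse time window. This is a deliberate simplification of the paper's method (the similarity measure, window handling, and thresholds here are assumptions):

```python
from collections import defaultdict

def concurrent_injections(tweets, window_s=60, min_accounts=2):
    """tweets: list of (account, unix_time, text) tuples.
    Groups near-identical texts posted within the same time bucket and
    returns the buckets hit by at least `min_accounts` distinct accounts.
    Note: fixed buckets miss pairs straddling a window boundary; the
    real analysis would use a sliding window and fuzzier text matching."""
    buckets = defaultdict(set)
    for account, ts, text in tweets:
        # Normalize: lowercase and collapse whitespace.
        key = (" ".join(text.lower().split()), ts // window_s)
        buckets[key].add(account)
    return {k: accts for k, accts in buckets.items() if len(accts) >= min_accounts}

events = concurrent_injections([
    ("bot1", 1000, "Buy now at spam.example!"),
    ("bot2", 1010, "buy now at  spam.example!"),
    ("user", 5000, "lovely weather today"),
])
```

Accounts that recur across many such events would then be flagged as likely members of the same botnet.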
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
huggingface-projects/bot-fight-data dataset hosted on Hugging Face and contributed by the HF Datasets community
In 2024, most worldwide website traffic is generated by humans, but bot traffic is constantly increasing. Fraudulent traffic from bad bot actors exists at various levels of sophistication. Over the last two years, the number of advanced bad bots exploded, doubling the figure registered in previous years. Simple bad bots also grew, by almost 6 percent compared to the previous year, implying a shrinking share of moderately sophisticated bad bots.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The electronic catalog of the botanical collection at the California Academy of Sciences, San Francisco.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the code and data of experiments in which networks of humans (1,024 subjects in 64 networks) played a public-goods game to which we sometimes added autonomous agents (bots) programmed to use only local knowledge. We show that cooperation can not only be stabilized but even promoted when the bots intervene in the partner selections made by the humans themselves, reshaping social connections locally within a larger group. The "code" directory contains the R code to analyze the experiment data in the "data" directory, at both the group and individual level. The dataset also includes the raw data of each experiment session, in JSON format, in the "data/raw" directory. The subdirectory "exp1" holds the data of the experiment with 6 bot treatments and 1 additional bot-visibility treatment. The subdirectory "exp2" holds the data of the supplementary experiment with a single network-engineering bot having 5 ties.
https://www.verifiedmarketresearch.com/privacy-policy/
Bot Detection And Mitigation Software Market size was valued at USD 20.5 Billion in 2023 and is projected to reach USD 35.2 Billion by 2031, growing at a CAGR of 8.32% during the forecast period 2024-2031.
Global Bot Detection And Mitigation Software Market Drivers
The market drivers for the Bot Detection And Mitigation Software Market can be influenced by various factors. These may include:
Rising Incidence of Cyber Attacks: As the number and sophistication of cyber attacks increase, organizations are more aware of the threats posed by malicious bots. These bots can perpetrate a variety of harmful activities such as data theft, DDoS (Distributed Denial-of-Service) attacks, and fraudulent transactions. Therefore, there is heightened demand for software solutions that can detect and mitigate these bot-related threats.

Expansion of E-commerce and Online Services: The growth of e-commerce platforms and online services has led to an increased volume of online activities that can be targeted by bots. For instance, bots can be used for price scraping, inventory hoarding, and performing fraudulent transactions. To safeguard the integrity and performance of their platforms, businesses invest in bot detection and mitigation solutions.

Increased Adoption of APIs: APIs (Application Programming Interfaces) are increasingly being used to enable interconnectivity between different software services and applications. This widespread use makes them susceptible to bot attacks that can exploit vulnerabilities or abuse API functionality. Consequently, there is a rising need for bot detection solutions specifically designed to protect APIs.

Regulatory Compliance and Data Protection: With stringent regulations like the GDPR (General Data Protection Regulation) and CCPA (California Consumer Privacy Act) around data protection and privacy, companies are required to implement robust security measures to protect user data. Bot detection and mitigation software can help organizations comply with these regulations by preventing unwanted access and data breaches through malicious bots.

Advancements in Machine Learning and AI: Advances in machine learning (ML) and artificial intelligence (AI) have enhanced the capabilities of bot detection solutions. These technologies enable the development of more sophisticated and accurate systems that can identify and adapt to the evolving behaviors of bots. As a result, companies are more inclined to adopt these cutting-edge solutions for better protection.

Growing Concerns Over Ad Fraud: In the digital advertising industry, ad fraud perpetrated by bots is a significant concern. This includes fraudulent clicks, impressions, and conversions generated by bots to deceive advertisers and drain their advertising budgets. To combat this, advertisers and ad networks are increasingly relying on bot detection software to ensure the authenticity of their ad traffic.

Increase in Online Transactions: The surge in online transactions, particularly due to the rise of digital payment methods and mobile banking, has made financial services a primary target for bot attacks. Bots can be used for credential stuffing, account takeover, and transaction fraud. Thus, financial institutions are investing heavily in bot mitigation solutions to secure their online platforms.

Enhanced User Experience: Bots can significantly degrade user experience by slowing down website performance, causing downtime, and making it difficult for legitimate users to access services. Companies aim to maintain a seamless and efficient user experience by implementing bot detection and mitigation solutions to keep their platforms running smoothly.

Increasing Awareness and Education: There is growing awareness among businesses of the potential risks associated with bot activity and the importance of having robust defenses in place. As more organizations understand the impact of bot attacks, they are more likely to invest in comprehensive bot detection and mitigation solutions.

Global Digital Transformation: As businesses and governments around the world undergo digital transformation, securing digital infrastructure becomes paramount. Bots pose a significant threat to these digital ecosystems, necessitating effective bot detection and mitigation measures to protect critical infrastructure and services.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
NF-BoT-IoT is the NetFlow version of the UNSW BoT-IoT dataset. It is one dataset in the NF collection by the University of Queensland, aimed at standardizing network-security datasets to achieve interoperability and enable larger analyses.
All credit goes to the original authors: Dr. Mohanad Sarhan, Dr. Siamak Layeghy, Dr. Nour Moustafa & Dr. Marius Portmann. Please cite their original conference article when using this dataset.
V1: Base dataset in CSV format as downloaded from here
V2: Cleaning -> Parquet files
In the Parquet files, all data types are already set correctly; there are 0 records with missing information and 0 duplicate records.
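A cleaning pass like the one described (deduplicate, drop records with missing information, set data types explicitly before writing Parquet) might look like this with pandas; the column names here are placeholders, not the dataset's actual NetFlow fields:

```python
import io
import pandas as pd

# Toy CSV standing in for the downloaded export; real columns differ.
raw = io.StringIO(
    "src_ip,dst_port,bytes,label\n"
    "10.0.0.1,80,120,Benign\n"
    "10.0.0.1,80,120,Benign\n"  # exact duplicate row
    "10.0.0.2,443,,DDoS\n"      # record with a missing value
)

df = pd.read_csv(raw)
df = df.drop_duplicates().dropna()               # 0 duplicates, 0 missing records
df["dst_port"] = df["dst_port"].astype("int32")  # set dtypes explicitly
# df.to_parquet("nf_bot_iot.parquet")            # final step; requires pyarrow
```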
Statistical analysis of Nesting Bot usage patterns in Jeskai Standard format decks
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Thailand BOT: Number of Job Vacancies data was reported at 29,033.000 Person in Sep 2018. This records an increase from the previous number of 26,027.000 Person for Aug 2018. Thailand BOT: Number of Job Vacancies data is updated monthly, averaging 39,095.000 Person from Jan 1995 (Median) to Sep 2018, with 285 observations. The data reached an all-time high of 115,636.000 Person in Feb 2004 and a record low of 12,620.000 Person in Mar 2007. Thailand BOT: Number of Job Vacancies data remains active status in CEIC and is reported by Bank of Thailand. The data is categorized under Global Database’s Thailand – Table TH.G012: Employment Indicators.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This project examines how to enhance users' exposure to and engagement with verified and ideologically balanced news in an ecologically valid setting. We rely on a large-scale, two-week-long field experiment on 28,457 Twitter users. We created 28 bots utilizing GPT-2 that replied to users tweeting about sports, entertainment, or lifestyle with a contextual reply containing two hardcoded elements: a URL to the topic-relevant section of a quality news organization and an encouragement to follow its Twitter account. Treated users were randomly assigned to receive responses from bots presented as female or male. We examine whether our intervention increases the following of news media organizations, the sharing/liking of news content, and the tweeting/liking of political content. We find that treated users followed more news accounts, and that users in the female-bot treatment were more likely than the control group to like news content.