100+ datasets found

Most targeted industry sectors worldwide targeted by phishing Q4 2024
statista.com
ai-chatbox.pro
Updated Apr 23, 2025
Statista (2025). Most targeted industry sectors worldwide targeted by phishing Q4 2024 [Dataset]. https://www.statista.com/statistics/266161/websites-most-affected-by-phishing/
Dataset updated
Apr 23, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
During the fourth quarter of 2024, nearly 23 percent of phishing attacks worldwide targeted social media. Web-based software services and webmail were targeted by over 23 percent of registered phishing attacks. Furthermore, financial institutions accounted for 12 percent of attacks.
Phishing Statistics By Demographic, Healthcare, Industry And Country (2025)
sci-tech-today.com
Updated Jun 24, 2025
Sci-Tech Today (2025). Phishing Statistics By Demographic, Healthcare, Industry And Country (2025) [Dataset]. https://www.sci-tech-today.com/stats/phishing-statistics-updated/
Dataset updated
Jun 24, 2025
Dataset authored and provided by
Sci-Tech Today
License
https://www.sci-tech-today.com/privacy-policy
Time period covered
2022 - 2032
Area covered
Global
Description
Introduction

Phishing Statistics: Phishing is a kind of cyberattack in which criminals try to fool people into sharing personal information such as passwords or credit card numbers, often by pretending to be a trusted company or person through fake emails, websites, or messages. Phishing has become more common as many people use the Internet for banking, shopping, and communication.

In 2024, phishing attacks are a major threat to both individuals and businesses. Criminals are using more advanced techniques, and these attacks are costing billions of dollars globally. People need to stay aware and cautious online to avoid falling victim to these scams.
U.S. number of BEC victims 2020-2023
statista.com
ai-chatbox.pro
Updated Sep 24, 2024
+ more versions
Ani Petrosyan (2024). U.S. number of BEC victims 2020-2023 [Dataset]. https://www.statista.com/topics/8385/phishing/
Dataset updated
Sep 24, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Ani Petrosyan
Description
In 2023, 21,489 individuals in the United States reported encountering business e-mail compromise (BEC) scams. This figure is slightly higher than the 19,954 victims reported in 2021 and slightly lower than the 21,832 reported in 2022.
U.S. number of phishing victims 2018-2024
statista.com
Updated Jul 4, 2025
+ more versions
Statista (2024). U.S. number of phishing victims 2018-2023 [Dataset]. https://www.statista.com/statistics/1390362/phishing-victim-number-us/
Dataset updated
Jul 4, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
In 2024, over 193,000 individuals in the United States reported encountering phishing attacks. This was a decrease from the previous year, when the number amounted to nearly 300,000. In 2020 and 2019, the number was around 241,000 and 114,000, respectively.
Phishing Websites Dataset
kaggle.com
zip
Updated Mar 23, 2024
+ more versions
Arnav Samal (2024). Phishing Websites Dataset [Dataset]. https://www.kaggle.com/datasets/arnavs19/phishing-websites-dataset
Available download formats: zip (0 bytes)
Dataset updated
Mar 23, 2024
Authors
Arnav Samal
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These data consist of a collection of legitimate and phishing website instances. Each website is represented by a set of features that denote whether the website is legitimate or not. The data can serve as input for a machine learning process.

Here, the two variants of the Phishing Dataset are presented.

Full variant - dataset_full.csv

Total number of instances: 88,647

Number of legitimate website instances (labeled as 0): 58,000

Number of phishing website instances (labeled as 1): 30,647

Total number of features: 111

Small variant - dataset_small.csv

Total number of instances: 58,645

Number of legitimate website instances (labeled as 0): 27,998

Number of phishing website instances (labeled as 1): 30,647

Total number of features: 111
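
The class counts listed above imply rather different balances for the two variants. A minimal sketch, using only the numbers quoted in the description (the helper name `class_shares` is illustrative and not part of the dataset), makes the proportions explicit:

```python
def class_shares(legitimate: int, phishing: int) -> dict:
    """Return the total and the per-class share for a two-class dataset."""
    total = legitimate + phishing
    return {
        "total": total,
        "legitimate_share": legitimate / total,
        "phishing_share": phishing / total,
    }

# Counts copied from the dataset description above.
full = class_shares(legitimate=58_000, phishing=30_647)
small = class_shares(legitimate=27_998, phishing=30_647)

print(full["total"], round(full["phishing_share"], 3))
print(small["total"], round(small["phishing_share"], 3))
```

On these figures the full variant is roughly one-third phishing, while the small variant is close to balanced, so a stratified split or class weighting may be worth considering for the full variant.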
Phishing attacks – who is most at risk?
gov.uk
s3.amazonaws.com
Updated Sep 26, 2022
Office for National Statistics (2022). Phishing attacks – who is most at risk? [Dataset]. https://www.gov.uk/government/statistics/phishing-attacks-who-is-most-at-risk
Dataset updated
Sep 26, 2022
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Office for National Statistics
Description
Official statistics are produced impartially and free from political influence.
Phishing Website Dataset
data.niaid.nih.gov
zenodo.org
Updated Jul 3, 2023
Putra, I Kadek Agus Ariesta (2023). Phishing Website Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8041386
Dataset updated
Jul 3, 2023
Dataset authored and provided by
Putra, I Kadek Agus Ariesta
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains a collection of legitimate and phishing websites, along with information on the target brands (brands.csv) being impersonated in the phishing attacks. The dataset includes a total of 10,395 websites, 5,244 of which are legitimate and 5,151 of which are phishing websites. These websites impersonate a total of 86 different target brands.

For phishing datasets, the files can be downloaded in a zip file with a "phishing" prefix, while for legitimate websites, the files can be downloaded in a zip file with a "not-phishing" prefix.

In addition, the dataset includes features such as screenshots, text, CSS, and HTML structure for each website, as well as domain information (WHOIS data), IP information, and SSL information. Each website is labeled as either legitimate or phishing and includes additional metadata such as the date it was discovered, the target brand being impersonated, and any other relevant information.

The dataset has been curated for research purposes and can be used to analyze the effectiveness of phishing attacks, develop and evaluate anti-phishing solutions, and identify trends and patterns in phishing attacks. It is hoped that this dataset will contribute to the advancement of research in the field of cybersecurity and help improve our understanding of phishing attacks.
data-phishing-detection
huggingface.co
Updated Oct 23, 2024
Reva (2024). data-phishing-detection [Dataset]. https://huggingface.co/datasets/RevaHQ/data-phishing-detection
Available format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 23, 2024
Dataset authored and provided by
Reva
Description
data-phishing-detection

A dataset to test methods to detect phishing emails. The file data.parquet contains the dataset: 400 emails, of which 200 are synthetic phishing attempts and 200 are synthetic regular emails.

Schema

input - an email, synthesized by an LLM, that is either a phishing attempt or a regular email.
output - 'Yes' if the email is a phishing attempt, 'No' otherwise.
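
Given the `input`/`output` schema, a detector can be scored by comparing its verdict with the 'Yes'/'No' label. The following is a rough sketch only: the sample rows and the `keyword_baseline` heuristic are hypothetical stand-ins, and the real rows would come from data.parquet.

```python
from collections import Counter

def score(records, predict):
    """Compare predict(email) with the 'Yes'/'No' label of each record."""
    counts = Counter()
    for rec in records:
        actual = rec["output"] == "Yes"
        guessed = predict(rec["input"])
        counts[(actual, guessed)] += 1
    tp, fp = counts[(True, True)], counts[(False, True)]
    fn, tn = counts[(True, False)], counts[(False, False)]
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical rows that mimic the schema; not taken from the dataset.
sample = [
    {"input": "Urgent: verify your password at this link", "output": "Yes"},
    {"input": "Agenda for Thursday's team meeting", "output": "No"},
    {"input": "Your invoice is attached, click to verify account", "output": "Yes"},
    {"input": "Lunch on Friday?", "output": "No"},
]

def keyword_baseline(email: str) -> bool:
    # Toy heuristic for illustration only.
    return "verify" in email.lower()

result = score(sample, keyword_baseline)
print(result)
```

Because the dataset is described as evenly split (200 phishing, 200 regular), accuracy is a reasonable headline metric here; precision and recall are included because they remain meaningful if a subset is used.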

Prompt

The prompt.md file contains a prompt that can be used with an LLM as a starting… See the full description on the dataset page: https://huggingface.co/datasets/RevaHQ/data-phishing-detection.
Textual Data of Phishing Scams Targeting Academia
openicpsr.org
delimited
Updated Apr 30, 2024
Ethan Morrow (2024). Textual Data of Phishing Scams Targeting Academia [Dataset]. http://doi.org/10.3886/E201721V1
Available download formats: delimited
Unique identifier
https://doi.org/10.3886/E201721V1
Dataset updated
Apr 30, 2024
Dataset provided by
University of Illinois at Urbana-Champaign
Authors
Ethan Morrow
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
A partial dataset and document-term matrix of phishing emails targeting an institution of higher education and an associated script used for data analysis.
Email Phishing Dataset
cubig.ai
Updated May 28, 2025
CUBIG (2025). Email Phishing Dataset [Dataset]. https://cubig.ai/store/products/384/email-phishing-dataset
Dataset updated
May 28, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Email Phishing Dataset is designed for phishing email detection using machine learning.

2) Data Utilization
(1) The Email Phishing Dataset has the following characteristics:
• All emails were refined and subjected to a custom NLP feature extraction pipeline focused on phishing metrics.
• This dataset contains no raw text or headers, only features engineered for model training/testing.
(2) The Email Phishing Dataset can be used for:
• Developing an email detection model: It can be used to train and evaluate AI models that classify normal mail and phishing mail using various characteristics such as email body, subject, and sender.
• E-mail security policy and threat analysis research: Analyzing real phishing cases and normal email data to derive the characteristics of phishing attacks, and using them to establish effective email security policies and develop threat response strategies.
Fraudulent Bank Websites, Phishing E-mails and Similar Scams | DATA.GOV.HK
data.gov.hk
Updated Oct 26, 2018
data.gov.hk (2018). Fraudulent Bank Websites, Phishing E-mails and Similar Scams | DATA.GOV.HK [Dataset]. https://data.gov.hk/en-data/dataset/hk-hkma-banksvf-fraudulent-bank-scams
Dataset updated
Oct 26, 2018
Dataset provided by
data.gov.hk
Description
This API provides information on press releases issued by authorized institutions, and on similar press releases previously issued by the HKMA, regarding fraudulent bank websites, phishing e-mails, and similar scams.
Web page phishing detection
data.mendeley.com
Updated Jun 25, 2021
+ more versions
Abdelhakim Hannousse (2021). Web page phishing detection [Dataset]. http://doi.org/10.17632/c2gw7fy2j4.3
Unique identifier
https://doi.org/10.17632/c2gw7fy2j4.3
Dataset updated
Jun 25, 2021
Authors
Abdelhakim Hannousse
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The provided dataset includes 11,430 URLs with 87 extracted features. The dataset is designed to be used as a benchmark for machine-learning-based phishing detection systems. Features are from three different classes: 56 extracted from the structure and syntax of URLs, 24 extracted from the content of their corresponding pages, and 7 extracted by querying external services. The dataset is balanced: it contains exactly 50% phishing and 50% legitimate URLs. Along with the dataset, we provide the Python scripts used for feature extraction, for potential replication or extension. The datasets were constructed in May 2020.

dataset_A: contains a list of URLs together with their DOM tree objects, which can be used for replication and for experimenting with new URL- and content-based features, overcoming the short lifetime of phishing web pages.

dataset_B: contains the extracted feature values, which can be used directly as input to classifiers for examination. Note that the data in this dataset are indexed by URL, so the index needs to be removed before experimentation.
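
The note about removing the URL index before experimentation can be sketched as follows. This is an illustration under assumptions: the column names `url`, `length_url`, `nb_dots` and `status`, and the two sample rows, are hypothetical and should be checked against the actual file.

```python
import csv
import io

# Hypothetical excerpt; column names and values are assumptions for illustration.
SAMPLE = """url,length_url,nb_dots,status
http://example.com/login,24,1,legitimate
http://example.test/secure-update,33,1,phishing
"""

def split_features(csv_text, index_col="url", label_col="status"):
    """Separate the URL index and the label from the numeric feature columns."""
    urls, features, labels = [], [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        urls.append(row.pop(index_col))      # keep the index aside, out of the features
        labels.append(row.pop(label_col))    # target column
        features.append({name: float(value) for name, value in row.items()})
    return urls, features, labels

urls, X, y = split_features(SAMPLE)
print(X[0], y[0])
```

Keeping the URLs in a separate list, rather than discarding them, makes it possible to trace a misclassified row back to its page later.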
Data from: Phishing Detection Dataset
kaggle.com
Updated Apr 12, 2021
Verugopal Iyyer (2021). Phishing Detection Dataset [Dataset]. https://www.kaggle.com/verugopaliyyer/phishing-detection-dataset/code
Available format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 12, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Verugopal Iyyer
Description
Dataset 1 contains participants' age, qualification level, awareness of phishing, and whether they became victims of phishing. It also contains the detection-rate results before the phishing awareness briefing, following a successful spear-phishing attack.

Dataset 2 contains participants' age, qualification level, awareness of phishing, and whether they became victims of phishing. It also contains the detection-rate results after the phishing awareness briefing, following a successful smishing attack.
Phishing URL Classifier Dataset
opendatabay.com
Updated Jul 3, 2025
Datasimple (2025). Phishing URL Classifier Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/705b35a9-e638-462d-a5e1-d9f70ff4234a
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Website Analytics & User Experience
Description
This dataset is a curated collection of over 800,000 URLs, designed to represent a variety of online domains. Approximately 52% of these domains are identified as legitimate entities, while the remaining 47% are categorised as phishing domains, indicating potential online threats. The dataset consists of two key columns: "url" and "status". The "status" column uses binary encoding, where 0 signifies phishing domains and 1 indicates legitimate domains. This balanced distribution between phishing and legitimate instances helps ensure the dataset's robustness for analysis and model development.

Columns

url: This field contains the Uniform Resource Locators (URLs) for each domain, including both legitimate and phishing entries.

status: This field denotes the classification of the URL. A value of 0 represents a phishing domain, indicating a potential risk, while a value of 1 signifies a legitimate domain, offering assurance.

Distribution

The dataset is provided in a CSV file format. It contains 808,042 unique entries. The distribution of statuses is approximately 394,982 entries flagged as phishing (0) and 427,028 entries flagged as legitimate (1). This offers an almost equal balance across the two categories.
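
Note that this dataset encodes phishing as 0 and legitimate as 1, the reverse of the common convention in which the positive class is 1. The per-class figures quoted above also sum to 822,010 rather than 808,042, so counting the labels directly after loading seems a sensible check. A small sketch of both steps, with hypothetical rows standing in for the CSV's "url" and "status" columns:

```python
from collections import Counter

PHISHING, LEGITIMATE = 0, 1  # encoding stated in the dataset description

def to_positive_phishing(status: int) -> int:
    """Remap so that phishing becomes the positive class (1)."""
    if status not in (PHISHING, LEGITIMATE):
        raise ValueError(f"unexpected status: {status!r}")
    return 1 if status == PHISHING else 0

# Hypothetical rows; the real data would be read from the CSV file.
rows = [
    ("http://example.com/", 1),
    ("http://example.test/account-verify", 0),
    ("http://example.org/docs", 1),
]

counts = Counter(status for _, status in rows)
labels = [to_positive_phishing(status) for _, status in rows]
print(counts[PHISHING], counts[LEGITIMATE], labels)
```

Remapping is optional, but it avoids misreading precision and recall when a library treats 1 as the positive label by default.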

Usage

This dataset is ideal for applications aimed at understanding, combating, and mitigating online threats. It can be used for developing models related to phishing detection, binary classification, and website analytics. It is also suitable for data cleaning exercises and projects involving Natural Language Processing (NLP) and Deep Learning.

Coverage

The data collection for this dataset is global in scope. While a specific time range for data collection is not provided, the dataset was listed on 05/06/2025.

License

CC0

Who Can Use It

This dataset is particularly valuable for researchers and practitioners working in the fields of AI and Machine Learning. Intended users include those looking to: * Develop and train models for identifying malicious URLs. * Analyse patterns distinguishing legitimate websites from phishing attempts. * Enhance cybersecurity measures and protect users from online threats.

Dataset Name Suggestions

URL Phishing Detection

Legitimate and Malicious URLs

Online Threat URL Dataset

Phishing URL Classifier Data

Web Security URL Collection

Attributes

Original Data Source: Phishing and Legitimate URLS
Spear Phishing Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jun 6, 2025
Spear Phishing Report [Dataset]. https://www.datainsightsmarket.com/reports/spear-phishing-1951598
Available download formats: doc, ppt, pdf
Dataset updated
Jun 6, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The spear phishing market is experiencing robust growth, driven by the increasing sophistication of cyberattacks and the expanding digital landscape. While precise market sizing data is unavailable, considering the substantial investments in cybersecurity and the consistent rise in reported phishing incidents, a reasonable estimate for the 2025 market size would be in the range of $5-7 billion. This figure reflects the rising costs associated with data breaches, regulatory fines, and the increasing demand for advanced threat detection and response solutions. A Compound Annual Growth Rate (CAGR) of 12-15% over the forecast period (2025-2033) is plausible, considering ongoing technological advancements in spear phishing techniques and the corresponding need for robust countermeasures. Key drivers include the growth of remote work, increasing reliance on cloud services, and the evolving tactics employed by cybercriminals to target specific individuals and organizations. Trends point towards a greater focus on artificial intelligence (AI) and machine learning (ML) in threat detection, as well as a shift towards proactive security measures and employee training programs to mitigate the impact of spear phishing attacks. However, restraints include the ever-evolving nature of spear phishing techniques, the persistent skills gap in cybersecurity professionals, and the potential for false positives in automated detection systems. Segmentation within the market is likely to exist based on solution type (e.g., email security, security awareness training), deployment model (cloud, on-premises), and target industry (financial services, healthcare, government). Companies like BAE Systems, Check Point Software Technologies, Cisco Systems, and Proofpoint are key players actively innovating and competing within this dynamic market. The significant market expansion is further fueled by the high financial stakes involved in successful spear phishing campaigns. 
The impact of successful attacks, including data breaches, financial losses, and reputational damage, encourages organizations to invest heavily in comprehensive security solutions. The proliferation of sophisticated spear phishing techniques, such as personalized phishing emails and the use of social engineering, necessitates advanced detection and prevention technologies. The market's competitive landscape is characterized by both established cybersecurity vendors and emerging players who are constantly developing new solutions to combat the threat of spear phishing. The competitive dynamics will likely lead to further innovation and drive market growth in the coming years, enhancing the overall sophistication of spear phishing detection and prevention solutions.
Data set of "Falling and failing (to learn)"
datarepository.eur.nl
pdf
Updated Jul 16, 2025
Aurelien Baillon; Francesco Capozza; David Gonzalez-Jimenez (2025). Data set of "Falling and failing (to learn)" [Dataset]. https://datarepository.eur.nl/articles/dataset/Data_set_of_Falling_and_failing_to_learn_/28123376
Available download formats: pdf
Dataset updated
Jul 16, 2025
Dataset provided by
Erasmus University Rotterdam (EUR)
Authors
Aurelien Baillon; Francesco Capozza; David Gonzalez-Jimenez
License
http://rightsstatements.org/vocab/InC/1.0/
Description
Data set of "Falling and failing (to learn): Evidence from a Nation-Wide Cybersecurity Field Experiment with SMEs". Accepted for publication in the Journal of Economic Behavior and Organization.

Abstract: Prior experiences are crucial in shaping risk prevention behavior. Previous studies have shown that experiencing a simulated phishing attack (a "phishing drill") reduces the likelihood of clicking on unsafe links and disclosing one's password. In a large field experiment involving 670 small and medium-sized enterprises (SMEs) and their 33,000 employees, we examined the impact of experience on individuals' ability to detect cyber-security threats, and whether this effect persisted over several months. We collected data at both the company and individual levels, including risk preference, time preference, and trust. Our findings indicate only a non-systematic, short-term effect of previous phishing emails on clicking behavior. A cluster of individuals with greater patience, trust, and risk seeking was more likely to click on phishing links in the first place but then also more likely to benefit from phishing drills.
Phishing Protection Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Dataintelo (2025). Phishing Protection Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-phishing-protection-market
Available download formats: pdf, csv, pptx
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Phishing Protection Market Outlook

The global phishing protection market size was valued at approximately USD 900 million in 2023 and is projected to reach USD 2.4 billion by 2032, growing at a compound annual growth rate (CAGR) of 11.5% from 2024 to 2032. The growth of this market is fueled by the escalating volume and sophistication of phishing attacks, coupled with increasing awareness about cybersecurity among organizations across various industries.
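
The quoted growth rate can be checked against the two market-size figures. The sketch below assumes the standard compound annual growth rate formula and nine annual compounding steps from the 2023 base value to 2032; the function name `cagr` is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1

# USD 900 million (2023) to USD 2.4 billion (2032): nine annual steps.
rate = cagr(900e6, 2.4e9, years=2032 - 2023)
print(f"{rate:.1%}")
```

Under that assumption the result agrees with the 11.5% stated in the report; a different base year or period count would give a different rate.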

One of the significant growth factors driving the phishing protection market is the increasing number of cyberattacks targeting both individuals and organizations. Phishing attacks have become more sophisticated, making it crucial for businesses to invest in advanced protection measures. The rise in spear-phishing, where attackers target specific individuals within an organization, has heightened the need for robust phishing protection solutions. Moreover, the financial and reputational damage caused by successful phishing attacks is pushing organizations to adopt comprehensive security solutions, thereby driving market growth.

Another critical factor contributing to the market's expansion is the growing regulatory landscape around data protection and cybersecurity. Governments and regulatory bodies across the globe are implementing stringent regulations to ensure data security and protect consumer information. Compliance with regulations such as GDPR in Europe, CCPA in California, and other data protection laws worldwide necessitates the deployment of advanced phishing protection solutions. Organizations must adhere to these regulations to avoid hefty fines and legal repercussions, further propelling the adoption of phishing protection services and software.

The increasing adoption of digital transformation strategies by enterprises is also a significant driver of market growth. As businesses migrate their operations to cloud platforms and adopt new technologies, they become more vulnerable to cyber threats, including phishing attacks. The shift towards remote work and the integration of Bring Your Own Device (BYOD) policies have expanded the attack surface for cybercriminals. Consequently, organizations are prioritizing investments in phishing protection solutions to safeguard their digital assets and maintain business continuity in a highly digitized environment.

In addition to phishing attacks, organizations are increasingly facing threats from credential stuffing, a type of cyberattack where attackers use automated tools to try multiple username and password combinations to gain unauthorized access to user accounts. This has led to a growing demand for Credential Stuffing Protection solutions, which are designed to detect and block such attempts. These solutions often employ advanced techniques such as behavioral analytics and machine learning to identify suspicious login activities and prevent unauthorized access. As businesses continue to digitize their operations and store sensitive data online, the need for robust Credential Stuffing Protection measures becomes even more critical. By implementing these solutions, organizations can safeguard their user accounts and maintain trust with their customers.

Regionally, North America is anticipated to dominate the phishing protection market during the forecast period, owing to the high incidence of cyberattacks and the presence of leading cybersecurity companies. Europe is also expected to witness significant growth, driven by stringent data protection regulations and increasing cyber threats. The Asia Pacific region is projected to exhibit the highest CAGR, fueled by rapid digitalization, increasing internet penetration, and growing awareness about cybersecurity threats. Latin America, the Middle East, and Africa are also expected to contribute to the market's growth, albeit at a slower pace compared to other regions.

Component Analysis

The phishing protection market is segmented by components into software and services. The software segment is expected to hold a significant share of the market, as organizations increasingly rely on advanced software solutions to detect and prevent phishing attacks. Software solutions typically include email filtering, URL filtering, and anti-phishing tools that help identify and block malicious content. Moreover, the continuous advancements in machine learning and artificial intelligence are enhancing the capabilities of phishing protection software, making them more effective in ide
Phishing: distribution of attacks 2023, by region
statista.com
ai-chatbox.pro
Updated Jun 23, 2025
Statista (2025). Phishing: distribution of attacks 2023, by region [Dataset]. https://www.statista.com/statistics/266362/phishing-attacks-country/
Dataset updated
Jun 23, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Worldwide
Description
In 2023, users in Vietnam were most frequently targeted by phishing attacks. The phishing attack rate among internet users in the country was ***** percent. In the examined year, Peru was the second region, with an attack rate of nearly ** percent, while Taiwan followed with ***** percent.
Phishing website dataset
data.niaid.nih.gov
Updated Jun 10, 2021
van Dooremaal, Bram (2021). Phishing website dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4922597
Dataset updated
Jun 10, 2021
Dataset provided by
van Dooremaal, Bram
Allodi, Luca
Burda, Pavlo
Zannone, Nicola
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset comprises phishing and legitimate web pages, which have been used for experiments on early phishing detection.

Detailed information on the dataset and data collection is available at

Bram van Dooremaal, Pavlo Burda, Luca Allodi, and Nicola Zannone. 2021. Combining Text and Visual Features to Improve the Identification of Cloned Webpages for Early Phishing Detection. In ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security. ACM.
PhiUSIIL Phishing URL Dataset
data.mendeley.com
Updated Nov 15, 2023
+ more versions
Arvind Prasad (2023). PhiUSIIL Phishing URL Dataset [Dataset]. http://doi.org/10.17632/shwpxscxy2.2
Unique identifier
https://doi.org/10.17632/shwpxscxy2.2
Dataset updated
Nov 15, 2023
Authors
Arvind Prasad
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PhiUSIIL Phishing URL Dataset is a substantial dataset comprising 134,850 legitimate and 100,945 phishing URLs. Most of the URLs we analyzed while constructing the dataset are the latest URLs. Features are extracted from the source code of the webpage and URL. Features such as CharContinuationRate, URLTitleMatchScore, URLCharProb, and TLDLegitimateProb are derived from existing features.

Citation: Prasad, A., & Chandra, S. (2023). PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Computers & Security, 103545. doi: https://doi.org/10.1016/j.cose.2023.103545

Most targeted industry sectors worldwide targeted by phishing Q4 2024

15 scholarly articles cite this dataset (View in Google Scholar)
