100+ datasets found

Phishing website dataset
data.niaid.nih.gov
zenodo.org
Updated Jun 10, 2021
Cite
Burda, Pavlo (2021). Phishing website dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4922597
Explore at:
Dataset updated
Jun 10, 2021
Dataset provided by
Zannone, Nicola
van Dooremaal, Bram
Burda, Pavlo
Allodi, Luca
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset comprises phishing and legitimate web pages, which have been used for experiments on early phishing detection.

Detailed information on the dataset and data collection is available at

Bram van Dooremaal, Pavlo Burda, Luca Allodi, and Nicola Zannone. 2021. Combining Text and Visual Features to Improve the Identification of Cloned Webpages for Early Phishing Detection. In ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security. ACM.
Number of global phishing sites Q3 2013- Q3 2024
statista.com
Updated Dec 9, 2024
Cite
Statista (2024). Number of global phishing sites Q3 2013- Q3 2024 [Dataset]. https://www.statista.com/statistics/266155/number-of-phishing-domain-names-worldwide/
Explore at:
Dataset updated
Dec 9, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In the third quarter of 2024, over 932 thousand unique phishing sites were detected worldwide, a slight increase from the preceding quarter. The most significant jump in the number of unique phishing sites occurred between the second and third quarters of 2020, from nearly 147 thousand to approximately 572 thousand. This figure is based on the number of unique base URLs of the phishing sites.
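To put the 2020 jump in perspective, the following is a minimal sketch of the arithmetic, using only the two rounded figures quoted in the description. The variable names are illustrative, and because the inputs are approximations ("nearly" and "approximately"), the results are rough as well.

```python
# Rounded figures quoted in the description (approximate, not exact counts).
q2_2020_sites = 147_000  # "nearly 147 thousand"
q3_2020_sites = 572_000  # "approximately 572 thousand"

# Ratio of the later quarter to the earlier one.
growth_factor = q3_2020_sites / q2_2020_sites

# Relative increase, expressed in percent.
percent_increase = (q3_2020_sites - q2_2020_sites) / q2_2020_sites * 100

print(f"growth factor: {growth_factor:.2f}x")
print(f"increase: {percent_increase:.0f}%")
```

On these rounded inputs the count roughly quadrupled within a single quarter.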
Phishing dataset
figshare.com
txt
Updated Jun 5, 2023
Cite
Toluwase Olowookere; Adebayo Olusola Adetunmbi; John Ojo Ajayi (2023). Phishing dataset [Dataset]. http://doi.org/10.6084/m9.figshare.16680874.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.16680874.v1
Dataset updated
Jun 5, 2023
Dataset provided by
figshare
Authors
Toluwase Olowookere; Adebayo Olusola Adetunmbi; John Ojo Ajayi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was obtained from the UCI Machine Learning Repository in 2019. It consists of 11055 instances with 31 attributes and contains no missing values. The dataset has two class labels: phishing is -1 and non-phishing is 1. Of the 11055 instances, 4898 belong to the phishing class and 6157 to the non-phishing class.
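The class counts and the -1/1 label encoding stated above can be checked with a short sketch. The counts come from the description; the constant names and the `to_binary` helper are hypothetical, and the remapping to 1/0 is only one common convention, not something the dataset prescribes.

```python
# Label encoding and class counts as stated in the dataset description.
PHISHING_LABEL = -1
NON_PHISHING_LABEL = 1
class_counts = {PHISHING_LABEL: 4898, NON_PHISHING_LABEL: 6157}

# The two classes should add up to the stated number of instances.
total = sum(class_counts.values())

# Share of each class, useful for judging how imbalanced the data is.
class_shares = {label: n / total for label, n in class_counts.items()}


def to_binary(label: int) -> int:
    """Map the -1/1 encoding to 1 (phishing) / 0 (non-phishing).

    Hypothetical helper: many classifiers and metrics expect 0/1 targets.
    """
    if label not in class_counts:
        raise ValueError(f"unexpected label: {label}")
    return 1 if label == PHISHING_LABEL else 0


print(total)
print({label: round(share, 3) for label, share in class_shares.items()})
```

The split is about 44% phishing to 56% non-phishing, so the data is mildly imbalanced rather than evenly split.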
Phishing most targeted industry sectors worldwide Q3 2024
statista.com
Updated Dec 9, 2024
Cite
Statista (2024). Phishing most targeted industry sectors worldwide Q3 2024 [Dataset]. https://www.statista.com/statistics/266161/websites-most-affected-by-phishing/
Explore at:
Dataset updated
Dec 9, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
During the third quarter of 2024, 30.5 percent of phishing attacks worldwide targeted social media. Web-based software services and webmail followed, with around 21.2 percent of registered phishing attacks. Financial institutions accounted for 13 percent of attacks.
Web page phishing detection
data.mendeley.com
narcis.nl
Updated Sep 26, 2020
+ more versions
Cite
Abdelhakim Hannousse (2020). Web page phishing detection [Dataset]. http://doi.org/10.17632/c2gw7fy2j4.1
Explore at:
Unique identifier
https://doi.org/10.17632/c2gw7fy2j4.1
Dataset updated
Sep 26, 2020
Authors
Abdelhakim Hannousse
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The provided dataset includes 11430 URLs with 87 extracted features. The dataset is designed to be used as a benchmark for machine-learning-based phishing detection systems. Features are from three different classes: 56 extracted from the structure and syntax of URLs, 24 extracted from the content of their corresponding pages, and 7 extracted by querying external services. The dataset is balanced: it contains exactly 50% phishing and 50% legitimate URLs. Along with the dataset, we provide the Python scripts used for feature extraction, for potential replication or extension.

dataset_A: contains a list of URLs together with their DOM tree objects, which can be used for replication and for experimenting with new URL and content-based features, overcoming the short lifetime of phishing web pages.

dataset_B: contains the extracted feature values, which can be used directly as input to classifiers for examination. Note that the data in this dataset are indexed by URL, so the index needs to be removed before experimentation.

The datasets were constructed in May 2020. Due to the huge size of dataset A, only a sample of the dataset is provided. I will try to divide it into sample files and upload them one by one; for a full copy, please contact the author directly at any time at: hannousse.abdelhakim@univ-guelma.dz
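The note about removing the URL index from dataset_B before experimentation could look roughly like the sketch below. It uses a tiny made-up CSV sample rather than the real file, and the column names (`url`, `length_url`, `nb_dots`, `status`) and the `load_features` helper are assumptions for illustration, not confirmed by the description.

```python
import csv
import io

# Feature counts per class as stated in the description.
FEATURE_GROUPS = {"url_structure": 56, "page_content": 24, "external_services": 7}
TOTAL_FEATURES = sum(FEATURE_GROUPS.values())

# Made-up two-row sample standing in for dataset_B (column names are assumed).
SAMPLE_CSV = """url,length_url,nb_dots,status
http://example.com/login,24,1,legitimate
http://example.test/verify/account,34,1,phishing
"""


def load_features(handle, index_col="url", label_col="status"):
    """Read rows, drop the URL index, and split features from labels."""
    features, labels = [], []
    for row in csv.DictReader(handle):
        row.pop(index_col)                # the URL is an identifier, not a feature
        labels.append(row.pop(label_col))
        features.append([float(value) for value in row.values()])
    return features, labels


X, y = load_features(io.StringIO(SAMPLE_CSV))
print(TOTAL_FEATURES, X, y)
```

Dropping the URL column keeps the classifier from seeing the raw URL string as an input. With the real file, the same function would presumably be called on an opened file handle instead of the in-memory sample.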
‘Phishing Dataset for Machine Learning’ analyzed by Analyst-2
analyst-2.ai
Updated Nov 5, 2019
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Phishing Dataset for Machine Learning’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-phishing-dataset-for-machine-learning-2690/f1656d17/?iid=000-751&v=presentation
Explore at:
Dataset updated
Nov 5, 2019
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Phishing Dataset for Machine Learning’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shashwatwork/phishing-dataset-for-machine-learning on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

Anti-phishing refers to efforts to block phishing attacks. Phishing is a kind of cybercrime where attackers pose as known or trusted entities and contact individuals through email, text or telephone and ask them to share sensitive information. Typically, in a phishing email attack, the message will suggest that there is a problem with an invoice, that there has been suspicious activity on an account, or that the user must log in to verify an account or password. Users may also be prompted to enter credit card information or bank account details as well as other sensitive data. Once this information is collected, attackers may use it to access accounts, steal data and identities, and download malware onto the user’s computer.

Content

This dataset contains 48 features extracted from 5000 phishing webpages and 5000 legitimate webpages, which were downloaded from January to May 2015 and from May to June 2017. An improved feature extraction technique is employed by leveraging the browser automation framework (i.e., Selenium WebDriver), which is more precise and robust compared to the parsing approach based on regular expressions.

Anti-phishing researchers and experts may find this dataset useful for phishing features analysis, conducting rapid proof of concept experiments or benchmarking phishing classification models.

Acknowledgements

Tan, Choon Lin (2018), “Phishing Dataset for Machine Learning: Feature Evaluation”, Mendeley Data, V1, doi: 10.17632/h3cgnj8hft.1 Source of the Dataset.

--- Original source retains full ownership of the source dataset ---
dataset-phishing
huggingface.co
Updated May 1, 2024
+ more versions
Cite
dataset-phishing [Dataset]. https://huggingface.co/datasets/itsprofarul/dataset-phishing
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 1, 2024
Authors
Arul Kumar Natarajan
License
https://choosealicense.com/licenses/gemma/
Description
Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Dataset Details Dataset Description

Curated by: [More Information Needed] Funded by [optional]: [More Information Needed] Shared by [optional]: [More Information Needed] Language(s) (NLP): [More Information Needed] License: [More Information Needed]

Dataset Sources [optional]… See the full description on the dataset page: https://huggingface.co/datasets/itsprofarul/dataset-phishing.
Global number of e-mail phishing attacks 2022-2023
statista.com
Updated Sep 23, 2024
Cite
Statista (2024). Global number of e-mail phishing attacks 2022-2023 [Dataset]. https://www.statista.com/statistics/1493550/phishing-attacks-global-number/
Explore at:
Dataset updated
Sep 23, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2022 - Dec 2023
Area covered
Worldwide
Description
In December 2023, around 9.45 million phishing e-mails were detected worldwide, up from 5.59 million in September 2023. This figure has seen a continuous increase since January 2022. It is partially associated with the launch of ChatGPT in November 2022.
Legitimate and phishing website dataset
kaggle.com
Updated Apr 17, 2022
Cite
Kunal Raut (2022). Legitimate and phishing website dataset [Dataset]. https://www.kaggle.com/datasets/kunalraut21/legitimate-and-phishing-website-dataset/data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 17, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kunal Raut
Description
Dataset

This dataset was created by Kunal Raut

Contents
phishing-dataset-part-1
kaggle.com
zip
Updated Oct 5, 2024
Cite
Ashish Goraniya (2024). phishing-dataset-part-1 [Dataset]. https://www.kaggle.com/datasets/ashishgoraniya/phishing-dataset-part-1/suggestions
Explore at:
Available download formats: zip (831772 bytes)
Dataset updated
Oct 5, 2024
Authors
Ashish Goraniya
Description
Dataset

This dataset was created by Ashish Goraniya

Contents
U.S. number of phishing victims 2018-2023
statista.com
Updated Apr 15, 2024
Cite
Statista (2024). U.S. number of phishing victims 2018-2023 [Dataset]. https://www.statista.com/statistics/1390362/phishing-victim-number-us/
Explore at:
Dataset updated
Apr 15, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
In 2023, over 298 thousand individuals in the United States reported encountering phishing attacks. This figure had decreased by 0.5 percent compared to the previous year, when the number of phishing attacks nationwide amounted to over 300 thousand. However, in 2020 and 2019, this number was relatively low, around 241 thousand and 114 thousand, respectively.
phishing
huggingface.co
Updated Jun 29, 2024
Cite
Safehouse Tech (2024). phishing [Dataset]. https://huggingface.co/datasets/safehousetech/phishing
Explore at:
Dataset updated
Jun 29, 2024
Dataset provided by
SafeHouse Technologies Limited
Authors
Safehouse Tech
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
safehousetech/phishing dataset hosted on Hugging Face and contributed by the HF Datasets community
Phishing: distribution of attacks 2023, by region
statista.com
Updated Sep 13, 2024
Cite
Statista (2024). Phishing: distribution of attacks 2023, by region [Dataset]. https://www.statista.com/statistics/266362/phishing-attacks-country/
Explore at:
Dataset updated
Sep 13, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Worldwide
Description
In 2023, users in Vietnam were most frequently targeted by phishing attacks. The phishing attack rate among internet users in the country was 18.91 percent. In the examined year, Peru ranked second, with an attack rate of nearly 17 percent, while Taiwan followed with 15.59 percent.
Global Phishing Protection Market Report 2025 Edition, Market Size, Share,...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated May 19, 2024
Cite
Cognitive Market Research (2024). Global Phishing Protection Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/phishing-protection-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
May 19, 2024
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the Global market for phishing protection is expected to have a market size of XX million in 2024 with a growing CAGR of XX% during the forecast period.

The Asia-Pacific region has the largest market share with an expected market size of XX million in 2024 with a growing CAGR of XX% during the forecast period. North America is the fastest growing with an expected market size of XX million in 2024 with a growing CAGR of XX% during the forecast period. The solution segment has the largest market share with an expected market size of XX million in 2024 with a growing CAGR of XX% during the forecast period. The cloud segment has the largest market share with an expected market size of XX million in 2024 with a growing CAGR of XX% during the forecast period. Email-Phishing has the largest market share with an expected market size of XX million in 2024 with a growing CAGR of XX% during the forecast period. BFSI has the largest market share with an expected market size of XX million in 2024 with a growing CAGR of XX% during the forecast period.

Market Dynamics

Key Drivers

The rise in the number of phishing attacks globally is favoring the market growth

Strong phishing protection solutions are in more demand as a result of the growing anxiety that phishing assaults are causing among both individuals and enterprises. Phishing is a type of cybercrime that has become more common and sophisticated. In it, attackers pose as reputable organizations in an attempt to trick consumers into disclosing personal information. Because of this and the need for more stringent security and regulatory compliance, the phishing prevention industry is growing quickly. Phishing instances have increased significantly in recent years, according to the FBI's Internet Crime Complaint Centre (IC3). With over 300,000 complaints, phishing was the most frequently reported cybercrime, according to the 2022 Internet Crime Report. Comparing this to other years, there has been a noticeable increase, highlighting the expanding threat scenario. Phishing assaults require sophisticated defenses because they have progressed from straightforward email scams to intricate multi-phase operations that take advantage of human behavior. Businesses are realising that the newest phishing techniques cannot be defeated by using antiquated security solutions. Due to this, phishing prevention systems are becoming more all-inclusive and include features like email filtering, multi-factor authentication, user education, and threat intelligence. The need for these cutting-edge solutions has been further spurred by the growing threat of business email compromise (BEC), in which attackers pose as company officials to start fraudulent activities. Increasing the effectiveness of defenses against phishing is emphasized by government organizations and cybersecurity specialists. Organizations can reduce their exposure to phishing threats by frequently utilizing information and guidance offered by the Cybersecurity and Infrastructure Security Agency (CISA). 
They advise adopting a zero-trust security paradigm, setting up technical controls to identify and stop questionable emails, and teaching staff members to spot phishing attempts. The need for a multi-layered approach to phishing prevention is highlighted by these best practices, and this is what is fueling the market's expansion. Furthermore, the growing ubiquity of remote labor coupled with digital transformation programs has increased the attack surface available to cybercriminals. Workers who work remotely might not have the same security measures in place as those who work in an office setting, which leaves them more susceptible to phishing scams. Businesses have been forced by this tendency to spend money on cloud-based phishing prevention technologies that can safeguard a staff that is dispersed. Therefore, the escalating frequency and complexity of phishing attacks are propelling the growth of the phishing protection market.

The various regulations and compliances around the globe are favoring market growth

As cyber risks continue to grow, phishing protection has become a key concern for both individuals and enterprises. The laws and regulatory requirements that are propelling the phishing protection market's expansion are intended to strengthen cybersecurity defenses and shield private information from nefarious individuals. The Ge...
Phishing Attack Simulation Training Report
datainsightsmarket.com
doc, pdf, ppt
Updated Dec 25, 2024
Cite
Data Insights Market (2024). Phishing Attack Simulation Training Report [Dataset]. https://www.datainsightsmarket.com/reports/phishing-attack-simulation-training-1391923
Explore at:
Available download formats: pdf, doc, ppt
Dataset updated
Dec 25, 2024
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global phishing attack simulation training market is estimated to be worth USD XXX million in 2023 and is projected to grow at a CAGR of XX% from 2023 to 2033. The growth of the market is attributed to the increasing number of phishing attacks, rising awareness about the importance of cybersecurity, and growing adoption of cloud-based solutions. Key drivers of the market include the increasing adoption of phishing simulation training by enterprises to improve employee awareness and reduce the risk of successful phishing attacks. The growing sophistication of phishing campaigns and the need for organizations to comply with regulatory requirements are also contributing to the market growth. The market is segmented by application (SMEs, large enterprises), type (online simulation, offline simulation), and region (North America, South America, Europe, Middle East & Africa, Asia Pacific). North America is expected to dominate the market during the forecast period. The presence of a large number of enterprises and the high awareness about cybersecurity in the region are the key factors driving the growth of the market in North America. Europe is expected to be the second-largest market for phishing attack simulation training. The growing adoption of cloud-based solutions and the presence of a large number of SMEs in the region are contributing to the growth of the market in Europe. Asia Pacific is expected to be the fastest-growing phishing attack simulation training market during the forecast period. The increasing adoption of smartphones and the growing number of internet users in the region are the key factors driving the growth of the market in Asia Pacific.
Don't Take the Bait: Recognize and Avoid Phishing Attacks
data.urbandatacentre.ca
datasets.ai
+3 more
Updated Oct 1, 2024
Cite
(2024). Don't Take the Bait: Recognize and Avoid Phishing Attacks [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-2bbfd0ea-1757-488e-89bf-8ad90c521a52
Explore at:
Dataset updated
Oct 1, 2024
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Description
Phishing is an attack where a scammer calls you, texts or emails you, or uses social media to trick you into clicking a malicious link, downloading malware, or sharing sensitive information. Phishing attempts are often generic mass messages, but the message appears to be legitimate and from a trusted source (e.g. from a bank, courier company).
Dataset of "Informing, simulating experience, or both: A field experiment on...
datarepository.eur.nl
narcis.nl
pdf
Updated Nov 20, 2019
Cite
Aurelien Baillon (2019). Dataset of "Informing, simulating experience, or both: A field experiment on phishing risks" [Dataset]. http://doi.org/10.25397/eur.10500029.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.25397/eur.10500029.v1
Dataset updated
Nov 20, 2019
Dataset provided by
Erasmus University Rotterdam (EUR)
Authors
Aurelien Baillon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset used in "Informing, simulating experience, or both: A field experiment on phishing risks".
phishing data set
kaggle.com
zip
Updated Mar 27, 2024
Cite
Samvsam (2024). phishing data set [Dataset]. https://www.kaggle.com/datasets/samvsamv/phishing-data-set/discussion
Explore at:
Available download formats: zip (1063372 bytes)
Dataset updated
Mar 27, 2024
Authors
Samvsam
Description
Dataset

This dataset was created by Samvsam

Contents
Phishing Websites Dataset - Dataset - LDM
service.tib.eu
Updated Jan 2, 2025
+ more versions
Cite
(2025). Phishing Websites Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/phishing-websites-dataset
Explore at:
Dataset updated
Jan 2, 2025
Description
A comprehensive model for phishing website detection is proposed in this study.
Learning rate—Performance comparison with Phishtank dataset.
figshare.com
xls
Updated Jun 9, 2023
Cite
Ashit Kumar Dutta (2023). Learning rate—Performance comparison with Phishtank dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0258361.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0258361.t003
Dataset updated
Jun 9, 2023
Dataset provided by
PLOS ONE
Authors
Ashit Kumar Dutta
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Learning rate—Performance comparison with Phishtank dataset.