54 datasets found
  1. Windows Malware Dataset

    • kaggle.com
    zip
    Updated Nov 27, 2022
    Cite
    PRK Varma (2022). Windows Malware Dataset [Dataset]. https://www.kaggle.com/datasets/ravikiranvarmap/somlap-data-set
    Explore at:
    Available download formats: zip (3391613 bytes)
    Dataset updated
    Nov 27, 2022
    Authors
    PRK Varma
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Cyber threat intelligence (CTI) strategies involve gathering several data attributes, building profiles, using intelligent algorithms, and developing optimized threat detection and mitigation techniques. Windows-based exe file malware can be detected through its Portable Executable (PE) file header features. Researchers require good datasets to develop efficient anti-malware technology. A new dataset called SOMLAP (Swarm Optimization and Machine Learning Applied to PE Malware Detection), which adds value to the existing benchmark dataset, has been developed. The SOMLAP data contains 51,409 samples that include both benign and malware files, with a total of 108 pure PE file header attributes. The data contains 19,809 (38.54%) malware file features gathered from VirusShare and 31,600 (61.46%) benign executables and DLLs gathered from Windows 10 OS.
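
    To make "PE file header features" concrete, the following minimal sketch reads a handful of COFF file-header fields using only the Python standard library. It is an illustration, not the SOMLAP extraction pipeline: the choice of fields, the function name `parse_pe_header`, and the synthetic test image are assumptions, and the dataset's 108 attributes are not reproduced here.

```python
import struct


def parse_pe_header(data: bytes) -> dict:
    """Extract a few COFF file-header fields from the raw bytes of a PE file."""
    # The DOS header starts with "MZ"; the 4-byte little-endian field at
    # offset 0x3C (e_lfanew) holds the offset of the PE signature.
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _n_symbols,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "Machine": machine,
        "NumberOfSections": n_sections,
        "TimeDateStamp": timestamp,
        "SizeOfOptionalHeader": opt_header_size,
        "Characteristics": characteristics,
    }


def demo_image() -> bytes:
    """Build a synthetic header-only image (hypothetical values) for testing."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after DOS header
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0xF0, 0x0022)
    return bytes(dos) + b"PE\0\0" + coff


features = parse_pe_header(demo_image())
print(features)
```

    On a real sample, the same function could be fed `open(path, "rb").read()`; each returned value would then become one column of a feature table.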

    For more details, please refer to our research article: https://doi.org/10.3390/electronics12020342

    If you use this data in your work, please cite the paper:

    Kattamuri, Santosh Jhansi, Ravi Kiran Varma Penmatsa, Sujata Chakravarty, and Venkata Sai Pavan Madabathula. 2023. "Swarm Optimization and Machine Learning Applied to PE Malware Detection towards Cyber Threat Intelligence" Electronics 12, no. 2: 342. https://doi.org/10.3390/electronics12020342

  2. MalwareBazaar Malware Dataset (Sep - Oct 2025)

    • kaggle.com
    zip
    Updated Oct 9, 2025
    Cite
    José Reyes (2025). MalwareBazaar Malware Dataset (Sep - Oct 2025) [Dataset]. https://www.kaggle.com/datasets/arkreyes/malwarebazaar-malware-dataset-sep-oct-2025
    Explore at:
    Available download formats: zip (9415213 bytes)
    Dataset updated
    Oct 9, 2025
    Authors
    José Reyes
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    MalwareBazaar Malware Dataset.

    Introduction.

    This dataset is useful for practicing data analysis or data science skills. It contains information about indicators of compromise found in MalwareBazaar's database.

    Description.

    The dataset was retrieved from MalwareBazaar's full CSV database dump, then curated, formatted, and cleaned by the author.

    • Metadata removed (footer with unreadable information).
    • 'date' formatted to datetime (better reading format).
    • Data filtered from the last 90 days.
    • Unnecessary columns with "NaN" data removed.
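
    The cleaning steps listed above can be sketched with the Python standard library. This is a hedged illustration, not the author's script: the `#` footer marker, the column names other than 'date', the date format, and the sample rows are all assumptions.

```python
import csv
from datetime import datetime, timedelta

EMPTY = ("", "NaN", None)  # values treated as "no data" (assumption)


def clean_rows(rows, now, days=90, date_field="date",
               date_format="%Y-%m-%d %H:%M:%S"):
    """Parse dates, keep rows from the last `days` days, drop all-empty columns."""
    cutoff = now - timedelta(days=days)
    kept = []
    for row in rows:
        parsed = datetime.strptime(row[date_field], date_format)
        if parsed >= cutoff:
            kept.append({**row, date_field: parsed})
    if not kept:
        return []
    # Keep a column only if at least one remaining row has real data in it.
    columns = [c for c in kept[0] if any(r[c] not in EMPTY for r in kept)]
    return [{c: r[c] for c in columns} for r in kept]


# Hypothetical miniature dump; a trailing "#" line stands in for footer metadata.
SAMPLE = """date,sha256_hash,signature,comment
2025-10-01 12:00:00,aaa,AgentTesla,NaN
2025-03-01 08:30:00,bbb,Formbook,NaN
# footer metadata
"""

lines = [ln for ln in SAMPLE.splitlines() if not ln.startswith("#")]
rows = list(csv.DictReader(lines))
cleaned = clean_rows(rows, now=datetime(2025, 10, 9))
print(cleaned)
```

    Passing `now` explicitly keeps the 90-day window reproducible; a live script would more likely use the current time.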
  3. Android Malware Detection Dataset

    • kaggle.com
    zip
    Updated Feb 24, 2024
    Cite
    Danny Revaldo (2024). Android Malware Detection Dataset [Dataset]. https://www.kaggle.com/datasets/dannyrevaldo/android-malware-detection-dataset
    Explore at:
    Available download formats: zip (123470 bytes)
    Dataset updated
    Feb 24, 2024
    Authors
    Danny Revaldo
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The "Android Malware Detection Dataset" is a comprehensive collection of data designed to facilitate research in the detection and analysis of malware targeting the Android platform. This dataset encompasses a wide range of features extracted from Android applications, providing valuable insights into their behaviors and functionalities.

    Key features of the dataset include:

    • Permission Features: Various permissions requested by Android applications, such as access to location (coarse and fine), camera, microphone, contacts, SMS, calendar, storage, and more.
    • System Features: Features related to system functions and controls, including access to device hardware (e.g., sensors, Bluetooth, NFC), system settings (e.g., changing network state, WiFi settings), and system services (e.g., managing accounts, managing documents).
    • Security-related Features: Features related to security functionalities and behaviors, encompassing permission management, authentication, encryption (e.g., cryptographic operations), and security policy enforcement.
    • Communication Features: Features related to communication functionalities, including sending and receiving SMS messages, making phone calls, accessing network state, and managing network connections.
    • Data Access Features: Features related to accessing and manipulating data, such as reading and writing to various data sources (e.g., external storage, databases), accessing user information (e.g., contacts, call logs), and accessing app-specific data.
    • App Lifecycle Features: Features related to managing the application lifecycle, including app installation and uninstallation, app startup and shutdown, app updates, and app permissions.
    • Device Control Features: Features related to controlling device behavior and settings, such as changing system settings, modifying audio settings, controlling device display, and managing device power.
    • Miscellaneous Features: Other miscellaneous features including accessing system logs, system services and components (e.g., camera, location manager), handling system events (e.g., incoming calls, boot completed), and interacting with system UI components.

    This dataset provides researchers with a rich source of information to develop and evaluate effective malware detection and analysis techniques, ultimately contributing to the enhancement of mobile security on the Android platform.
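
    Permission-style features such as those listed above are commonly encoded as one binary column per permission. The sketch below shows that encoding under stated assumptions: the five-entry vocabulary is a hypothetical sample of real Android permission names, not the dataset's actual column list.

```python
# Hypothetical vocabulary; the dataset's real columns may differ.
PERMISSION_VOCABULARY = [
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.CAMERA",
    "android.permission.READ_CONTACTS",
    "android.permission.RECORD_AUDIO",
    "android.permission.SEND_SMS",
]


def permissions_to_vector(requested, vocabulary=PERMISSION_VOCABULARY):
    """Return 1 for each vocabulary permission the app requests, else 0."""
    requested = set(requested)
    return [1 if name in requested else 0 for name in vocabulary]


app_permissions = [
    "android.permission.CAMERA",
    "android.permission.SEND_SMS",
    "android.permission.INTERNET",  # not in the vocabulary, so it is ignored
]
print(permissions_to_vector(app_permissions))  # [0, 1, 0, 0, 1]
```

    Permissions outside the vocabulary are silently dropped here; a real pipeline might instead count them or extend the vocabulary.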

  4. Maldeb Dataset

    • dataverse.telkomuniversity.ac.id
    • ieee-dataport.org
    • +1 more
    png
    Updated Mar 28, 2024
    Cite
    Telkom University Dataverse (2024). Maldeb Dataset [Dataset]. http://doi.org/10.34820/FK2/HQYV4X
    Explore at:
    Available download formats: png (multiple files)
    Dataset updated
    Mar 28, 2024
    Dataset provided by
    Telkom University Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    Directorate General of Higher Education, Ministry of Education and Culture Republic of Indonesia
    Japanese Student Service Association (JASSO)
    Description

    Malware-benign image representation. The dataset was collected from several malware repositories, including TekDefense, TheZoo, The Malware-Repo, Malware Database, and MalwareBazaar. The benign samples were collected from Microsoft Windows 10 and 11 system apps and several open-source software repositories, including CNET, Sourceforge, FileForum, and PortableFreeware. The samples were validated by scanning them with the VirusTotal malware scanning service. The samples were preprocessed by converting each binary into a grayscale image following the rules from Nataraj (2011). Nataraj paper: https://vision.ece.ucsb.edu/research/signal-processing-malware-analysis. The Maldeb Dataset was collected by Debi Amalia Septiyani and Halimul Hakim Khairul. Related works: D. A. Septiyani, “Generating Grayscale and RGB Images dataset for windows PE malware using Gist Features extraction method,” Institut Teknologi Bandung, 2022, and Dani Agung Prastiyo, "Design and implementation of a machine learning-based malware classification system with an audio signal feature Analysis Approach," Institut Teknologi Bandung, 2023. The complete dataset can be accessed at https://ieee-dataport.org/documents/maldeb-dataset and https://github.com/julismail/Self-Supervised
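
    The binary-to-grayscale conversion mentioned above can be sketched as follows. This is a minimal illustration, not the Maldeb pipeline: the size-to-width table is the one commonly attributed to Nataraj et al. (2011) and is reproduced from memory, the zero padding of the last row is an assumption, and the function names are hypothetical.

```python
import math

# (upper file-size bound in KB, image width in pixels); assumed from Nataraj et al.
WIDTH_BY_SIZE_KB = [
    (10, 32), (30, 64), (60, 128), (100, 256),
    (200, 384), (500, 512), (1000, 768),
]
DEFAULT_WIDTH = 1024  # used for files of 1000 KB and above


def pick_width(n_bytes):
    """Choose an image width from the file size."""
    size_kb = n_bytes / 1024
    for limit_kb, width in WIDTH_BY_SIZE_KB:
        if size_kb < limit_kb:
            return width
    return DEFAULT_WIDTH


def bytes_to_grayscale(data, width=None):
    """Map each byte to one pixel (0-255) and wrap rows at `width`."""
    width = width or pick_width(len(data))
    height = math.ceil(len(data) / width)
    padded = data + bytes(width * height - len(data))  # zero-pad the last row
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


image = bytes_to_grayscale(bytes(range(256)) * 2)
print(len(image), "rows x", len(image[0]), "columns")
```

    The result is a plain list of pixel rows; writing it out as a PNG would need an imaging library, which is left out to keep the sketch dependency-free.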

  5. Data from: Malware Finances and Operations: a Data-Driven Study of the Value...

    • data.europa.eu
    • data.niaid.nih.gov
    unknown
    Updated Oct 18, 2023
    + more versions
    Cite
    Zenodo (2023). Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-8047205?locale=fi
    Explore at:
    Available download formats: unknown (8866943)
    Dataset updated
    Oct 18, 2023
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets demonstrate the malware economy and the value chain published in our paper, Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access, at the 12th International Workshop on Cyber Crime (IWCC 2023), part of the ARES Conference, published by the International Conference Proceedings Series of the ACM ICPS.

    Using the well-documented scripts, it is straightforward to reproduce our findings. It takes an estimated 1 hour of human time and 3 hours of computing time to duplicate our key findings from MalwareInfectionSet; around one hour with VictimAccessSet; and minutes to replicate the price calculations using AccountAccessSet. See the included README.md files and Python scripts.

    We choose to represent each victim by a single JavaScript Object Notation (JSON) data file. Data sources provide sets of victim JSON data files from which we've extracted the essential information and omitted Personally Identifiable Information (PII). We collected, curated, and modelled three datasets, which we publish under the Creative Commons Attribution 4.0 International License.

    1. MalwareInfectionSet

    We discover (and, to the best of our knowledge, document scientifically for the first time) that malware networks appear to dump their data collections online. We collected these infostealer malware logs available for free. We utilise 245 malware log dumps from 2019 and 2020 originating from 14 malware networks. The dataset contains 1.8 million victim files, with a dataset size of 15 GB.

    2. VictimAccessSet

    We demonstrate how Infostealer malware networks sell access to infected victims. Genesis Market focuses on user-friendliness and continuous supply of compromised data. Marketplace listings include everything necessary to gain access to the victim's online accounts, including passwords and usernames, but also detailed collection of information which provides a clone of the victim's browser session. Indeed, Genesis Market simplifies the import of compromised victim authentication data into a web browser session. We measure the prices on Genesis Market and how compromised device prices are determined. We crawled the website between April 2019 and May 2022, collecting the web pages offering the resources for sale. The dataset contains 0.5 million victim files, with a dataset size of 3.5 GB.

    3. AccountAccessSet

    The Database marketplace operates inside the anonymous Tor network. Vendors offer their goods for sale, and customers can purchase them with Bitcoins. The marketplace sells online accounts, such as PayPal and Spotify, as well as private datasets, such as driver's licence photographs and tax forms. We then collect data from Database Market, where vendors sell online credentials, and investigate similarly. To build our dataset, we crawled the website between November 2021 and June 2022, collecting the web pages offering the credentials for sale. The dataset contains 33,896 victim files, with a dataset size of 400 MB.

    Credits

    Authors

    • Billy Bob Brumley (Tampere University, Tampere, Finland)
    • Juha Nurmi (Tampere University, Tampere, Finland)
    • Mikko Niemelä (Cyber Intelligence House, Singapore)

    Funding

    This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under project numbers 804476 (SCARE) and 952622 (SPIRS).

    Alternative links to download: AccountAccessSet, MalwareInfectionSet, and VictimAccessSet.

  6. Complete Antivirus Database

    • comodo.com
    cav
    Updated Apr 15, 2010
    Cite
    Comodo (2010). Complete Antivirus Database [Dataset]. https://www.comodo.com/home/internet-security/updates/vdp/database.php
    Explore at:
    Available download formats: cav
    Dataset updated
    Apr 15, 2010
    Dataset provided by
    Comodo Group (http://www.comodo.com/)
    Authors
    Comodo
    License

    https://www.comodo.com/home/internet-security/updates/vdp/database.php

    Description

    The complete Comodo Internet Security database is available for download...

  7. Ransomware Recovery Services Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 22, 2025
    + more versions
    Cite
    Growth Market Reports (2025). Ransomware Recovery Services Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/ransomware-recovery-services-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Ransomware Recovery Services Market Outlook



    According to our latest research, the global ransomware recovery services market size reached USD 22.6 billion in 2024, driven by the exponential rise in sophisticated ransomware attacks targeting enterprises across diverse sectors. The market is growing at a robust CAGR of 17.3% and is forecasted to achieve a value of USD 62.5 billion by 2033. This remarkable growth is primarily attributed to the escalating frequency and severity of ransomware incidents, increasing awareness about cybersecurity resilience, and the growing regulatory emphasis on data protection and business continuity.
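
    The headline figures in the paragraph above are linked by the compound-growth formula, value_end = value_start × (1 + CAGR)^years. The sketch below cross-checks them; the nine-year horizon (2024 to 2033) is an assumption, since the report does not state the window over which the CAGR applies.

```python
def project(value, rate, years):
    """Compound `value` at `rate` per year for `years` years."""
    return value * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that turns `start` into `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


BASE_2024 = 22.6      # USD billion, from the text
TARGET_2033 = 62.5    # USD billion, from the text
STATED_CAGR = 0.173   # 17.3%, from the text
YEARS = 2033 - 2024   # assumed horizon

projected = project(BASE_2024, STATED_CAGR, YEARS)
implied = implied_cagr(BASE_2024, TARGET_2033, YEARS)
print(f"projected 2033 value at stated CAGR: {projected:.1f} USD billion")
print(f"CAGR implied by the two endpoints:   {implied:.1%}")
```

    Under this assumption the stated rate projects to roughly USD 95 billion, while the two endpoints imply a rate of about 12%, so the quoted figures probably rest on a different base year or forecast window than the one assumed here.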




    One of the primary growth factors fueling the ransomware recovery services market is the surge in cybercrime sophistication and frequency. As threat actors increasingly employ advanced tactics such as double extortion, fileless malware, and targeted attacks on critical infrastructure, organizations are compelled to invest in specialized ransomware recovery services. These services not only facilitate rapid data restoration and system recovery but also minimize downtime and financial losses, ensuring business continuity. The proliferation of remote work and the expansion of digital operations have further widened the attack surface, making organizations more vulnerable and heightening the demand for robust recovery solutions.




    Another significant driver is the evolving regulatory landscape, which mandates stringent data protection and incident response protocols. Governments and regulatory bodies worldwide are enacting comprehensive cybersecurity frameworks that require organizations to implement effective recovery strategies as part of their overall risk management approach. Non-compliance with these regulations can result in hefty penalties and reputational damage, prompting organizations to seek expert ransomware recovery services. The increasing adoption of cyber insurance policies, which often stipulate the engagement of professional recovery providers, also contributes to the market’s expansion.




    Additionally, the growing adoption of digital transformation initiatives across industries is accelerating the need for ransomware recovery services. As organizations migrate critical workloads to cloud environments and integrate IoT devices, they face new vulnerabilities that can be exploited by ransomware operators. The complexity of hybrid and multi-cloud infrastructures necessitates comprehensive recovery strategies that encompass data recovery, forensic analysis, and incident response. Moreover, the rising awareness among small and medium enterprises (SMEs) about the existential threat posed by ransomware is leading to increased investments in specialized recovery solutions, further propelling market growth.




    Regionally, North America dominates the ransomware recovery services market, accounting for the largest revenue share in 2024, followed closely by Europe and Asia Pacific. The high incidence of ransomware attacks on critical sectors such as healthcare, BFSI, and government agencies in these regions has spurred the adoption of advanced recovery solutions. Meanwhile, Asia Pacific is witnessing the fastest growth, driven by rapid digitalization, increasing cyber threats, and expanding regulatory mandates. Latin America and the Middle East & Africa are also experiencing steady growth as organizations in these regions enhance their cybersecurity resilience in response to rising attack volumes.





    Service Type Analysis



    The ransomware recovery services market is segmented by service type into data recovery, system restoration, incident response, forensic analysis, and others. Data recovery remains the cornerstone of ransomware recovery services, as organizations prioritize the restoration of mission-critical files and databases compromised during an attack. Advanced data recovery services employ a combination of backup management, decryption tools, and data integrity checks to ensure seamless restoration without reinfection. The growin

  8. Quttera Website Malware Threat Encyclopedia

    • threats.quttera.com
    json
    Updated Nov 21, 2025
    Cite
    Quttera (2025). Quttera Website Malware Threat Encyclopedia [Dataset]. https://threats.quttera.com/
    Explore at:
    Available download formats: json
    Dataset updated
    Nov 21, 2025
    Dataset authored and provided by
    Quttera
    Time period covered
    2024 - Present
    Description

    Comprehensive database of website malware threats, vulnerabilities, and security risks detected by Quttera's malware scanner.

  9. Android Malware and Normal permissions dataset

    • data.mendeley.com
    • impactcybertrust.org
    Updated Mar 13, 2018
    + more versions
    Cite
    Arvind Mahindru (2018). Android Malware and Normal permissions dataset [Dataset]. http://doi.org/10.17632/958wvr38gy.1
    Explore at:
    Dataset updated
    Mar 13, 2018
    Authors
    Arvind Mahindru
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 18,850 normal Android application packages and 10,000 malware Android packages, which are used to identify the behaviour of malware applications based on the permissions they need at run-time.

  10. A Dataset of Information (DNS, IP, WHOIS/RDAP, TLS, GeoIP) for a Large...

    • zenodo.org
    json
    Updated Dec 10, 2024
    + more versions
    Cite
    Radek Hranický; Adam Horák; Ondřej Ondryáš (2024). A Dataset of Information (DNS, IP, WHOIS/RDAP, TLS, GeoIP) for a Large Corpus of Benign, Phishing, and Malware Domain Names 2024 [Dataset]. http://doi.org/10.5281/zenodo.13330074
    Explore at:
    Available download formats: json
    Dataset updated
    Dec 10, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Radek Hranický; Adam Horák; Ondřej Ondryáš
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 16, 2024
    Description

    The dataset contains DNS records, IP-related features, WHOIS/RDAP information, information from TLS handshakes and certificates, and GeoIP information for 368,956 benign domains from Cisco Umbrella, 461,338 benign domains from the actual CESNET network traffic, 164,425 phishing domains from PhishTank and OpenPhish services, and 100,809 malware domains from sources such as ThreatFox, The Firebog, and the MISP threat intelligence platform. The ground truth for the phishing dataset was double-checked with the VirusTotal (VT) service. Domain names not considered malicious by VT have been removed from the phishing and malware datasets. Similarly, benign domain names that were considered risky by VT have been removed from the benign datasets. The data was collected between March 2023 and July 2024. The final assessment of the data was conducted in August 2024.

    The dataset is useful for cybersecurity research, e.g. statistical analysis of domain data or feature extraction for training machine learning-based classifiers, e.g. for phishing and malware website detection.

    Data Files

    • The data is located in the following individual files:

      • benign_umbrella.json - data for 368,956 benign domains from Cisco Umbrella,
      • benign_cesnet.json - data for 461,338 benign domains from the CESNET network,
      • phishing.json - data for 164,425 phishing domains, and
      • malware.json - data for 100,809 malware domains.

    Data Structure

    Each file contains a JSON array of records generated using mongoexport. The following table documents the structure of a record. Please note that:

    • some fields may be missing (they should be interpreted as nulls),
    • extra fields may be present (they should be ignored).

    Field name        | Field type       | Nullable | Description
    domain_name       | String           | No       | The evaluated domain name
    url               | String           | No       | The source URL for the domain name
    evaluated_on      | Date             | No       | Date of last collection attempt
    source            | String           | No       | An identifier of the source
    sourced_on        | Date             | No       | Date of ingestion of the domain name
    dns               | Object           | Yes      | Data from DNS scan
    rdap              | Object           | Yes      | Data from RDAP or WHOIS
    tls               | Object           | Yes      | Data from TLS handshake
    ip_data           | Array of Objects | Yes      | Array of data objects capturing the IP addresses related to the domain name

    DNS data (dns field)

    Field name        | Field type       | Nullable | Description
    A                 | Array of Strings | No       | Array of IPv4 addresses
    AAAA              | Array of Strings | No       | Array of IPv6 addresses
    TXT               | Array of Strings | No       | Array of raw TXT values
    CNAME             | Object           | No       | The CNAME target and related IPs
    MX                | Array of Objects | No       | Array of objects with the MX target hostname, priority and related IPs
    NS                | Array of Objects | No       | Array of objects with the NS target hostname and related IPs
    SOA               | Object           | No       | All the SOA fields, present if found at the target domain name
    zone_SOA          | Object           | No       | The SOA fields of the target’s zone (closest point of delegation), present if found and not a record in the target domain directly
    dnssec            | Object           | No       | Flags describing the DNSSEC validation result for each record type
    ttls              | Object           | No       | The TTL values for each record type
    remarks           | Object           | No       | The zone domain name and DNSSEC flags

    RDAP data (rdap field)

    Field name        | Field type       | Nullable | Description
    copyright_notice  | String           | No       | RDAP/WHOIS data usage copyright notice
    dnssec            | Bool             | No       | DNSSEC presence flag
    entitites         | Object           | No       | An object with various arrays representing the found related entity types (e.g. abuse, admin, registrant). The arrays contain objects describing the individual entities.
    expiration_date   | Date             | Yes      | The current date of expiration
    handle            | String           | No       | RDAP handle
    last_changed_date | Date             | Yes      | The date when the domain was last changed
    name              | String           | No       | The target domain name for which the data in this object are stored
    nameservers       | Array of Strings | No       | Nameserver hostnames provided by RDAP or WHOIS
    registration_date | Date             | Yes      | First registration date
    status            | Array of Strings
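
    The two reading rules stated for the records (missing fields count as nulls, extra fields are ignored) can be sketched in a few lines. This is an illustrative loader, not code shipped with the dataset: the function names and the sample record are assumptions, and only the top-level field names come from the table.

```python
import json

# Top-level field names taken from the record structure table.
TOP_LEVEL_FIELDS = (
    "domain_name", "url", "evaluated_on", "source", "sourced_on",
    "dns", "rdap", "tls", "ip_data",
)


def normalize_record(raw):
    """Keep only documented fields; a missing field becomes None."""
    return {name: raw.get(name) for name in TOP_LEVEL_FIELDS}


def load_records(text):
    """Parse a JSON array of records and normalize each one."""
    return [normalize_record(record) for record in json.loads(text)]


# Hypothetical one-record sample with a missing "tls" and an undocumented field.
SAMPLE = (
    '[{"domain_name": "example.org", "source": "demo", '
    '"dns": {"A": ["192.0.2.1"]}, "unexpected": 1}]'
)

records = load_records(SAMPLE)
print(records[0])
```

    For the full files, which hold hundreds of thousands of records each, a streaming JSON parser may be preferable to loading a whole array into memory at once.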

  11. Ransomware Mitigation: An Analytical Investigation into the Effects and...

    • psyarxiv.com
    • osf.io
    Updated Dec 13, 2022
    Cite
    Ghifari Permana; Thomas Trowbridge; Bradley Sherborne (2022). Ransomware Mitigation: An Analytical Investigation into the Effects and Trends of Ransomware Attacks on Global Business [Dataset]. http://doi.org/10.31234/osf.io/ayc2d
    Explore at:
    Dataset updated
    Dec 13, 2022
    Dataset provided by
    Center for Open Science (https://cos.io/)
    Authors
    Ghifari Permana; Thomas Trowbridge; Bradley Sherborne
    Description

    Abstract—Global business offerings and services are steadily expanding each year, with more registered businesses and an ever-increasing global population. An increase in population and consumers creates evermore data for companies to store, increasing the incentive for cybercriminals to target these databases. One preferred malware deployed by cyber criminals is Ransomware, locking the data away from the user until payment terms are met. Companies are not integrating sufficient protection protocols and fail-safe measures to successfully protect themselves against Ransomware. The resulting attack negatively affects core business components such as reputation, revenue, and customer safety. This investigation of mitigation methods is crucial for the future of cyber protection and without these recommendations attacks like ransomware will continue to thrive. Globally, there are concerning rates of successful ransomware attacks, each having a varying negative impact on the victims. Furthermore, we have proposed four methods of Network Controls, User Training, Two-Factor Authentication, and Machine Learning. After a thorough investigation of these mitigation methods, it has been found that the most cost-effective for any business size is User Training. Implementing our recommendation will significantly reduce the likelihood of becoming a victim of ransomware.

  12. Live Ransomware Victims & Groups Database

    • thehgtech.com
    json
    Updated Jan 4, 2026
    Cite
    TheHGTech (2026). Live Ransomware Victims & Groups Database [Dataset]. https://thehgtech.com/ransomware-tracker.html
    Explore at:
    Available download formats: json
    Dataset updated
    Jan 4, 2026
    Dataset authored and provided by
    TheHGTech
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024 - Present
    Area covered
    Global
    Variables measured
    Country, Industry, Attack Date, Victim Name, Ransomware Group
    Description

    Real-time database of ransomware victims and active ransomware groups. Includes victim names, industries, countries, attack dates, and group activity trends. Updated twice daily from ransomware leak sites.

  13. Dataset and Source Code for the Paper: A Framework for Developing Strategic...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1 more
    Updated Jul 14, 2024
    Cite
    Gulbay, BURAK (2024). Dataset and Source Code for the Paper: A Framework for Developing Strategic Cyber Threat Intelligence from Advanced Persistent Threat Analysis Reports Using Graph-Based Algorithms [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12741054
    Explore at:
    Dataset updated
    Jul 14, 2024
    Dataset provided by
    Gazi University
    Authors
    Gulbay, BURAK
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Here are the data set and source code related to the paper: "A Framework for Developing Strategic Cyber Threat Intelligence from Advanced Persistent Threat Analysis Reports Using Graph-Based Algorithms"

    1- aptnotes-downloader.zip : contains source code that downloads all APT reports listed in https://github.com/aptnotes/data and https://github.com/CyberMonitor/APT_CyberCriminal_Campagin_Collections

    2- apt-groups.zip : contains all APT group names gathered from https://docs.google.com/spreadsheets/d/1H9_xaxQHpWaa4O_Son4Gx0YOIzlcBWMsdvePFX68EKU/edit?gid=1864660085#gid=1864660085 and https://malpedia.caad.fkie.fraunhofer.de/actors

    3- apt-reports.zip : contains all deduplicated APT reports gathered from https://github.com/aptnotes/data and https://github.com/CyberMonitor/APT_CyberCriminal_Campagin_Collections

    4- countries.zip : contains country name list.

    5- ttps.zip : contains all MITRE techniques gathered from https://attack.mitre.org/resources/attack-data-and-tools/

    6- malware-families.zip : contains all malware family names gathered from https://malpedia.caad.fkie.fraunhofer.de/families

    7- ioc-searcher-app.zip : contains source code that extracts IoCs from APT reports. Extracted IoC files are provided in report-analyser.zip. Original code repo can be found at https://github.com/malicialab/iocsearcher

    8- extracted-iocs.zip : contains extracted IoCs by ioc-searcher-app.zip

    9- report-analyser.zip : contains source code that searches APT reports, malware families, countries and TTPs. In case of a match, it updates files in extracted-iocs.zip.

    10- cti-transformation-app.zip : contains source code that transforms files in extracted-iocs.zip to CTI triples and saves into Neo4j graph database.

    11- graph-db-backup.zip : contains the volume folder of a Neo4j Docker container. When it is mounted into a Docker container, the entire CTI database becomes reachable from the Neo4j web interface. Here is how to run a Neo4j Docker container that mounts the folder from the zip:

    docker run -d \
      --publish=7474:7474 --publish=7687:7687 \
      --volume={PATH_TO_VOLUME}/DEVIL_NEO4J_VOLUME/neo4j/data:/data \
      --volume={PATH_TO_VOLUME}/DEVIL_NEO4J_VOLUME/neo4j/plugins:/plugins \
      --volume={PATH_TO_VOLUME}/DEVIL_NEO4J_VOLUME/neo4j/logs:/logs \
      --volume={PATH_TO_VOLUME}/DEVIL_NEO4J_VOLUME/neo4j/conf:/conf \
      --env 'NEO4J_PLUGINS=["apoc","graph-data-science"]' \
      --env NEO4J_apoc_export_file_enabled=true \
      --env NEO4J_apoc_import_file_enabled=true \
      --env NEO4J_apoc_import_file_use_neo4j_config=true \
      --env=NEO4J_AUTH=none \
      neo4j:5.13.0

    web interface: http://localhost:7474

    username: neo4j

    password: neo4j
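
The docker run command above repeats the volume path four times, so it can help to assemble it in one place. The following is a minimal Python sketch, not part of the published dataset: the helper name `neo4j_run_args` and the example path `/srv/cti` are hypothetical, while the flags, environment variables, and image tag are copied from the command above.

```python
import shlex
from pathlib import PurePosixPath


def neo4j_run_args(path_to_volume: str) -> list[str]:
    """Build the docker argument list shown above (helper name is hypothetical)."""
    base = PurePosixPath(path_to_volume) / "DEVIL_NEO4J_VOLUME" / "neo4j"
    args = ["docker", "run", "-d", "--publish=7474:7474", "--publish=7687:7687"]
    # One bind mount per Neo4j directory, as in the original command.
    for name in ("data", "plugins", "logs", "conf"):
        args.append(f"--volume={base / name}:/{name}")
    # Environment variables copied verbatim from the original command.
    env = [
        'NEO4J_PLUGINS=["apoc","graph-data-science"]',
        "NEO4J_apoc_export_file_enabled=true",
        "NEO4J_apoc_import_file_enabled=true",
        "NEO4J_apoc_import_file_use_neo4j_config=true",
        "NEO4J_AUTH=none",
    ]
    for item in env:
        args += ["--env", item]
    args.append("neo4j:5.13.0")
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line; "/srv/cti" is a placeholder path.
    print(shlex.join(neo4j_run_args("/srv/cti")))
```

Passing the resulting list to `subprocess.run` instead of printing it avoids shell-quoting problems with the JSON value of `NEO4J_PLUGINS`.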

  14. Portable Executable Malware Data

    • kaggle.com
    zip
    Updated Mar 10, 2025
    Cite
    malwareTBugs (2025). Portable Executable Malware Data [Dataset]. https://www.kaggle.com/datasets/malwaretbugs/maldata
    Explore at:
    Available download formats: zip (23094201 bytes)
    Dataset updated
    Mar 10, 2025
    Authors
    malwareTBugs
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by malwareTBugs

    Released under Database: Open Database, Contents: Database Contents


  15. Data from: Health IT, hacking, and cybersecurity: national trends in data...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated May 25, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jay G. Ronquillo; J. Erik Winterholler; Kamil Cwikla; Raphael Szymanski; Christopher Levy (2019). Health IT, hacking, and cybersecurity: national trends in data breaches of protected health information [Dataset]. http://doi.org/10.5061/dryad.24275c6
    Explore at:
    Available download formats: zip
    Dataset updated
    May 25, 2019
    Dataset provided by
    Dryad
    Authors
    Jay G. Ronquillo; J. Erik Winterholler; Kamil Cwikla; Raphael Szymanski; Christopher Levy
    Time period covered
    May 24, 2018
    Description

    Objective: The rapid adoption of health information technology (IT) coupled with growing reports of ransomware, and hacking has made cybersecurity a priority in health care. This study leverages federal data in order to better understand current cybersecurity threats in the context of health IT.

    Materials and Methods: Retrospective observational study of all available reported data breaches in the United States from 2013 to 2017, downloaded from a publicly available federal regulatory database.

    Results: There were 1512 data breaches affecting 154 415 257 patient records from a heterogeneous distribution of covered entities (P < .001). There were 128 electronic medical record-related breaches of 4 867 920 patient records, while 363 hacking incidents affected 130 702 378 records.

    Discussion and Conclusion: Despite making up less than 25% of all breaches, hacking was responsible for nearly 85% of all affected patient records. As medicine becomes increasingly interconnected and inform...
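
The percentages in the conclusion follow directly from the counts in the Results paragraph. As a quick check, here is a small Python sketch that recomputes them; the variable and function names are illustrative, and the figures are the ones quoted above.

```python
# Figures quoted in the Results paragraph above.
total_breaches = 1512
total_records = 154_415_257
hacking_breaches = 363
hacking_records = 130_702_378


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


breach_share = share(hacking_breaches, total_breaches)
record_share = share(hacking_records, total_records)

print(f"Hacking share of breaches: {breach_share:.1f}%")
print(f"Hacking share of affected records: {record_share:.1f}%")
```

Both values land where the abstract says they should: under 25% of breaches and a little under 85% of affected records.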

  16. Number of ransomware attempts per year 2017-2023

    • statista.com
    Updated Feb 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Number of ransomware attempts per year 2017-2023 [Dataset]. https://www.statista.com/statistics/494947/ransomware-attempts-per-year-worldwide/
    Explore at:
    Dataset updated
    Feb 15, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2023, organizations around the world detected 317.59 million ransomware attempts. The number rose significantly between the third and fourth quarters of 2022, from around 102 million to nearly 155 million cases. Ransomware attacks usually target organizations that collect large amounts of data and are critically important. When attacked, these organizations often prefer to pay the ransom and restore the stolen data rather than report the attack immediately. Data loss incidents also damage a company's reputation, which is one reason ransomware attacks go unreported.

    Most targeted industries and regions

    As part of critical infrastructure, the manufacturing industry is a frequent target of ransomware attacks. In 2022, manufacturing organizations worldwide saw 437 such attacks. The food and beverage industry ranked second, with over 50 ransomware attacks. By share of ransomware attacks on critical infrastructure, North America ranked first among world regions, followed by Europe. Healthcare and public health sector organizations filed the highest number of ransomware complaints to U.S. law enforcement in 2022.

    Ransomware as a service (RaaS)

    The Ransomware as a Service (RaaS) business model has existed for over a decade. The model involves hackers and affiliates. Hackers develop ransomware attack models and sell them to affiliates, who then use them independently to attack targets. Under this business model, the hacker who created the RaaS receives a service fee per collected ransom. In the first quarter of 2022, there were 31 RaaS extortion groups worldwide, compared to 19 such groups in the same quarter of 2021.

  17. Database Security Service Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Jan 5, 2026
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Archive Market Research (2026). Database Security Service Report [Dataset]. https://www.archivemarketresearch.com/reports/database-security-service-560130
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Jan 5, 2026
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2026 - 2034
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Database Security Service market is booming, projected to reach $15 billion in 2025 and grow at a CAGR of 15%. This comprehensive analysis explores market drivers, trends, and key players, providing insights into the evolving landscape of database protection against cyber threats. Learn about the impact of AI, cloud computing, and data privacy regulations on this rapidly expanding sector.

  18. Data Encryption Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 19, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pro Market Reports (2025). Data Encryption Market Report [Dataset]. https://www.promarketreports.com/reports/data-encryption-market-9193
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    Feb 19, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Encryption Market was valued at USD 14.5 billion in 2024 and is projected to reach USD 40.98 billion by 2033, with an expected CAGR of 16% during the forecast period.

    Recent developments include:

    • Apr. 11, 2023: Menlo Security, a leading provider of browser security solutions, published the results of the 10th Annual Cyberthreat Defense Report (CDR) by the CyberEdge Group. The report, partially sponsored by Menlo Security, highlights the growing importance of browser isolation technologies in combating ransomware and other malicious threats. The research revealed that most ransomware attacks include threats beyond data encryption. According to the report, around 51% of respondents confirmed that they use at least one type of browser or Internet isolation to protect their organizational data, while another 40% are about to deploy data encryption technology. Around 33% of respondents noted that browser isolation is a key cybersecurity strategy for protecting against sophisticated attacks, including ransomware, phishing, and zero-day attacks.
    • Feb. 14, 2023: EnterpriseDB, a relational database provider, announced the addition of Transparent Data Encryption (TDE) based on open-source PostgreSQL to its databases. The new TDE feature will ship with the enterprise version of the firm's database. TDE is a method of encrypting database files to ensure data security while at rest and in motion. Most enterprises use TDE for compliance reasons, as it helps ensure that data on the hard drive and files on a backup are encrypted. Before the development of built-in TDE, enterprises relied on either full-disk encryption or stackable cryptographic file system encryption.
    • Jan. 25, 2023: Researchers from the Tokyo University of Science, Japan, announced the development of a faster and cheaper method for handling encrypted data while improving security. The new method combines the best of homomorphic encryption and secret sharing, which are key methods for computing on sensitive data while preserving privacy. Homomorphic encryption is computationally intensive and involves performing computational data encryption on a single server, while secret sharing is fast and computationally efficient. In this method, the encrypted data/secret input is divided and distributed across multiple servers, each performing a computation, such as multiplication, on its data. The results of the computations are then used to reconstruct the original data.
    • September 2022: Convergence Technology Solutions Corp., a supplier of software-enabled IT and cloud solutions, declared that it has obtained certification in Canada to sell and deploy IBM zSystems and LinuxONE.
    • November 2019: Penta Security Systems announced that it has been selected as a finalist for the 2020 SC Magazine Awards, which are given by SC Media and celebrated in the United States. As a result, MyDiamo from Penta Security has been named the Best Database Security Solution of 2020. Additionally, this will result in the expansion of common-level encryption and improve the open-source DBMS installation procedure.

    Potential restraints include:

    • Issues regarding security and data breaches
    • High implementation costs and complexity
    • Issues with data consistency and interoperability across different edge platforms
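
The market blurb above quotes a start value, an end value, and a compound annual growth rate (CAGR). These three are tied together by the standard CAGR formula, sketched below in Python. The function names are illustrative, and the number of years to plug in is an assumption the reader has to make, since the blurb does not say which years bound the forecast period.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: float) -> float:
    """Value reached after compounding start_value at the given annual rate."""
    return start_value * (1.0 + rate) ** years


if __name__ == "__main__":
    # Round-trip on a simple case: 100 growing to 121 over 2 years is 10% a year.
    rate = cagr(100.0, 121.0, 2)
    print(f"CAGR: {rate:.2%}")
    print(f"Projected value: {project(100.0, rate, 2):.2f}")
```

Calling `cagr(14.5, 40.98, years)` with the quoted figures gives the implied rate for whichever period length is assumed.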

  19. WinMET Dataset

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Mar 27, 2025
    Cite
    Raducu, Razvan; Villagrasa-Labrador, Alain; Rodríguez, Ricardo J.; Álvarez, Pedro (2025). WinMET Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12647555
    Explore at:
    Dataset updated
    Mar 27, 2025
    Dataset provided by
    Universidad de Zaragoza
    Authors
    Raducu, Razvan; Villagrasa-Labrador, Alain; Rodríguez, Ricardo J.; Álvarez, Pedro
    License

    https://www.gnu.org/licenses/gpl-3.0-standalone.html

    Description

    WinMET (Windows Malware Execution Traces) Dataset

    The WinMET dataset contains the reports generated with the CAPE sandbox after analyzing several malware samples. The reports are valid JSON files that contain the spawned processes, the sequence of WinAPI and system calls invoked by each process, their parameters, their return values, and the OS resources accessed, among many others.

    Please use this DOI reference that always points to the latest WinMET version: https://doi.org/10.5281/zenodo.12647555

    This dataset was generated using the MALVADA framework, which you can read more about in our publication https://doi.org/10.1016/j.softx.2025.102082. The article also provides insights about the contents of this dataset.

    Razvan Raducu, Alain Villagrasa-Labrador, Ricardo J. Rodríguez, Pedro Álvarez, MALVADA: A framework for generating datasets of malware execution traces, SoftwareX, Volume 30, 2025, 102082, ISSN 2352-7110, https://doi.org/10.1016/j.softx.2025.102082 (https://www.sciencedirect.com/science/article/pii/S2352711025000494)

    How to use the dataset

    The 7z file is password protected. The password is: infected.

    Compressed size on disk: ~2.5GiB.

    Decompressed size on disk: ~105GiB.

    Total decompressed .json files: 9889.

    The name of each .json file is irrelevant. It corresponds to its analysis ID.

    cape_report_to_label_mapping.json and avclass_report_to_label_mapping.json contain the mappings of each report to its corresponding consensus label, sorted in descending order (by the number of reports belonging to each label/family).

    Integrity checks for WinMET.7z:

    MD5: 75b3354fb186ae5a47c320e253bd96ee

    SHA256: 00faac011f4938a29ba9afbd9f0b50d89ede342d1d0d6877cb90b46eabd92c72

    SHA512: 038ca9303623cadaa72eab680221e81e1d335449d08f6395b39eb99baad4092e02c00955089fba31ce1a9dd04260ae80b622491f754774331bced18e8e3be1c4
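
The checksums above can be verified before extracting the archive. Here is a minimal Python sketch using the standard `hashlib` module; the function names and the local file name `WinMET.7z` are assumptions, and the expected SHA256 value is the one listed above.

```python
import hashlib

# SHA256 value published in the integrity checks above.
EXPECTED_SHA256 = "00faac011f4938a29ba9afbd9f0b50d89ede342d1d0d6877cb90b46eabd92c72"


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-GiB archive is never loaded into memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256, algorithm: str = "sha256") -> bool:
    """Return True when the file's digest matches the expected hex string."""
    return file_digest(path, algorithm) == expected.lower()
```

For example, `verify("WinMET.7z")` should return True for an intact download. Passing `algorithm="md5"` or `algorithm="sha512"` with the matching value checks the other two digests.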

    Citation

    If you use this dataset, cite it as follows:

    TBA.

    Statistics

    The following statistics (and many more) can be obtained by analyzing the WinMET dataset with the MALVADA framework.

    Total reports: 9889.

    Average VT (VirusTotal) detections: ~53.

    There are 268 benign or undetected reports, that is, reports with 10 or fewer VT detections (default threshold).

    There are 2584 reports with no CAPE consensus label.

    There are 695 reports with no AVClass consensus label.

    Top 20 CAPE consensus labels (there are many more):

    "(n/a)": 2584

    "Redline": 1227

    "Agenttesla": 1010

    "Crifi": 622

    "Amadey": 606

    "Smokeloader": 538

    "Virlock": 471

    "Msilheracles": 408

    "Tedy": 364

    "Disabler": 343

    "Xorstringsnet": 321

    "Snake": 252

    "Autorun": 252

    "Metastealer": 246

    "Formbook": 244

    "Lokibot": 202

    "Strab": 188

    "Loki": 185

    "Mint": 179

    "Taskun": 178

    Top 20 AVClass consensus labels (there are many more):

    "Reline": 2187

    "Disabler": 732

    "(n/a)": 695

    "Amadey": 575

    "Agenttesla": 478

    "Taskun": 382

    "Virlock": 293

    "Equationdrug": 270

    "Stop": 268

    "Strab": 260

    "Noon": 259

    "Gamarue": 181

    "Dofoil": 135

    "Makoob": 113

    "Mokes": 110

    "Snakelogger": 110

    "Bladabindi": 98

    "Zard": 84

    "Gcleaner": 83

    "Deyma": 80

    Changelog

    Version 2.0: Added cape and avclass label mappings.

  20. Data Collection & Requirements

    • zenodo.org
    bin
    Updated Mar 20, 2025
    Cite
    Zenodo (2025). Data Collection & Requirements [Dataset]. http://doi.org/10.5281/zenodo.14976797
    Explore at:
    Available download formats: bin
    Dataset updated
    Mar 20, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Open-Source Cybersecurity and AI Security Datasets

    This project provides a comprehensive collection of open-source datasets focused on cybersecurity threats and AI security vulnerabilities. The datasets are carefully selected to align with specific security threats, such as:

    • Data Exfiltration
    • Data Poisoning
    • Model Manipulation
    • Adversarial Examples
    • Model Inversion
    • Model Extraction
    • Spoofing Attacks
    • Unauthorized Access
    • Supply Chain Compromise

    Dataset Collection

    Each dataset includes a detailed description, source type, purpose, and direct access links for easy retrieval.

    1. DARPA Intrusion Detection Dataset

    • Access Here
    • Description: Simulated network traffic with various cyber attack scenarios (e.g., DoS, Probe, U2R, R2L).
    • Format: PCAP
    • Update Frequency: Static
    • Use Cases: IDS training, intrusion detection research

    2. MITRE ATT&CK Framework Data

    • Access Here
    • Description: A globally-accessible knowledge base of adversarial tactics, techniques, and procedures (TTPs).
    • Format: JSON, STIX
    • Update Frequency: Quarterly
    • Use Cases: Threat intelligence, adversary simulation, AI model defense

    3. VirusShare Malware Repository

    • Access Here (Registration Required)
    • Description: Large-scale collection of live malware samples for security research.
    • Format: ZIP, PE files
    • Update Frequency: Weekly
    • Use Cases: AI-based malware detection, sandbox testing

    4. National Vulnerability Database (NVD)

    • Access Here
    • Description: A repository of reported vulnerabilities (CVEs) with severity scores and descriptions.
    • Format: XML, JSON
    • Update Frequency: Daily
    • Use Cases: Vulnerability management, exploit mitigation research

    5. LANL Unified Host and Network Dataset

    • Access Here
    • Description: Enterprise-scale dataset containing network and host logs with real-world red-team attack events.
    • Format: Text files
    • Update Frequency: Static
    • Use Cases: Insider threat detection, anomaly detection in network security

    6. CIC-IDS2017 (Intrusion Detection Dataset)

    • Access Here
    • Description: Network traffic dataset with multiple attack types, including DDoS, brute-force, and infiltration attacks.
    • Format: PCAP, CSV
    • Update Frequency: Static
    • Use Cases: Machine learning-based intrusion detection, behavioral analysis

    7. CIC IoV CAN Bus Dataset 2024

    • Access Here
    • Description: Vehicle CAN bus data, including spoofing and denial-of-service (DoS) attack traces.
    • Format: CSV, PCAP
    • Update Frequency: Static
    • Use Cases: Automotive security, AI-based anomaly detection in vehicles

    8. ImageNet-A (Adversarial Image Dataset)

    • Access Here
    • Description: A dataset of real-world images that cause misclassification in deep learning models.
    • Format: JPEG
    • Update Frequency: Static
    • Use Cases: Adversarial robustness evaluation, model retraining for security
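
Entry 2 in the list above (MITRE ATT&CK Framework Data) is distributed as JSON in STIX format, where techniques appear as `attack-pattern` objects inside a bundle. The sketch below shows one way to pull technique IDs and names out of such a bundle with the Python standard library. The function name is illustrative and the `sample_bundle` is a hand-made miniature for demonstration, not real ATT&CK data.

```python
def technique_ids(bundle: dict) -> dict[str, str]:
    """Map ATT&CK technique ID to name for attack-pattern objects in a STIX bundle."""
    result = {}
    for obj in bundle.get("objects", []):
        if obj.get("type") != "attack-pattern":
            continue
        # Skip entries the knowledge base marks as revoked or deprecated.
        if obj.get("revoked") or obj.get("x_mitre_deprecated"):
            continue
        for ref in obj.get("external_references", []):
            if ref.get("source_name") == "mitre-attack" and "external_id" in ref:
                result[ref["external_id"]] = obj.get("name", "")
                break
    return result


# Hand-made miniature bundle for illustration only.
sample_bundle = {
    "type": "bundle",
    "objects": [
        {
            "type": "attack-pattern",
            "name": "Command and Scripting Interpreter",
            "external_references": [
                {"source_name": "mitre-attack", "external_id": "T1059"}
            ],
        },
        {"type": "malware", "name": "ExampleFamily"},
        {
            "type": "attack-pattern",
            "name": "Old Technique",
            "revoked": True,
            "external_references": [
                {"source_name": "mitre-attack", "external_id": "T0000"}
            ],
        },
    ],
}

if __name__ == "__main__":
    print(technique_ids(sample_bundle))
```

With a downloaded bundle, load the file with `json.load` and pass the resulting dictionary to `technique_ids`.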